Re: [Rdkit-discuss] Incorrect results for substructure search obtained with Tversky similarity.

2016-12-12 Thread Greg Landrum
Hi Axel, The RDKit's Morgan Fingerprint is not a substructure screening fingerprint. If you want to use a fingerprint for screening, your best bet is the Pattern fingerprint. As an aside, the RDKit has a function, DataStructs.AllProbeBitsMatch (http://www.rdkit.org/Python_Docs/rdkit.DataStructs.

Re: [Rdkit-discuss] Incorrect results for substructure search obtained with Tversky similarity.

2016-12-12 Thread Axel Rudling
Hi Brian, and thank you for your response. Yes, so Tversky with the alpha parameter set to 1.0 and a similarity cutoff of 1.0 (100 % of me in you) will equal substructure search, at least at a theoretical level. I guess my question is, are imperfections in the fp model likely to generate these ki
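
For reference, the "100 % of me in you" reading can be written out with the standard Tversky index over two sets of on-bits, A (the query fingerprint) and B (the target fingerprint). The message only mentions alpha = 1.0; setting beta = 0 is an assumption made here for illustration.

```latex
\[
T_{\alpha,\beta}(A,B) =
  \frac{|A \cap B|}{|A \cap B| + \alpha\,|A \setminus B| + \beta\,|B \setminus A|}
\]
% Assumed setting: \alpha = 1, \beta = 0
\[
T_{1,0}(A,B) = \frac{|A \cap B|}{|A \cap B| + |A \setminus B|}
             = \frac{|A \cap B|}{|A|},
\qquad
T_{1,0}(A,B) = 1 \iff A \subseteq B
\]
```

Under that assumption, a score of 1.0 states only that every query bit is also set in the target. Whether that implies a real substructure match depends on how the fingerprint assigns bits.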

Re: [Rdkit-discuss] Incorrect results for substructure search obtained with Tversky similarity.

2016-12-12 Thread Brian Kelley
I'm not really sure what you mean by Tversky searching in substructure mode. Fingerprinting methods do not guarantee the presence of an exact substructure. You can think of Tversky as asking what percentage of me is in you, and that percentage doesn't have to be a substructure. However, they are co
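
The "what percentage of me is in you" idea can be sketched in a few lines of plain Python. This is a minimal illustration, not RDKit code: the `tversky` helper and the bit sets are hypothetical, fingerprints are modelled as sets of on-bit indices, and beta = 0 is an assumed setting.

```python
def tversky(probe_bits, target_bits, alpha=1.0, beta=0.0):
    """Tversky index between two fingerprints given as sets of on-bit indices.

    Hypothetical helper for illustration; alpha=1, beta=0 is an assumed
    setting that measures the fraction of probe bits found in the target.
    """
    common = len(probe_bits & target_bits)
    only_probe = len(probe_bits - target_bits)
    only_target = len(target_bits - probe_bits)
    denom = common + alpha * only_probe + beta * only_target
    return common / denom if denom else 0.0


# Made-up bit sets, not derived from real molecules.
probe = {1, 4, 7}
superset_target = {1, 2, 4, 7, 9}   # contains every probe bit
partial_target = {1, 4, 9}          # misses probe bit 7

full_score = tversky(probe, superset_target)
partial_score = tversky(probe, partial_target)

print(full_score)     # all probe bits present in the target
print(partial_score)  # two of three probe bits present
```

A score of 1.0 here means only that the probe's bits are a subset of the target's bits. Nothing in the calculation checks atoms or bonds, so a hit at that cutoff still needs a real substructure match to confirm it.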

[Rdkit-discuss] Incorrect results for substructure search obtained with Tversky similarity.

2016-12-12 Thread Axel Rudling
Hello all, I'm currently doing a project with Tversky searching in substructure mode, using SMILES to create the fingerprints. For most molecules I get the correct result, but for some molecules I get a flood of falsely predicted substructure molecules. In brief, I get a large amou