Alexis,

I believe `DataStructs.AllProbeBitsMatch(query_fp, mol_fp)` is the
function you are looking for here. You can find more advanced usage and
code snippets in the RDKit blog post that Greg put together:
https://rdkit.blogspot.com/2013/11/fingerprint-based-substructure.html
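
A minimal sketch of how this could look, using the fingerprint check only
as a pre-filter before the exact graph match. The helper names
`may_be_substructure` and `is_substructure` and the example molecules are
made up for illustration; only `Chem.PatternFingerprint`,
`DataStructs.AllProbeBitsMatch` and `HasSubstructMatch` are RDKit calls.

```python
from rdkit import Chem, DataStructs


def may_be_substructure(frag_mol, structure_mol):
    """Cheap screen (hypothetical helper): False means the fragment cannot match."""
    frag_fp = Chem.PatternFingerprint(frag_mol)
    structure_fp = Chem.PatternFingerprint(structure_mol)
    # True only if every bit set in the fragment fingerprint is also set
    # in the structure fingerprint.
    return DataStructs.AllProbeBitsMatch(frag_fp, structure_fp)


def is_substructure(frag_mol, structure_mol):
    """Screen first, then confirm with the exact (slower) graph match."""
    return (may_be_substructure(frag_mol, structure_mol)
            and structure_mol.HasSubstructMatch(frag_mol))


# Illustrative molecules, not taken from the original question.
structure = Chem.MolFromSmiles("c1ccccc1C(=O)O")   # benzoic acid
frag = Chem.MolFromSmiles("C(=O)O")                # fragment given as SMILES
query = Chem.MolFromSmarts("[CX3](=O)[OX2H1]")     # fragment given as SMARTS

print(is_substructure(frag, structure))
print(is_substructure(query, structure))
```

In a real screen over thousands of structures you would probably compute
each fingerprint once and reuse it, rather than recomputing it per pair
as this sketch does.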

Best,
Maciek

----
Pozdrawiam,  |  Best regards,
Maciek Wójcikowski
mac...@wojcikowski.pl


On Mon, 10 Feb 2020 at 16:10, Alexis Parenty <alexis.parenty.h...@gmail.com>
wrote:

> Dear Rdkiters,
>
> I am interested in doing substructure searches between many thousands of
> structures and many thousands of fragments, as quickly as possible, with
> reasonable accuracy (> 0.95)...
>
> I did read Greg's excellent post on that subject:
>
>
> http://rdkit.blogspot.com/2019/07/a-couple-of-substructure-search-topics.html
>
> I was using the RDKit pattern fingerprint approach to filter out any
> fragments that have no chance of matching the bigger structure before
> running the slower, more accurate molecular graph match, which saves a lot
> of time.
>
> However, I realized that this pattern fingerprint approach only works
> well when comparing SMILES with SMILES:
>
>
>
> from rdkit import Chem
>
> def frag_is_a_substructure_of_structure_via_pfp(frag, smiles):
>     # Pattern fingerprints of the fragment and of the full structure
>     pfp_frag = Chem.PatternFingerprint(Chem.MolFromSmiles(frag))
>     pfp_structure = Chem.PatternFingerprint(Chem.MolFromSmiles(smiles))
>
>     frag_bits = set(pfp_frag.GetOnBits())
>     structure_bits = set(pfp_structure.GetOnBits())
>
>     # A match is possible only if every fragment bit is set in the structure
>     return frag_bits.issubset(structure_bits)
>
>
>
> Unfortunately, some of my fragments are SMARTS that are not valid SMILES.
> Using Chem.MolFromSmarts(smarts) gives really poor results (many false
> positives, leading to poor specificity). Interestingly, there are no false
> negatives, giving a sensitivity of 1!
>
>
>
> def frag_is_a_substructure_of_structure_via_pfp(frag, smiles):
>     # Same as above, but the fragment is parsed as SMARTS
>     pfp_frag = Chem.PatternFingerprint(Chem.MolFromSmarts(frag))
>     pfp_structure = Chem.PatternFingerprint(Chem.MolFromSmiles(smiles))
>
>     frag_bits = set(pfp_frag.GetOnBits())
>     structure_bits = set(pfp_structure.GetOnBits())
>
>     # A match is possible only if every fragment bit is set in the structure
>     return frag_bits.issubset(structure_bits)
>
>
>
> Is there a way to use pattern fingerprints (or another method) for
> substructure searches regardless of whether the fragments are written as
> SMILES or SMARTS? If not, is mol_struct.HasSubstructMatch(mol_frag) the
> only option I am left with?
>
> Many thanks,
>
> Alexis
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>