Hi Theo,

That's because you omitted the sanitization step completely, so the molecule is missing crucial information that the substructure matching needs to do a proper job.

If you put sanitization back and leave out only the aromatization step, things work as expected. Also, you do not need to create the pattern from SMILES again; you can copy the molecule that you have already created and sanitized using the Chem.Mol copy constructor.

from rdkit import Chem

smiles_strings = '''
N12N3C(CC4=CC=CC(NC=C2)=C14)=CC=C3
C12=CC=CC3=C1N(N4C=CC=C4C2)C=CN3
'''

smiles_list = smiles_strings.splitlines()[1:]
print(smiles_list)

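# Parse without sanitization so that aromaticity is not perceived on input
params = Chem.SmilesParserParams()
params.sanitize = False

mols = [Chem.MolFromSmiles(x, params) for x in smiles_list]
# Run every sanitization step except aromaticity perception
for m in mols:
    Chem.SanitizeMol(m, Chem.SANITIZE_ALL ^ Chem.SANITIZE_SETAROMATICITY)

# Copy the already sanitized molecule instead of parsing the SMILES again
pattern = Chem.Mol(mols[0])

# Make the query ignore bond orders; leave aromaticity and degrees untouched
query_params = Chem.AdjustQueryParameters()
query_params.makeBondsGeneric = True
query_params.aromatizeIfPossible = False
query_params.adjustDegree = False
query_params.adjustHeavyDegree = False
pattern_generic_bonds = Chem.AdjustQueryProperties(pattern, query_params)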

matches = [idx for idx,m in enumerate(mols) if m.HasSubstructMatch(pattern_generic_bonds)]
print("{} of {}: {}".format(len(matches),len(smiles_list),matches))

$ python3 SubstructMatch2.py

['N12N3C(CC4=CC=CC(NC=C2)=C14)=CC=C3', 'C12=CC=CC3=C1N(N4C=CC=C4C2)C=CN3']
2 of 2: [0, 1]
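
If you want to convince yourself that the partial sanitization really skips aromatization while still perceiving rings, a quick check along these lines should do. This is only a sketch: the variable names (partial, full, n_arom_partial and so on) are mine, and it just compares the aromatic flags against a fully sanitized copy of your first SMILES.

```python
from rdkit import Chem

# First SMILES from the script above (Kekule form, no lowercase atoms)
smi = 'N12N3C(CC4=CC=CC(NC=C2)=C14)=CC=C3'

params = Chem.SmilesParserParams()
params.sanitize = False

# Partial sanitization: every step except aromaticity perception
partial = Chem.MolFromSmiles(smi, params)
Chem.SanitizeMol(partial, Chem.SANITIZE_ALL ^ Chem.SANITIZE_SETAROMATICITY)

# Full sanitization for comparison (the default of MolFromSmiles)
full = Chem.MolFromSmiles(smi)

# Count atoms flagged as aromatic in each molecule
n_arom_partial = sum(a.GetIsAromatic() for a in partial.GetAtoms())
n_arom_full = sum(a.GetIsAromatic() for a in full.GetAtoms())

# Ring perception is part of the sanitization steps that were kept
n_rings_partial = partial.GetRingInfo().NumRings()

print("aromatic atoms, partial sanitization:", n_arom_partial)
print("aromatic atoms, full sanitization:", n_arom_full)
print("rings perceived, partial sanitization:", n_rings_partial)
```

The partially sanitized molecule should report no aromatic atoms, whereas the fully sanitized one does, and ring information is available in both cases.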

Cheers,
p.

On 20/05/2020 09:50, theozh wrote:
from rdkit import Chem

smiles_strings = '''
N12N3C(CC4=CC=CC(NC=C2)=C14)=CC=C3
C12=CC=CC3=C1N(N4C=CC=C4C2)C=CN3
'''

smiles_list = smiles_strings.splitlines()[1:]
print(smiles_list)

params = Chem.SmilesParserParams()
params.sanitize=False

mols = [Chem.MolFromSmiles(x,params) for x in smiles_list]

pattern = Chem.MolFromSmiles(smiles_list[0],params)

query_params = Chem.AdjustQueryParameters()
query_params.makeBondsGeneric = True
query_params.aromatizeIfPossible = False
query_params.adjustDegree = False
query_params.adjustHeavyDegree = False
pattern_generic_bonds = Chem.AdjustQueryProperties(pattern,query_params)

matches = [idx for idx,m in enumerate(mols) if m.HasSubstructMatch(pattern_generic_bonds)]
print("{} of {}: {}".format(len(matches),len(smiles_list),matches))


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
