Hi all, sorry, I realise you're probably all too busy down in Chemistry with
the meeting to give a quick answer, but I'm having a problem with
GetSubstructMatch.  Basically I need to get the mapping between the
reactants and the products, in the form of the atom indices in the product
Smiles listed in the order of the product Smarts atoms.  Here's my test
script (running rdkit-Release_2018_03_4 on Kubuntu 16.04):

from rdkit import Chem
from rdkit.Chem import AllChem

reactantSmarts = '[O:1]=[C:2][C:3]Cl.[S:4][C:5][C:6]'
productSmarts = '[O:1]=[C:2][C:3][S:4][C:5][C:6]'
reactionSmirks = reactantSmarts + '>>' + productSmarts

# Chloroacetaldehyde & cysteine sidechain:
ligandSmiles = 'O=CCCl'
targetSmiles = 'SCC'

print('\nreactantSmarts:', reactantSmarts)
print('productSmarts: ', productSmarts)
print('reactionSmirks:', reactionSmirks)
rxn = AllChem.ReactionFromSmarts(reactionSmirks)

print('\nligandSmiles:  ',ligandSmiles)
print('targetSmiles:  ',targetSmiles)
ligand = Chem.MolFromSmiles(ligandSmiles)
target = Chem.MolFromSmiles(targetSmiles)

ps = rxn.RunReactants((ligand, target))
print('productSmiles: ', Chem.MolToSmiles(ps[0][0]))

productPattern = Chem.MolFromSmarts(productSmarts)
print('\nSmiles from productSmarts:', Chem.MolToSmiles(productPattern))

print("\nAtom indices in productSmiles ordered as productSmarts’ atoms:",
ps[0][0].GetSubstructMatch(productPattern))

and this is the output:

reactantSmarts: [O:1]=[C:2][C:3]Cl.[S:4][C:5][C:6]
productSmarts:  [O:1]=[C:2][C:3][S:4][C:5][C:6]
reactionSmirks: [O:1]=[C:2][C:3]Cl.[S:4][C:5][C:6]>>[O:1]=[C:2][C:3][S:4][C:5][C:6]

ligandSmiles:   O=CCCl
targetSmiles:   SCC
productSmiles:  CCSCC=O

Smiles from productSmarts: [O:1]=[CH:2][CH2:3][S:4][CH2:5][CH3:6]

Atom indices in productSmiles ordered as productSmarts’ atoms: (0, 1, 2, 3, 4, 5)

Notice how the order of atoms in productSmiles has been reversed from
productSmarts (presumably some internal canonicalisation?).  Nothing wrong
in that per se, but this reversal is not reflected in the indices returned
from GetSubstructMatch so my attempt to match up the atoms in the two
Smiles strings crashes & burns.  Isn't the correct answer here (5, 4, 3, 2,
1, 0)  or am I totally off-beam?
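
For what it's worth, here is a minimal sketch of one way the two orderings
could be reconciled. It assumes that GetSubstructMatch reports indices in the
product molecule's own atom order (not the order in which MolToSmiles writes
the atoms), and that RDKit records the written order in the
'_smilesAtomOutputOrder' property once MolToSmiles has been called. The
helper names smiles_output_order and smiles_positions are made up for
illustration, and the output order [5, 4, 3, 2, 1, 0] is inferred from the
output above rather than measured.

```python
import re


def smiles_output_order(mol):
    """Return the atom indices of mol in the order MolToSmiles writes them.

    Assumption: after MolToSmiles, RDKit stores that order in the
    '_smilesAtomOutputOrder' property, readable as a string such as
    '[5,4,3,2,1,0,]'.
    """
    from rdkit import Chem  # imported here so the rest runs without RDKit

    Chem.MolToSmiles(mol)  # called only to populate the property
    raw = mol.GetProp('_smilesAtomOutputOrder')
    return [int(tok) for tok in re.findall(r'\d+', raw)]


def smiles_positions(match, output_order):
    """Convert molecule atom indices into positions within the SMILES string.

    match:        tuple from GetSubstructMatch (one molecule atom index per
                  pattern atom, in pattern order)
    output_order: molecule atom indices in the order the SMILES lists them
    """
    position_of = {atom_idx: pos for pos, atom_idx in enumerate(output_order)}
    return tuple(position_of[atom_idx] for atom_idx in match)


# Values taken or inferred from the output above: the match is the identity,
# and 'CCSCC=O' lists the product atoms in reverse.
match = (0, 1, 2, 3, 4, 5)
output_order = [5, 4, 3, 2, 1, 0]
print(smiles_positions(match, output_order))
```

With a real product molecule, output_order would come from
smiles_output_order(ps[0][0]) instead of being typed in by hand.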

Cheers

-- Ian J. Tickle
Global Phasing Ltd., Cambridge, UK.