Dear Ian,
print(Chem.MolToMolBlock(ps[0][0]))
produces:
     RDKit          2D

  6  5  0  0  0  0  0  0  0  0999 V2000
    6.4952    0.7500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    5.1962   -0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.8971    0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.5981   -0.0000    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0
    1.2990    0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0
  2  3  1  0
  3  4  1  0
  4  5  1  0
  5  6  1  0
M  END
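so the atoms of the product molecule are stored in the same order as in the
query (O first), and GetSubstructMatch returns the indices in the order
provided by the query. I think the re-ordering happens only when the molecule
is written out as a SMILES string.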
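
If it helps, here is a minimal sketch of how the two orderings could be
reconciled. As far as I know, after a call to Chem.MolToSmiles(mol) RDKit
records the order in which the atoms were written in the private molecule
property '_smilesAtomOutputOrder'; please check that your version provides it.
The helper names parse_output_order and query_to_smiles_positions are my own
invention, and the values are hard-coded from your example so that the sketch
runs without RDKit. In particular, the property text '[5,4,3,2,1,0,]' is what
I would expect for this product, not something copied from an actual run.

```python
import re


def parse_output_order(prop_text):
    """Extract the integers from text such as '[5,4,3,2,1,0,]'.

    Entry i of the result is the index of the molecule atom that was
    written at position i of the SMILES string.
    """
    return [int(token) for token in re.findall(r"\d+", prop_text)]


def query_to_smiles_positions(match, output_order):
    """For each query atom, return the SMILES position of the atom it matched.

    match        -- tuple from GetSubstructMatch: query atom -> molecule atom
    output_order -- SMILES position -> molecule atom
    """
    # Invert output_order to get molecule atom -> SMILES position.
    position_of_atom = {atom_idx: pos for pos, atom_idx in enumerate(output_order)}
    return tuple(position_of_atom[atom_idx] for atom_idx in match)


# Values taken from the example in this thread (hard-coded, no RDKit needed).
# With RDKit the text would, I assume, be obtained roughly like this:
#   Chem.MolToSmiles(ps[0][0])
#   order_text = ps[0][0].GetProp('_smilesAtomOutputOrder')
match = (0, 1, 2, 3, 4, 5)       # result of GetSubstructMatch
order_text = "[5,4,3,2,1,0,]"    # assumed property text for 'CCSCC=O'

output_order = parse_output_order(order_text)
print(query_to_smiles_positions(match, output_order))
```

For the product above this prints (5, 4, 3, 2, 1, 0), i.e. the positions in
'CCSCC=O' listed in the order of the product Smarts atoms.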
Best wishes,
Maria
On Tuesday, 18 September 2018, 14:40:05 CEST, Ian Tickle
<[email protected]> wrote:
Hi all, sorry, I realise you're probably all too busy down in Chemistry with
the meeting to give a quick answer to a problem I'm having with
GetSubstructMatch.
Basically I need to get the mapping between the reactants and the products in
the form of the atom indices in the product Smiles in the order of the product
Smarts atoms. Here's my test script (running rdkit-Release_2018_03_4 on
Kubuntu 16.04):
from rdkit import Chem
from rdkit.Chem import AllChem

reactantSmarts = '[O:1]=[C:2][C:3]Cl.[S:4][C:5][C:6]'
productSmarts = '[O:1]=[C:2][C:3][S:4][C:5][C:6]'
reactionSmirks = reactantSmarts + '>>' + productSmarts

# Chloroacetaldehyde & cysteine sidechain:
ligandSmiles = 'O=CCCl'
targetSmiles = 'SCC'

print('\nreactantSmarts:', reactantSmarts)
print('productSmarts: ', productSmarts)
print('reactionSmirks:', reactionSmirks)
rxn = AllChem.ReactionFromSmarts(reactionSmirks)

print('\nligandSmiles: ', ligandSmiles)
print('targetSmiles: ', targetSmiles)
ligand = Chem.MolFromSmiles(ligandSmiles)
target = Chem.MolFromSmiles(targetSmiles)

ps = rxn.RunReactants((ligand, target))
print('productSmiles: ', Chem.MolToSmiles(ps[0][0]))

productPattern = Chem.MolFromSmarts(productSmarts)
print('\nSmiles from productSmarts:', Chem.MolToSmiles(productPattern))

print("\nAtom indices in productSmiles ordered as productSmarts’ atoms:",
      ps[0][0].GetSubstructMatch(productPattern))

and this is the output:
reactantSmarts: [O:1]=[C:2][C:3]Cl.[S:4][C:5][C:6]
productSmarts: [O:1]=[C:2][C:3][S:4][C:5][C:6]
reactionSmirks: [O:1]=[C:2][C:3]Cl.[S:4][C:5][C:6]>>[O:1]=[C:2][C:3][S:4][C:5][C:6]
ligandSmiles: O=CCCl
targetSmiles: SCC
productSmiles: CCSCC=O
Smiles from productSmarts: [O:1]=[CH:2][CH2:3][S:4][CH2:5][CH3:6]
Atom indices in productSmiles ordered as productSmarts’ atoms: (0, 1, 2, 3, 4, 5)

Notice how the order of atoms in productSmiles has been reversed from
productSmarts (presumably some internal canonicalisation?). Nothing wrong in
that per se, but this reversal is not reflected in the indices returned from
GetSubstructMatch so my attempt to match up the atoms in the two Smiles strings
crashes & burns. Isn't the correct answer here (5, 4, 3, 2, 1, 0) or am I
totally off-beam?
Cheers
-- Ian J. Tickle
Global Phasing Ltd., Cambridge, UK.
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss