Thank you, Brian!
Actually, this is what I expected as output:
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:2])c1[*:3]
S=c1c([*:2])c(Cl)[nH]c([*:1])c1[*:3]
and so on
You pointed me in the right direction. I can store the old-to-new map
numbers in a dict, and after relabeling and generating the canonical
SMILES it will be easy to relabel the attachment points back.
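A minimal sketch of that relabel-back step on plain SMILES strings, assuming attachment points are always written as `[*:n]`. The helper name `relabel_maps` and the `new_to_old` dict are illustrative only and not part of RDKit; the dict contents are a made-up example of what would be recorded during relabeling.

```python
import re

# Matches an attachment point such as [*:2] and captures the map number.
MAP_RE = re.compile(r"\[\*:(\d+)\]")

def relabel_maps(smiles, mapping):
    """Rewrite every [*:n] label in one pass using mapping[n]."""
    # A single regex pass avoids the chained-replacement problem
    # (1 -> 2 followed by 2 -> 3 turning everything into 3).
    return MAP_RE.sub(lambda m: "[*:%d]" % mapping[int(m.group(1))], smiles)

canonical = "S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]"
# Hypothetical record kept during relabeling: canonical label -> original label.
new_to_old = {1: 2, 2: 3, 3: 1}

restored = relabel_maps(canonical, new_to_old)
print(restored)
```

Because only the labels are rewritten, the atom order of the canonical string is preserved, so two fragments that differ only in mapping still share the same skeleton string.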
Thank you again!
Pavel.
On 05/27/2017 03:03 PM, Brian Kelley wrote:
Pavel, this isn't exactly trivial so I went ahead and made an
example. The basics are that atomMaps are canonicalized, i.e. their
value is used in the generation of smiles.
To solve this problem:
1) backup the atom maps and remove them
2) canonicalize *without* atom maps but figure out the order in which
the atoms in the molecule are output
3) using the atom output order, relabel the atom maps based on output
order.
That's a mouthful, but here's some code that should do the trick:

from rdkit import Chem

smi = ["ClC1=C([*:1])C(=S)C([*:2])=C([*:3])N1",
       "ClC1=C([*:1])C(=S)C([*:3])=C([*:2])N1",
       "ClC1=C([*:2])C(=S)C([*:1])=C([*:3])N1",
       "ClC1=C([*:2])C(=S)C([*:3])=C([*:1])N1",
       "ClC1=C([*:3])C(=S)C([*:1])=C([*:2])N1",
       "ClC1=C([*:3])C(=S)C([*:2])=C([*:1])N1"]

def CanonicalizeMaps(m, *a, **kw):
    # atom maps are canonicalized, so rename them
    # figure out where they would have gone
    # and relabel from 1...N based on output order
    atomMap = "molAtomMapNumber"
    backupAtomMap = "oldMolAtomMapNumber"
    for atom in m.GetAtoms():
        if atom.HasProp(atomMap):
            atomNum = atom.GetProp(atomMap)
            atom.SetProp(backupAtomMap, atomNum)
            atom.ClearProp(atomMap)

    # canonicalize (the call also sets _smilesAtomOutputOrder on m)
    smi = Chem.MolToSmiles(m, *a, **kw)

    # where did the atoms end up in the output string?
    atoms = [(pos, atom_idx) for atom_idx, pos in enumerate(
        eval(m.GetProp("_smilesAtomOutputOrder")))]
    atommap = 1
    atoms.sort()

    # set the new atommap based on output position
    for pos, atom_idx in atoms:
        atom = m.GetAtomWithIdx(atom_idx)
        if atom.HasProp(backupAtomMap):
            atom.SetProp(atomMap, str(atommap))
            atommap += 1
    return Chem.MolToSmiles(m)

for s in smi:
    m = Chem.MolFromSmiles(s)
    print(CanonicalizeMaps(m, True))

Output:
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
Now, if you want the atomMaps in 1...2...3 output order, we could do
that as well, but it is even trickier.
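Step 3 above (numbering the maps by output position) can also be looked at in isolation, without RDKit. The sketch below assumes the output-order list holds, at position i, the index of the atom written i-th; that is my reading of `_smilesAtomOutputOrder` and is worth checking against your RDKit version. The function name `number_maps_by_output_order` and the toy inputs are hypothetical.

```python
def number_maps_by_output_order(output_order, mapped_atoms):
    """Return {atom_idx: new_map_number}, numbered 1..N by output position.

    output_order -- atom indices in the order they were written (assumption)
    mapped_atoms -- set of atom indices that carried an atom map originally
    """
    new_maps = {}
    next_map = 1
    for atom_idx in output_order:
        if atom_idx in mapped_atoms:
            new_maps[atom_idx] = next_map
            next_map += 1
    return new_maps

# Toy example: five atoms, atoms 0 and 2 were attachment points.
order = [4, 3, 2, 0, 1]
result = number_maps_by_output_order(order, {0, 2})
print(result)
```

Atom 2 is written before atom 0 in this toy order, so it receives map number 1 and atom 0 receives 2, whatever labels they carried on input.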
Enjoy,
Brian
On Sat, May 27, 2017 at 8:36 AM, Pavel Polishchuk
<pavel_polishc...@ukr.net> wrote:
Hi,
I cannot solve an issue and would like to ask for advice.
If the attachment points of the same fragment carry different map
numbers, different canonical SMILES are generated.
I have observed this behavior only for fragments with three attachment
points. An example is below.
I'm looking for a solution or workaround that produces the "same"
SMILES strings irrespective of the mapping, so that the SMILES become
identical once the map numbers are removed.
Any advice would be appreciated.

from rdkit import Chem

smi = ["ClC1=C([*:1])C(=S)C([*:2])=C([*:3])N1",
       "ClC1=C([*:1])C(=S)C([*:3])=C([*:2])N1",
       "ClC1=C([*:2])C(=S)C([*:1])=C([*:3])N1",
       "ClC1=C([*:2])C(=S)C([*:3])=C([*:1])N1",
       "ClC1=C([*:3])C(=S)C([*:1])=C([*:2])N1",
       "ClC1=C([*:3])C(=S)C([*:2])=C([*:1])N1"]

for s in smi:
    print(Chem.MolToSmiles(Chem.MolFromSmiles(s)))

output:
S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]
S=c1c([*:1])c(Cl)[nH]c([*:2])c1[*:3]
S=c1c([*:1])c([*:3])[nH]c(Cl)c1[*:2]
S=c1c([*:2])c(Cl)[nH]c([*:1])c1[*:3]
S=c1c([*:1])c([*:2])[nH]c(Cl)c1[*:3]
S=c1c([*:2])c([*:1])[nH]c(Cl)c1[*:3]
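To make the mismatch concrete, here is a small check on the six output strings listed above, using only the standard library. It strips the map numbers with a regular expression and counts the distinct skeletons that remain; the variable names are illustrative.

```python
import re

outputs = [
    "S=c1c([*:1])c(Cl)[nH]c([*:3])c1[*:2]",
    "S=c1c([*:1])c(Cl)[nH]c([*:2])c1[*:3]",
    "S=c1c([*:1])c([*:3])[nH]c(Cl)c1[*:2]",
    "S=c1c([*:2])c(Cl)[nH]c([*:1])c1[*:3]",
    "S=c1c([*:1])c([*:2])[nH]c(Cl)c1[*:3]",
    "S=c1c([*:2])c([*:1])[nH]c(Cl)c1[*:3]",
]

# Drop ":n" from every attachment point, leaving a bare [*].
stripped = sorted({re.sub(r":\d+", "", s) for s in outputs})

print(len(stripped))
for s in stripped:
    print(s)
```

The six strings collapse to two different atom orderings rather than one, which is the behavior described in the question.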
Kind regards,
Pavel.
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss