Hi all,
Canonicalization of fragment SMILES does not seem to work properly. Below
are two examples of the same fragment; the only difference is the order of
the atoms (their indices). The outputs differ, so it seems that RDKit
canonicalization does not take atom types into account (here, the aromatic
flag on one carbon).
Does anyone have an idea how to solve this issue at a small cost?
#1 ===========
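from rdkit import Chem
from rdkit.Chem import RWMol, Atom

m = RWMol()
for i in range(3):  # three carbon atoms, indices 0-2
    a = Atom(6)
    m.AddAtom(a)
a = Atom(0)  # dummy atom (attachment point), index 3
m.AddAtom(a)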
m.GetAtomWithIdx(0).SetIsAromatic(True) # set atom 0 as aromatic
m.GetAtomWithIdx(3).SetAtomMapNum(1)
m.AddBond(0, 1, Chem.rdchem.BondType.SINGLE)
m.AddBond(1, 2, Chem.rdchem.BondType.SINGLE)
m.AddBond(1, 3, Chem.rdchem.BondType.SINGLE)
Chem.MolToSmiles(m)
OUTPUT: 'cC(C)[*:1]'
#2 ===========
m2 = RWMol()
for i in range(3):  # three carbon atoms, indices 0-2
    a = Atom(6)
    m2.AddAtom(a)
a = Atom(0)  # dummy atom (attachment point), index 3
m2.AddAtom(a)
m2.GetAtomWithIdx(2).SetIsAromatic(True) # set atom 2 as aromatic
m2.GetAtomWithIdx(3).SetAtomMapNum(1)
m2.AddBond(0, 1, Chem.rdchem.BondType.SINGLE)
m2.AddBond(1, 2, Chem.rdchem.BondType.SINGLE)
m2.AddBond(1, 3, Chem.rdchem.BondType.SINGLE)
Chem.MolToSmiles(m2)
OUTPUT: 'CC(c)[*:1]'
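
For fragments this small, one possible stopgap is an order-independent key
computed outside RDKit. The sketch below is plain Python and is not RDKit's
canonicalization algorithm: it simply tries every atom ordering and keeps
the smallest labeled description, so it is only practical for a handful of
atoms. The function name fragment_key and the (atomic number, aromatic flag,
atom map number) labels are hypothetical choices made for illustration.

```python
from itertools import permutations


def fragment_key(atoms, bonds):
    """Return an order-independent key for a small labeled graph.

    atoms: list of comparable atom labels, here
           (atomic_num, is_aromatic, atom_map_num) tuples
    bonds: list of (i, j, order) tuples indexing into `atoms`

    Brute force: cost grows as n!, so this is only for tiny fragments.
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] gives the new index of that atom
        relabeled_atoms = [None] * n
        for old, new in enumerate(perm):
            relabeled_atoms[new] = atoms[old]
        relabeled_bonds = sorted(
            (min(perm[i], perm[j]), max(perm[i], perm[j]), order)
            for i, j, order in bonds
        )
        candidate = (tuple(relabeled_atoms), tuple(relabeled_bonds))
        if best is None or candidate < best:
            best = candidate
    return best


# Atom labels used by the two examples above
C = (6, False, 0)       # aliphatic carbon
C_AROM = (6, True, 0)   # carbon flagged as aromatic
DUMMY = (0, False, 1)   # dummy atom with atom map number 1

# Same bonds in both examples: 0-1, 1-2, 1-3, all single (order 1)
BONDS = [(0, 1, 1), (1, 2, 1), (1, 3, 1)]

key_1 = fragment_key([C_AROM, C, C, DUMMY], BONDS)  # example #1
key_2 = fragment_key([C, C, C_AROM, DUMMY], BONDS)  # example #2

print(key_1 == key_2)  # True: same fragment, different atom order
```

Two keys are equal exactly when the labeled graphs are the same up to atom
order, so the key can be used to group or deduplicate such fragments, but
it does not produce a SMILES string.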
Pavel.