Hi Jeffrey,

this gist shows how to achieve what you need:

https://gist.github.com/ptosco/36574d7f025a932bc1b8db221903a8d2

i.e., how to reorder atoms based on the result of Chem.CanonicalRankAtoms().
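
A minimal sketch of that idea follows. It is not copied from the gist, and the helper names `ranks_to_new_order` and `canonical_renumber` are hypothetical. The key point is that Chem.CanonicalRankAtoms() returns one rank per atom, whereas Chem.RenumberAtoms() expects, for each new position, the index of the old atom to place there, so the ranks need to be inverted (an argsort) before renumbering:

```python
def ranks_to_new_order(ranks):
    """Invert a rank list: result[new_idx] = old atom index with that rank."""
    return sorted(range(len(ranks)), key=lambda i: ranks[i])


def canonical_renumber(mol):
    """Return a copy of mol whose atoms are ordered by canonical rank."""
    # Imported here so the pure-Python helper above is usable without RDKit.
    from rdkit import Chem

    # breakTies=True gives every atom a distinct rank, so the order is total.
    ranks = list(Chem.CanonicalRankAtoms(mol, breakTies=True))
    return Chem.RenumberAtoms(mol, ranks_to_new_order(ranks))
```

With the molecules from the original message, one would then compare atom_order(canonical_renumber(m)) against atom_order(canonical_renumber(m1)). I have not checked the outcome for that particular stereo example, so treat the sketch as a starting point.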

HTH, cheers
p.

On Fri, Aug 14, 2020 at 8:36 PM Jeffrey Van santen <
jeffrey_van_san...@sfu.ca> wrote:

> Hello all,
>
>
>
> I realize that this topic has been discussed in some detail (
> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/76909664-2C16-4B61-8BEE-2196B3721EA1%40gmail.com/#msg34923617),
> but I remain somewhat confused. Let me lay out what I am trying to achieve:
>
>
>
> I would like a method for creating a canonical order of the atoms in a
> molecule, independent of the input order. For example, given
> (R)-1-(sec-butyl)naphthalene (see attached image)
>
> [image: structure of (R)-1-(sec-butyl)naphthalene]
>
>
>
> if you start with the following SMILES string "CC[C@H](C1=CC=CC2=C1C=CC=C2)C"
> versus the InChI string
> "InChI=1S/C14H16/c1-3-11(2)13-10-6-8-12-7-4-5-9-14(12)13/h4-11H,3H2,1-2H3/t11-/m1/s1",
> you obviously get two different atom orders. I have tried to apply the
> `CanonicalRankAtoms` method to each of the molecules, such as the following
> example code:
>
> ```
> from rdkit import Chem
>
> def atom_order(m):
>     return [(x.GetIdx(), x.GetAtomicNum(), x.GetDegree()) for x in m.GetAtoms()]
>
> m = Chem.MolFromSmiles("CC[C@H](C1=CC=CC2=C1C=CC=C2)C")
> m = Chem.AddHs(m)
> m1 = Chem.MolFromInchi("InChI=1S/C14H16/c1-3-11(2)13-10-6-8-12-7-4-5-9-14(12)13/h4-11H,3H2,1-2H3/t11-/m1/s1")
> m1 = Chem.AddHs(m1)
>
> # Some simple comparison of atom ordering
> atom_order(m) == atom_order(m1)  # returns False
>
> m_order = list(Chem.CanonicalRankAtoms(m))
> m1_order = list(Chem.CanonicalRankAtoms(m1))
> m_order == m1_order  # returns False
>
> # For completeness
> m_ordered = Chem.RenumberAtoms(m, m_order)
> m1_ordered = Chem.RenumberAtoms(m1, m1_order)
> atom_order(m_ordered) == atom_order(m1_ordered)  # returns False
> ```
>
>
>
> One plausible solution that seems to work is the following extension:
>
>
>
> ```
> m_canon = Chem.MolFromSmiles(Chem.MolToSmiles(m))
> m1_canon = Chem.MolFromSmiles(Chem.MolToSmiles(m1))
> atom_order(m_canon) == atom_order(m1_canon)  # returns True
> ```
>
>
>
> I believe this works because `MolToSmiles` defaults to
> `canonical=True`.
>
>
>
> I suppose what I would like to know is
>
>    1. Why does CanonicalRankAtoms not return the same result for different
>    inputs of the same molecule? I understand that it has something to do with
>    the underlying molecular graph. In particular, in the linked mailing list
>    discussion Greg says (
>    https://sourceforge.net/p/rdkit/mailman/message/34923647/):
>    “If you just want a canonical ordering of the atoms, there is no
>    reason to generate the SMILES. You can just use Chem.CanonicalRankAtoms().”
>    2. Is there a better solution than round-tripping from import X format
>    -> export canonical SMILES -> import canonical SMILES -> export canonical
>    mol (mol file or similar)?
>    3. In a related but tangential question, is there a way to have
>    canonical SMILES without the lowercase aromaticity notation?
>
>
>
> Thank you very much,
>
>
>
> Jeff van Santen
>
> The Natural Products Atlas (www.npatlas.org)
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>