What's going on here is that the RDKit defines stereochemistry based on the ordering of the bonds around each atom, not on atom indices. This has come up on the list multiple times; a relatively recent instance is here: https://www.mail-archive.com/rdkit-discuss@lists.sourceforge.net/msg08955.html
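
To make the point about bond ordering concrete, here is a toy model in plain Python. It is not RDKit code and does not reproduce RDKit's internals; the names `permutation_parity` and `same_configuration` are made up for illustration. The only assumption is the usual convention that a CW/CCW tag is read relative to an ordered list of neighbours, so an odd reordering of that list with the tag left unchanged describes the opposite configuration.

```python
def permutation_parity(reference, reordered):
    """Return 0 if `reordered` is an even permutation of `reference`, 1 if odd."""
    positions = [reference.index(x) for x in reordered]
    # Count inversions: pairs that appear in the opposite relative order.
    inversions = sum(
        1
        for i in range(len(positions))
        for j in range(i + 1, len(positions))
        if positions[i] > positions[j]
    )
    return inversions % 2


def same_configuration(tag_a, order_a, tag_b, order_b):
    """Toy check: do two (tag, neighbour order) pairs describe the same centre?

    An odd permutation of the neighbour order flips the meaning of the tag,
    so the tags must differ exactly when the permutation is odd.
    """
    odd = permutation_parity(order_a, order_b) == 1
    return (tag_a == tag_b) != odd


# Hypothetical neighbour orderings around the stereocentre, as they might
# follow from two different orderings of the bond lines in a mol file.
file_1 = ("CCW", ["CH3", "N", "COOH"])
file_2 = ("CCW", ["N", "CH3", "COOH"])  # first two bonds swapped

print(same_configuration(*file_1, *file_2))                         # False
print(same_configuration(*file_1, "CW", ["N", "CH3", "COOH"]))      # True
```

In other words, two files can carry the same chiral tag and still describe opposite centres if the bond lines are ordered differently, which is consistent with the behaviour reported further down in this thread.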
Here's a gist that I have lying around that may help here:[1]
https://gist.github.com/greglandrum/9f0e068e53171174b6797348eca64b3e

-greg
[1] Now if only I could find *why* I have that gist lying around...

On Tue, Dec 3, 2019 at 3:32 PM Rasmus "Termo" Lundsgaard <termope...@gmail.com> wrote:

> Hi Pablo,
>
> thank you for the heads up that removeHs is not honored when not
> sanitizing (and that RemoveHs has to be called separately to solve that
> issue here).
>
> Now I tried with the same molecule but with the order of the atoms also
> changed (attached as Ran1_neworder.sdf), and here I still get a
> different isomeric SMILES, even though the chiral tag is the same:
>
> for f in ['Ran1.sdf', 'Ran2.sdf', 'Ran1_neworder.sdf']:
>     m = Chem.MolFromMolFile(f, sanitize=False)
>     m = Chem.RemoveHs(m, sanitize=False)
>     print(Chem.MolToSmiles(set_correct_Chiral_flags(m), isomericSmiles=True))
>
> C[C@@H](N)C(=O)O
> C[C@@H](N)C(=O)O
> C[C@H](N)C(=O)O
>
> On Tue, Dec 3, 2019 at 2:58 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:
>
>> Hi Rasmus,
>>
>> the problem is that, as stated in the rdmolfiles.MolFromMolFile() docs,
>> the removeHs option is only honored when sanitize is True.
>>
>> So to obtain sensible results without sanitizing you should rather do
>> something like:
>>
>> m1 = Chem.MolFromMolFile('Ran1.sdf', sanitize=False)
>> m1 = Chem.RemoveHs(m1, sanitize=False)
>> print(Chem.MolToSmiles(set_correct_Chiral_flags(m1), isomericSmiles=True))
>> m2 = Chem.MolFromMolFile('Ran2.sdf', sanitize=False)
>> m2 = Chem.RemoveHs(m2, sanitize=False)
>> print(Chem.MolToSmiles(set_correct_Chiral_flags(m2), isomericSmiles=True))
>>
>> You may check the individual sanitization operations here:
>>
>> https://www.rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=rdmolops%20sanitizeflags#rdkit.Chem.rdmolops.SanitizeFlags
>>
>> Cheers,
>> p.
>>
>> On 03/12/2019 12:46, Rasmus "Termo" Lundsgaard wrote:
>>
>> Hi all
>>
>> I would like to avoid sanitizing the SDF files, as the information in
>> these files should be seen as the ground truth.
>>
>> However, I have some problems figuring out how to read and set chiral
>> information from the file and also have the RDKit always behave the same.
>> Attached are two SDF files with no 3D information and only stereo
>> information in the atoms section for R-alanine. The only difference, as I
>> see it, is the order of the lines of the bond information.
>> Even so, I get two different SMILES back with isomeric information when
>> not sanitizing.
>>
>> Attached is also the minimal Python code, which for me at least outputs:
>>
>> not setting chiral flags
>>> CC(N)C(=O)O
>>> CC(N)C(=O)O
>>>
>>> setting chiral flags
>>> [H]OC(=O)[C@]([H])(N([H])[H])C([H])([H])[H]
>>> [H]OC(=O)[C@@]([H])(N([H])[H])C([H])([H])[H]
>>>
>>> setting chiral flags and sanitize
>>> C[C@@H](N)C(=O)O
>>> C[C@@H](N)C(=O)O
>>>
>>
>> Any ideas as to why this happens and how I can handle it strictly? Also,
>> what exactly does sanitizing do?
>>
>> Regards Rasmus
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss