Hi Paolo, thank you for the heads-up that removeHs is not honored when not sanitizing (and that Chem.RemoveHs() has to be called separately to solve that issue here).
Now I tried the same molecule, but with the order of the atoms changed as well (attached as Ran1_neworder.sdf), and here I still get a different isomeric SMILES, even though the chiral tag is the same:

    from rdkit import Chem

    # set_correct_Chiral_flags() is defined in the script attached to my first mail
    for f in ['Ran1.sdf', 'Ran2.sdf', 'Ran1_neworder.sdf']:
        m = Chem.MolFromMolFile(f, sanitize=False)
        m = Chem.RemoveHs(m, sanitize=False)
        print(Chem.MolToSmiles(set_correct_Chiral_flags(m), isomericSmiles=True))

This prints:

    C[C@@H](N)C(=O)O
    C[C@@H](N)C(=O)O
    C[C@H](N)C(=O)O

On Tue, Dec 3, 2019 at 2:58 PM Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> Hi Rasmus,
>
> the problem is that, as stated in the rdmolfiles.MolFromMolFile() docs,
> the removeHs option is only honored when sanitize is True.
>
> So to obtain sensible results without sanitizing you should rather do
> something like:
>
>     m1 = Chem.MolFromMolFile('Ran1.sdf', sanitize=False)
>     m1 = Chem.RemoveHs(m1, sanitize=False)
>     print(Chem.MolToSmiles(set_correct_Chiral_flags(m1), isomericSmiles=True))
>     m2 = Chem.MolFromMolFile('Ran2.sdf', sanitize=False)
>     m2 = Chem.RemoveHs(m2, sanitize=False)
>     print(Chem.MolToSmiles(set_correct_Chiral_flags(m2), isomericSmiles=True))
>
> You may check the individual sanitization operations here:
>
> https://www.rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=rdmolops%20sanitizeflags#rdkit.Chem.rdmolops.SanitizeFlags
>
> Cheers,
> p.
>
> On 03/12/2019 12:46, Rasmus "Termo" Lundsgaard wrote:
>
> Hi all
>
> I would like to avoid sanitizing the sdf files, as information in these
> files should be seen as the ground truth.
>
> I however have some problems in figuring out how to read and set chiral
> information from the file and also have RDKit behave the same always.
> Attached are two sdf files with no 3D information and only stereo
> information in the atoms section for R-alanine. The only difference as I
> see it is the order of the lines of the bond information.
> Even so I get two different SMILES back with isomeric information when not
> sanitizing.
>
> Attached is also the minimal python code, which for me at least outputs:
>
> not setting chiral flags
>> CC(N)C(=O)O
>> CC(N)C(=O)O
>>
>> setting chiral flags
>> [H]OC(=O)[C@]([H])(N([H])[H])C([H])([H])[H]
>> [H]OC(=O)[C@@]([H])(N([H])[H])C([H])([H])[H]
>>
>> setting chiral flags and sanitize
>> C[C@@H](N)C(=O)O
>> C[C@@H](N)C(=O)O
>>
>
> Any ideas as to why this happens and how I can handle it strictly? Also,
> what exactly does sanitizing do?
>
> Regards Rasmus
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
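
A possible explanation for the third SMILES, offered as a sketch rather than a diagnosis: as far as I understand, a CW/CCW chiral tag only has meaning relative to the order in which the neighbors of the stereocenter are listed, so the same tag combined with a different neighbor order can describe the opposite configuration. The pure-Python illustration below is not RDKit code; the neighbor labels and the helper names permutation_parity() and retag() are made up for this example.

```python
# Illustrative sketch only: pure Python, not RDKit's implementation.

def permutation_parity(old_order, new_order):
    """Return 0 if new_order is an even permutation of old_order, 1 if odd."""
    position = {atom: i for i, atom in enumerate(old_order)}
    perm = [position[atom] for atom in new_order]
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def retag(tag, old_order, new_order):
    """Re-express a 'CW'/'CCW' tag relative to a new neighbor ordering."""
    if permutation_parity(old_order, new_order) == 0:
        return tag  # even permutation: same tag, same configuration
    return "CCW" if tag == "CW" else "CW"  # odd permutation: tag must flip


# Hypothetical neighbor labels for the alanine stereocenter.
neighbors = ["CH3", "N", "COOH", "H"]
swapped = ["N", "CH3", "COOH", "H"]   # one swap: odd permutation
rotated = ["N", "COOH", "CH3", "H"]   # 3-cycle: even permutation

print(retag("CW", neighbors, swapped))  # CCW
print(retag("CW", neighbors, rotated))  # CW
```

If this is what happens here, keeping the tag fixed while the neighbor order changes by an odd permutation would flip the configuration that ends up in the SMILES, which would match getting [C@H] instead of [C@@H] for the reordered file.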
Attachment: Ran1_neworder.sdf