Hi Paolo,

Thank you for the heads-up that removeHs is not honored when not
sanitizing (and that Chem.RemoveHs has to be called explicitly to solve
that issue here).

Now I tried the same molecule, but with the order of the atoms also
changed (attached as Ran1_neworder.sdf), and here I still get a different
isomeric SMILES, even though the chiral tag is the same:
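from rdkit import Chem

# set_correct_Chiral_flags() is not shown here; presumably it is the helper
# from the minimal script attached to the first message of this thread.
for f in ['Ran1.sdf', 'Ran2.sdf', 'Ran1_neworder.sdf']:
    m = Chem.MolFromMolFile(f, sanitize=False)
    # removeHs is ignored when sanitize=False, so strip the hydrogens explicitly
    m = Chem.RemoveHs(m, sanitize=False)
    print(Chem.MolToSmiles(set_correct_Chiral_flags(m), isomericSmiles=True))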


C[C@@H](N)C(=O)O
C[C@@H](N)C(=O)O
C[C@H](N)C(=O)O
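
One possible explanation, stated as an assumption and not confirmed
anywhere in this thread: RDKit's CHI_TETRAHEDRAL_CW / CHI_TETRAHEDRAL_CCW
tags are defined relative to the order of the bonds around the atom, so the
same tag can describe opposite handedness once the atoms (and therefore the
bonds) are reordered. The sketch below is plain Python, not RDKit code; the
helper names and the two neighbor orders are hypothetical and only
illustrate the parity argument.

```python
def permutation_parity(reference, reordered):
    """Return 0 if `reordered` is an even permutation of `reference`, else 1."""
    position = {label: i for i, label in enumerate(reference)}
    perm = [position[label] for label in reordered]
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def same_handedness(tag_a, order_a, tag_b, order_b):
    """Compare two CW/CCW tags that refer to different neighbor orders."""
    # An odd permutation of the neighbors flips the meaning of the tag.
    flipped = permutation_parity(order_a, order_b) == 1
    return (tag_a == tag_b) != flipped


# Hypothetical neighbor orders around the stereocenter in two files.
order_ran1 = ["CH3", "H", "N", "COOH"]
order_reordered = ["N", "H", "CH3", "COOH"]  # CH3 and N swapped

print(same_handedness("CW", order_ran1, "CW", order_reordered))   # False
print(same_handedness("CW", order_ran1, "CCW", order_reordered))  # True
```

If that assumption holds, an identical chiral tag in Ran1.sdf and
Ran1_neworder.sdf would not by itself guarantee identical stereochemistry.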


On Tue, Dec 3, 2019 at 2:58 PM Paolo Tosco <paolo.tosco.m...@gmail.com>
wrote:

> Hi Rasmus,
>
> the problem is that, as stated in the rdmolfiles.MolFromMolFile() docs,
> the removeHs option is only honored when sanitize is True.
>
> So to obtain sensible results without sanitizing you should rather do
> something like:
>
> m1 = Chem.MolFromMolFile('Ran1.sdf', sanitize=False)
> m1 = Chem.RemoveHs(m1, sanitize=False)
> print(Chem.MolToSmiles(set_correct_Chiral_flags(m1), isomericSmiles=True))
> m2 = Chem.MolFromMolFile('Ran2.sdf', sanitize=False)
> m2 = Chem.RemoveHs(m2, sanitize=False)
> print(Chem.MolToSmiles(set_correct_Chiral_flags(m2), isomericSmiles=True))
>
> You may check the individual sanitization operations here:
>
> https://www.rdkit.org/docs/source/rdkit.Chem.rdmolops.html?highlight=rdmolops%20sanitizeflags#rdkit.Chem.rdmolops.SanitizeFlags
>
> Cheers,
> p.
>
> On 03/12/2019 12:46, Rasmus "Termo" Lundsgaard wrote:
>
> Hi all
>
> I would like to avoid sanitizing the sdf files, as information in these
> files should be seen as the ground truth.
>
> However, I have some problems figuring out how to read and set chiral
> information from the file and have RDKit behave consistently.
> Attached are two SDF files with no 3D information and only stereo
> information in the atoms section for R-alanine. The only difference, as
> far as I can see, is the order of the lines of the bond information.
> Even so, I get two different SMILES back with isomeric information when
> not sanitizing.
>
> Attached is also the minimal Python code, which for me at least outputs:
>
> not setting chiral flags
>> CC(N)C(=O)O
>> CC(N)C(=O)O
>>
>> setting chiral flags
>> [H]OC(=O)[C@]([H])(N([H])[H])C([H])([H])[H]
>> [H]OC(=O)[C@@]([H])(N([H])[H])C([H])([H])[H]
>>
>> setting chiral flags and sanitize
>> C[C@@H](N)C(=O)O
>> C[C@@H](N)C(=O)O
>>
>
> Any ideas as to why this happens and how I can handle it strictly? Also,
> what exactly does sanitizing do?
>
> Regards Rasmus
>
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>

Attachment: Ran1_neworder.sdf

