Hi Adelene,

I have updated the gist

https://gist.github.com/ptosco/1e1c23ad24c90444993fa1db21ccb48b

to account for your questions.

Cheers,
p.

On Tue, Oct 20, 2020 at 2:08 PM Adelene LAI <adelene....@uni.lu> wrote:

> Hi Dave and Pablo,
>
>
> Thanks for your helpful replies.
>
>
> @Dave, issue created: https://github.com/rdkit/rdkit/issues/3514
>
>
> @Pablo, your gist shows that the internal representation of the mol does
> indeed factor in undefined stereo, contrary to the way it is depicted.
>
>
> But why then does this happen when I check if the 2 molecules are the same?
>
>
> smi = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
> isosmi = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O')
> print(smi == isosmi)                   # True, expect False
> print(smi.HasSubstructMatch(isosmi))   # True, expect False
> print(isosmi.HasSubstructMatch(smi))   # True, expect False
> print(smi.HasSubstructMatch(isosmi) and isosmi.HasSubstructMatch(smi))  # True, expect False
>
>
> However, converting smi and isosmi to canonical smiles and comparing them
> gives False, as expected:
>
> a = Chem.CanonSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
> b = Chem.CanonSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O')
> a == b       # False
>
>
> (If there are better ways to check if 2 molecules are equal, I'd be
> interested to know.)
>
> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/9DF05ED7-A30E-4742-A568-9B3995689382%40dalkescientific.com/#msg29882815
> ?
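
On comparing two molecules: as far as I know, HasSubstructMatch() ignores
stereochemistry unless useChirality=True is passed, so a match in both
directions is expected for this pair. One stereo-aware alternative is to
compare canonical SMILES written from the Mol objects themselves. The sketch
below is only an illustration under that assumption; same_structure and its
use_stereo flag are names made up here, not RDKit functions.

```python
from rdkit import Chem

plain = Chem.MolFromSmiles(
    "CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O")
defined = Chem.MolFromSmiles(
    "CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O")


def same_structure(m1, m2, use_stereo=True):
    """Hypothetical helper: compare two Mol objects via canonical SMILES.

    With use_stereo=True the SMILES carry stereo marks, so a molecule with
    an assigned double bond differs from its unassigned counterpart.
    """
    smi1 = Chem.MolToSmiles(m1, isomericSmiles=use_stereo)
    smi2 = Chem.MolToSmiles(m2, isomericSmiles=use_stereo)
    return smi1 == smi2


print(same_structure(plain, defined))                    # False
print(same_structure(plain, defined, use_stereo=False))  # True
```

Switching use_stereo off reduces the check to connectivity only, which is
handy when deciding whether two records differ just in their stereo
annotation.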
>
>
> Adelene
>
>
>
>
>
> Doctoral Researcher
>
> Environmental Cheminformatics
>
> UNIVERSITÉ DU LUXEMBOURG
>
>
> LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
>
> 6, avenue du Swing, L-4367 Belvaux
>
> T +356 46 66 44 67 18
>
> [image: github.png] adelenelai
>
>
>
>
>
> ------------------------------
> *From:* Paolo Tosco <paolo.tosco.m...@gmail.com>
> *Sent:* Tuesday, October 20, 2020 1:52:12 PM
> *To:* Adelene LAI
> *Cc:* rdkit-discuss
> *Subject:* Re: [Rdkit-discuss] How to preserve undefined stereochemistry?
>
> Hi Adelene,
>
> this gist
>
> https://gist.github.com/ptosco/1e1c23ad24c90444993fa1db21ccb48b
>
> shows how to add stereo annotations to RDKit 2D depictions, and also how
> to access the double bond stereochemistry programmatically.
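
For reference, a minimal sketch of a depiction with stereo labels switched
on through the addStereoAnnotation drawing option. It is only an
illustration and not necessarily what the gist does;
draw_with_stereo_labels is a made-up helper name.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D

mol = Chem.MolFromSmiles(
    "CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O")


def draw_with_stereo_labels(mol, width=400, height=300):
    """Hypothetical helper: return an SVG string with stereo labels drawn."""
    drawer = rdMolDraw2D.MolDraw2DSVG(width, height)
    # Ask the drawer to annotate assigned stereo (E/Z and CIP labels).
    drawer.drawOptions().addStereoAnnotation = True
    rdMolDraw2D.PrepareAndDrawMolecule(drawer, mol)
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


svg = draw_with_stereo_labels(mol)

# The assigned double bond can also be read back programmatically.
assigned = [(b.GetIdx(), str(b.GetStereo())) for b in mol.GetBonds()
            if b.GetStereo() != Chem.BondStereo.STEREONONE]
print(assigned)
```

The SVG text can be written to a file or shown in a notebook; only bonds
and atoms that actually carry an assignment get a label.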
>
> Cheers,
> p.
>
>
> On Tue, Oct 20, 2020 at 12:24 PM Adelene LAI <adelene....@uni.lu> wrote:
>
>> Hi RDKit Community,
>>
>>
>> Is there a way to preserve undefined stereochemistry aka unspecified
>> stereochemistry when doing MolFromSmiles?
>>
>> I'm working with a bunch of molecules, some with stereochemistry defined,
>> some without.
>>
>>
>> If stereochemistry is undefined in the SMILES, I would like it to stay
>> that way when converted to a Mol, but this doesn't seem to be the case:
>>
>>
>> > mol = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
>> > mol
>>
>> One would expect that C=C to be either crossed, as in PubChem's depiction:
>>
>> https://pubchem.ncbi.nlm.nih.gov/compound/139598257#section=2D-Structure
>>
>>
>>
>> or that single bond to be squiggly, as in CDK's depiction:
>>
>> But it's not just a matter of depiction: internally, mol seems to be
>> equivalent to its stereo-specific sibling (the Entgegen, i.e. E, form):
>>
>>
>> CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O
>>
>>
>>
>> I've tried sanitize=False, but it doesn't seem to have any effect. I
>> would prefer not having to manually SetStereo(Chem.BondStereo.STEREOANY)
>> for every molecule with undefined stereochem (not sure how I would even go
>> about that...).
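
A rough sketch of how double bonds with unspecified stereo could be picked
out without touching each bond by hand. It assumes
Chem.FindPotentialStereoBonds() behaves as documented, i.e. it marks
candidate double bonds that carry no assignment as STEREOANY and leaves
assigned ones alone. unspecified_double_bonds is a hypothetical helper, and
it works on a copy so the input molecule stays untouched.

```python
from rdkit import Chem

plain = Chem.MolFromSmiles(
    "CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O")
defined = Chem.MolFromSmiles(
    "CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O")


def unspecified_double_bonds(mol):
    """Hypothetical helper: indices of double bonds that could be E/Z
    but have no stereo assignment."""
    work = Chem.Mol(mol)  # copy, so the caller's molecule is not modified
    Chem.FindPotentialStereoBonds(work)
    return [bond.GetIdx() for bond in work.GetBonds()
            if bond.GetStereo() == Chem.BondStereo.STEREOANY]


print(unspecified_double_bonds(plain))
print(unspecified_double_bonds(defined))
```

Dropping the copy and calling Chem.FindPotentialStereoBonds() on the
molecule itself would keep the STEREOANY marks on it, which may be closer
to what you are after.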
>>
>>
>> Possibly related to:
>>
>> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/C00BE94F-6F6F-466A-83D4-3045C9006026%40gmail.com/#msg34929570
>>
>> https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAHOi4k3revAu-9qhFt0MpUpr0aADQ9d8bV2XT6FurTEKimCQng%40mail.gmail.com/#msg36365128
>>
>> https://www.rdkit.org/docs/source/rdkit.Chem.EnumerateStereoisomers.html
>>
>> https://github.com/openforcefield/openforcefield/issues/146
>>
>>
>>
>>
>> Any help would be much appreciated.
>>
>>
>> Thanks,
>>
>> Adelene
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>