Hi RDKit Community,

Is there a way to preserve undefined (i.e. unspecified) stereochemistry when 
calling MolFromSmiles?


I'm working with a bunch of molecules, some with stereochemistry defined, some 
without.


If stereochemistry is undefined in the SMILES, I would like it to stay that way 
when converted to a Mol, but this doesn't seem to be the case:


> mol = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
> mol

[image: RDKit's depiction of mol]

One would expect that C=C either to be drawn crossed, as in PubChem's depiction:

https://pubchem.ncbi.nlm.nih.gov/compound/139598257#section=2D-Structure



or that single bond to be squiggly, as in CDK's depiction:

[https://www.simolecule.com/cdkdepict/depict/bow/svg?smi=CC(C)(C1%3DCC(%3DC(C(%3DC1)Br)O)Br)C(%3DCC(C(%3DO)O)Br)CC(%3DO)O&w=80&h=50&abbr=on&hdisp=bridgehead&showtitle=false&zoom=1.6&annotate=none]

But it's not just a matter of depiction: internally, mol seems to be 
equivalent to its stereo-specific sibling (the E, or entgegen, form):


CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O
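One way to check this beyond the drawing is to look at the stereo flag RDKit stores on each double bond. The following is a minimal sketch, assuming a standard RDKit install; the helper name double_bond_stereo is made up for illustration and is not part of RDKit.

```python
from rdkit import Chem


def double_bond_stereo(smiles):
    """Return (bond index, stereo flag) for each non-aromatic double bond.

    Hypothetical helper for illustration only; not an RDKit function.
    """
    mol = Chem.MolFromSmiles(smiles)
    return [
        (bond.GetIdx(), bond.GetStereo())
        for bond in mol.GetBonds()
        if bond.GetBondType() == Chem.BondType.DOUBLE
    ]


unspecified = 'CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O'
specified = 'CC(C)(C1=CC(=C(C(=C1)Br)O)Br)/C(=C/C(C(=O)O)Br)/CC(=O)O'

# Print the flags for both inputs so they can be compared side by side.
for smi in (unspecified, specified):
    print(double_bond_stereo(smi))
```

If the two inputs report different flags on the C=C bond, the molecules are not equivalent internally and the difference lies only in how the image is drawn.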



I've tried sanitize=False, but it doesn't seem to have any effect. I would 
prefer not having to manually SetStereo(Chem.BondStereo.STEREOANY) for every 
molecule with undefined stereochem (not sure how I would even go about that...).
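For what it's worth, the manual route could perhaps be automated along these lines. This is only a sketch, assuming an RDKit version that provides Chem.FindPotentialStereo; the function name mark_unspecified_double_bonds is hypothetical, and whether STEREOANY is the right flag for a given downstream use is an assumption.

```python
from rdkit import Chem


def mark_unspecified_double_bonds(mol):
    """Flag double bonds that could be E/Z but carry no label as STEREOANY.

    Hypothetical helper for illustration only; modifies mol in place and
    returns the indices of the bonds it changed.
    """
    marked = []
    for info in Chem.FindPotentialStereo(mol):
        is_double_bond = info.type == Chem.StereoType.Bond_Double
        is_unspecified = info.specified == Chem.StereoSpecified.Unspecified
        if is_double_bond and is_unspecified:
            # For bond stereo elements, centeredOn is the bond index.
            bond = mol.GetBondWithIdx(info.centeredOn)
            bond.SetStereo(Chem.BondStereo.STEREOANY)
            marked.append(bond.GetIdx())
    return marked


mol = Chem.MolFromSmiles('CC(C)(C1=CC(=C(C(=C1)Br)O)Br)C(=CC(C(=O)O)Br)CC(=O)O')
print(mark_unspecified_double_bonds(mol))
```

Bonds that already carry an E/Z label are left alone, so the same call could be applied to a mixed set of molecules. Note that plain SMILES has no notation for "either" double bonds, so the flag would probably not survive being written back out as SMILES.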


Possibly related to:

https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/C00BE94F-6F6F-466A-83D4-3045C9006026%40gmail.com/#msg34929570


https://sourceforge.net/p/rdkit/mailman/rdkit-discuss/thread/CAHOi4k3revAu-9qhFt0MpUpr0aADQ9d8bV2XT6FurTEKimCQng%40mail.gmail.com/#msg36365128
o = Chem.MolFromSmiles('C/C=C/C')

https://www.rdkit.org/docs/source/rdkit.Chem.EnumerateStereoisomers.html

https://github.com/openforcefield/openforcefield/issues/146




Any help would be much appreciated.


Thanks,

Adelene







Doctoral Researcher
Environmental Cheminformatics
UNIVERSITÉ DU LUXEMBOURG

LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
6, avenue du Swing, L-4367 Belvaux
T +356 46 66 44 67 18
GitHub: adelenelai




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
