I'm working on a translation layer between Schrodinger structures and RDKit
mols. Schrodinger structures do not have implicit hydrogens, so I'm
struggling a bit to understand how best to treat potentially implicit
hydrogens!

What is the correct treatment of bond stereochemistry at centers for which
a hydrogen is required in order to specify the bond stereochemistry? For
example, an imine with a hydrogen substituent (trivial example, F/C=N/[H]).

I notice that when I use the smiles constructor, or if I read from an SDF
file using the SDMolSupplier, the C=N bond in the example shown above is
not recognized as having stereochemistry. However, if I use
removeHs=False in the SDMolSupplier, the bond *is* recognized as Z.
Maybe that can be presented more clearly as code (here's an interactive
Python shell session; I've also attached it as a script, along with an SDF
file).

Python 3.6.2 (default, Jul 21 2017, 13:21:26)
[GCC 4.9.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import rdkit
>>> print(rdkit.__version__)
2017.03.1
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> from rdkit.Chem import rdmolops
>>> def summarize(mol):
...  bond = mol.GetBondBetweenAtoms(0, 1)
...  atoms = list(bond.GetStereoAtoms())
...  atoms.insert(1, bond.GetEndAtom().GetIdx())
...  atoms.insert(1, bond.GetBeginAtom().GetIdx())
...  print(Chem.MolToSmiles(mol, isomericSmiles=True))
...  print(bond.GetStereo(), atoms)
...
>>> has_h = next(Chem.SDMolSupplier('cis_imine.sdf', removeHs=False))
>>> no_h = rdmolops.RemoveHs(has_h)
>>> has_h_again = rdmolops.AddHs(no_h)
>>> summarize(has_h)
[H]/N=C(/[H])F
STEREOZ [3, 0, 1, 2]
>>> summarize(no_h)
N=CF
STEREOZ [1, 0]
>>> summarize(has_h_again)
[H]N=C([H])F
STEREOZ [1, 0]
>>> AllChem.EmbedMolecule(has_h)
0
>>> AllChem.EmbedMolecule(no_h)
0
>>> AllChem.EmbedMolecule(has_h_again)
Fatal Python error: Segmentation fault

Current thread 0x00007faa949d8740 (most recent call first):
  File "<stdin>", line 1 in <module>
Segmentation fault

*At its core, I have two questions:* Is RDKit able to represent
stereochemistry about this bond if the hydrogen is implicit? It's fine if
not, I just want to know. If RDKit can represent stereochemistry for bonds
for which one substituent is hydrogen, what do I need to provide to RDKit
differently?

- dan nealschneider

(né wandschneider)

Senior Developer
Schrödinger, Inc
Portland, OR

Attachment: cis_imine.sdf
Description: Binary data

"""
Demonstrate my questions about bonds whose stereochemistry is specified
based on a hydrogen, especially when that hydrogen is made implicit.

"""
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdmolops

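# Read the molecule with its explicit hydrogens kept as atoms.
has_h = next(Chem.SDMolSupplier('cis_imine.sdf', removeHs=False))


def summarize(mol, a0=0, a1=1):
    """Print the isomeric SMILES and the stereo state of the a0-a1 bond."""
    bond = mol.GetBondBetweenAtoms(a0, a1)
    # When stereo atoms are present, this yields
    # [stereo atom, begin, end, stereo atom].
    atoms = list(bond.GetStereoAtoms())
    atoms.insert(1, bond.GetEndAtom().GetIdx())
    atoms.insert(1, bond.GetBeginAtom().GetIdx())
    print(Chem.MolToSmiles(mol, isomericSmiles=True))
    print(bond.GetStereo(), atoms)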

no_h = rdmolops.RemoveHs(has_h)
has_h_again = rdmolops.AddHs(no_h)

print(rdkit.__version__)
summarize(has_h)
summarize(no_h)
summarize(has_h_again)
AllChem.EmbedMolecule(has_h)
AllChem.EmbedMolecule(no_h)
# This generates a SEGV in my hands. Totalview says it happened in
# _ZN5RDKit12DGeomHelpers14_getAtomStereoEPKNS_4BondEjj, but I
# can't find a getAtomStereo or 2DGeomHelpers in RDKit's github.
AllChem.EmbedMolecule(has_h_again)
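
A side note on the symbol quoted in the comment above: it looks like an
Itanium-ABI mangled C++ name, in which every identifier is preceded by its
length in decimal. If that reading is right, the leading "12" and "14" are
length prefixes rather than part of the names, so the function to search for
would be _getAtomStereo in the RDKit::DGeomHelpers namespace. The sketch below
is a minimal illustration under that assumption. The helper name
nested_name_parts is hypothetical, it only handles the simple nested-name
prefix, and it ignores the argument-type suffix entirely.

```python
def nested_name_parts(symbol):
    """Split the nested-name part of an Itanium-mangled symbol.

    Hypothetical helper for illustration only: it reads the sequence of
    <decimal length><identifier> pairs that follows the "_ZN" prefix and
    stops at the first character that is not a digit.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a mangled nested name: %r" % symbol)
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        # Read the decimal length that precedes each identifier.
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        # The identifier is exactly `length` characters long.
        parts.append(symbol[end:end + length])
        pos = end + length
    return parts


symbol = "_ZN5RDKit12DGeomHelpers14_getAtomStereoEPKNS_4BondEjj"
print("::".join(nested_name_parts(symbol)))
# prints: RDKit::DGeomHelpers::_getAtomStereo
```

A tool such as c++filt should give the full signature, including the
argument types that this sketch skips.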

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
