I'm working on a translation layer between Schrodinger structures and RDKit mols. Schrodinger structures do not have implicit hydrogens, so I'm struggling a bit to understand how best to treat potentially implicit hydrogens!
What is the correct treatment of bond stereochemistry at centers for which
a hydrogen is required in order to specify the bond stereochemistry? For
example, an imine with a hydrogen substituent (trivial example, F/C=N/[H]).
I notice that when I use the smiles constructor, or if I read from an SDF
file using the SDMolSupplier, the C=N bond in the example shown above is
not recognized as having stereochemistry. However, if I use
removeHs=False in the SDMolSupplier, the bond *is* recognized as Z.
Maybe that can be presented more clearly as code (here's an interactive
Python shell; I've also attached this as a script, as well as an SDF file).
Python 3.6.2 (default, Jul 21 2017, 13:21:26)
[GCC 4.9.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import rdkit
>>> print(rdkit.__version__)
2017.03.1
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> from rdkit.Chem import rdmolops
>>> def summarize(mol):
...     bond = mol.GetBondBetweenAtoms(0, 1)
...     atoms = list(bond.GetStereoAtoms())
...     atoms.insert(1, bond.GetEndAtom().GetIdx())
...     atoms.insert(1, bond.GetBeginAtom().GetIdx())
...     print(Chem.MolToSmiles(mol, isomericSmiles=True))
...     print(bond.GetStereo(), atoms)
...
>>> has_h = next(Chem.SDMolSupplier('cis_imine.sdf', removeHs=False))
>>> no_h = rdmolops.RemoveHs(has_h)
>>> has_h_again = rdmolops.AddHs(no_h)
>>> summarize(has_h)
[H]/N=C(/[H])F
STEREOZ [3, 0, 1, 2]
>>> summarize(no_h)
N=CF
STEREOZ [1, 0]
>>> summarize(has_h_again)
[H]N=C([H])F
STEREOZ [1, 0]
>>> AllChem.EmbedMolecule(has_h)
0
>>> AllChem.EmbedMolecule(no_h)
0
>>> AllChem.EmbedMolecule(has_h_again)
Fatal Python error: Segmentation fault
Current thread 0x00007faa949d8740 (most recent call first):
File "<stdin>", line 1 in <module>
Segmentation fault
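The SMILES-constructor case mentioned above can be checked the same way. This is only a minimal sketch: `bond_stereo_from_smiles` is a helper name made up for illustration, and it assumes an RDKit release that provides `Chem.SmilesParserParams` with a `removeHs` attribute. I am deliberately not quoting any output, because what comes back depends on the RDKit version.

```python
from rdkit import Chem


def bond_stereo_from_smiles(smiles, remove_hs, begin_idx=1, end_idx=2):
    """Return (stereo label, stereo atom indices) for one bond of a SMILES.

    Hypothetical helper for illustration only. The default indices 1 and 2
    are the C and N atoms of F/C=N/[H] (atoms are numbered in SMILES order).
    """
    params = Chem.SmilesParserParams()
    params.removeHs = remove_hs  # keep or strip explicit [H] atoms on parsing
    mol = Chem.MolFromSmiles(smiles, params)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    bond = mol.GetBondBetweenAtoms(begin_idx, end_idx)
    return str(bond.GetStereo()), list(bond.GetStereoAtoms())


if __name__ == "__main__":
    # Compare the C=N bond with the hydrogen kept and with it stripped.
    for remove_hs in (False, True):
        print(remove_hs, bond_stereo_from_smiles("F/C=N/[H]", remove_hs))
```

Comparing the two printed lines shows whether the stereo label and the stereo atoms survive when the explicit hydrogen is dropped at parse time.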
*At core, I have 2 questions:* Is RDKit able to represent stereochemistry
about this bond if the hydrogen is implicit? It's fine if not, I just want
to know. If RDKit can represent stereochemistry for bonds for which one
substituent is hydrogen, what different information do I need to provide
RDKit?
- dan nealschneider
(né wandschneider)
Senior Developer
Schrödinger, Inc
Portland, OR
"""
Demonstrate my questions about bonds whose stereochemistry is specified
based on a hydrogen, especially when that hydrogen is made implicit.
"""
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import rdmolops
has_h = next(Chem.SDMolSupplier('cis_imine.sdf', removeHs=False))
def summarize(mol, a0=0, a1=1):
    bond = mol.GetBondBetweenAtoms(a0, a1)
    atoms = list(bond.GetStereoAtoms())
    atoms.insert(1, bond.GetEndAtom().GetIdx())
    atoms.insert(1, bond.GetBeginAtom().GetIdx())
    print(Chem.MolToSmiles(mol, isomericSmiles=True))
    print(bond.GetStereo(), atoms)
no_h = rdmolops.RemoveHs(has_h)
has_h_again = rdmolops.AddHs(no_h)
print(rdkit.__version__)
summarize(has_h)
summarize(no_h)
summarize(has_h_again)
AllChem.EmbedMolecule(has_h)
AllChem.EmbedMolecule(no_h)
# This generates a SEGV in my hands. Totalview says it happened in
# _ZN5RDKit12DGeomHelpers14_getAtomStereoEPKNS_4BondEjj, but I
# can't find a getAtomStereo or 2DGeomHelpers in RDKit's github.
AllChem.EmbedMolecule(has_h_again)
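A note on the symbol in the comment above: in Itanium C++ ABI name mangling, each identifier is preceded by its length in decimal, so the "12" in front of "DGeomHelpers" is a length prefix and not part of the name. The sketch below splits only the simple `_ZN<len><name>...E` form seen here; `split_nested_name` is a made-up helper, and it ignores templates, substitutions, and the parameter list that follows the `E`.

```python
def split_nested_name(mangled):
    """Split the nested-name part of a simple Itanium-mangled symbol.

    Illustrative helper only: handles `_ZN<len><name>...E` and stops at the
    first character that is not a length digit.
    """
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a simple nested mangled name")
    pos = len(prefix)
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        # Read the decimal length prefix.
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        # The identifier is the next `length` characters.
        parts.append(mangled[end:end + length])
        pos = end + length
    return parts


symbol = "_ZN5RDKit12DGeomHelpers14_getAtomStereoEPKNS_4BondEjj"
print("::".join(split_nested_name(symbol)))
# prints: RDKit::DGeomHelpers::_getAtomStereo
```

So the name to search for in the RDKit sources would be `_getAtomStereo` in a `DGeomHelpers` namespace, with a leading underscore and without the "2".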
Rdkit-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

