On Nov 6, 2019, at 16:32, Ivan Tubert-Brohman
<[email protected]> wrote:
> For reasons too complicated to get into here, I ended up with a molecule
> containing a =CH2 in which one of the hydrogens was explicit and had E/Z
> stereo info. For example, consider [H]/C=C/F.
FWIW, I just ran into the same issue.
In my case, I'm using one of my favorite techniques - SMILES manipulation - to
replace terminal atoms with a hydrogen. [1]
I thought I could replace F/C=C/Cl with [H]/C=C/Cl to delete the F, and was
surprised to see the '[H]' in place after re-canonicalization.
I would like some way to get rid of it. I ended up using a manual version of
the transform that Ivan recommended.
Andrew
[email protected]
[1] Why? I find it hard to remove an atom correctly in the RDKit while preserving
stereochemistry. Here I'll take a SMILES string and re-create it so the 6th
atom of the input SMILES is the first atom of the output SMILES (the input
SMILES is canonical):
>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles(r"CC(=O)/C=C(\O)c1cccnc1")
>>> Chem.MolToSmiles(mol, rootedAtAtom=5)
'O/C(=C\\C(C)=O)c1cccnc1'
>>> Chem.CanonSmiles('O/C(=C\\C(C)=O)c1cccnc1')
'CC(=O)/C=C(\\O)c1cccnc1'
If I replace the first atom term, "O", with "[H]", the double bond
stereochemistry is preserved in the re-canonicalized output:
>>> Chem.CanonSmiles('[H]/C(=C\\C(C)=O)c1cccnc1')
'CC(=O)/C=C/c1cccnc1'
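The string edit itself can be sketched in plain Python. This is only a minimal illustration, not part of the RDKit: the helper name replace_first_atom_with_h is hypothetical, and it assumes the input SMILES was written with the terminal atom first (for example via rootedAtAtom) and that the atom is joined by a single bond and has no ring closures or branches.

```python
# Hypothetical helper (not an RDKit function): swap the leading atom term of a
# SMILES string for "[H]". Assumes the first atom is terminal and singly bonded.
TWO_LETTER_ORGANIC = ("Cl", "Br")
ONE_LETTER_ORGANIC = set("BCNOPSFI") | set("bcnops")


def replace_first_atom_with_h(smiles: str) -> str:
    if not smiles:
        raise ValueError("empty SMILES")
    if smiles[0] == "[":
        # Bracket atom such as [O-] or [NH3+]: the term ends at the first "]".
        end = smiles.find("]")
        if end == -1:
            raise ValueError("unterminated bracket atom")
        atom_len = end + 1
    elif smiles.startswith(TWO_LETTER_ORGANIC):
        atom_len = 2
    elif smiles[0] in ONE_LETTER_ORGANIC:
        atom_len = 1
    else:
        raise ValueError("unrecognized first atom term: %r" % smiles[:2])
    rest = smiles[atom_len:]
    # A ring closure, branch, or multiple bond right after the first atom means
    # it cannot simply be swapped for a hydrogen, so refuse.
    if rest and rest[0] in "0123456789%(=#$:":
        raise ValueError("first atom is not a singly bonded terminal atom")
    return "[H]" + rest


print(replace_first_atom_with_h(r"O/C(=C\C(C)=O)c1cccnc1"))
```

For the rooted SMILES above this prints [H]/C(=C\C(C)=O)c1cccnc1, which is the string passed to Chem.CanonSmiles in the previous step. Any "/" or "\" after the first atom is left untouched, which is the point of doing the edit at the string level.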
However, if I use graph edit methods to remove that same atom, I no longer have
double bond stereochemistry:
>>> rwmol = Chem.RWMol(mol)
>>> rwmol.RemoveAtom(5)
>>> Chem.MolToSmiles(rwmol)
'CC(=O)C=Cc1cccnc1'
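The likely cause is that the stereochemistry information isn't specified on all
of the bonds around the double bond: in the input SMILES the only "\" on that
end is on the bond to the O being removed. Consider that I can add a "/" and
still get the same canonical SMILES: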
>>> Chem.CanonSmiles(r"CC(=O)/C=C(\O)c1cccnc1") # start
'CC(=O)/C=C(\\O)c1cccnc1'
>>> # add an extra "/" to "c1cccnc1"
>>> Chem.CanonSmiles(r"CC(=O)/C=C(\O)/c1cccnc1")
'CC(=O)/C=C(\\O)c1cccnc1'
>>> Chem.CanonSmiles(r"CC(=O)/C=C(O)/c1cccnc1") # remove the "\" from "\O"
'CC(=O)/C=C(\\O)c1cccnc1'
>>> # change the O to [H] to get what I expected
>>> Chem.CanonSmiles(r"CC(=O)/C=C([H])/c1cccnc1")
'CC(=O)/C=C/c1cccnc1'
More specifically, it matches what I get from graph operations:
>>> mol2 = Chem.MolFromSmiles(r"CC(=O)/C=C(O)/c1cccnc1")
>>> rwmol2 = Chem.RWMol(mol2)
>>> rwmol2.RemoveAtom(5)
>>> Chem.MolToSmiles(rwmol2)
'CC(=O)/C=C/c1cccnc1'
I don't know how to do this programmatically with the RDKit API.