Hi,

For reasons too complicated to get into here, I ended up with a molecule
containing a =CH2 in which one of the hydrogens was explicit and had E/Z
stereo info. For example, consider [H]/C=C/F.

I was surprised that RemoveHs() refused to remove the hydrogen, although
later I found that that's the documented behavior, and generally it makes
sense as a way to prevent the loss of stereochemical information.

For example, compare these two:

In [7]: Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]/C=C/F')))
Out[7]: '[H]/C=C/F'

In [8]: Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]C=C/F')))
Out[8]: 'C=CF'

A chemist would say that these two are obviously the same molecule, and
arguably the second representation is better, because a double bond ending
in =CH2 can't have geometric isomers. Maybe it's unreasonable to expect
RDKit to make that kind of inference, but still I wonder, what would be a
good automated way to get from [H]/C=C/F to C=CF?

One idea is to add a "=CH2 cleanup" step, perhaps implemented by applying
this reaction:

    [H][C&h1:1]=[*:2]>>[CH2:1]=[*:2]

but perhaps there's a better way?
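
For comparison, here is a rough sketch of an alternative that edits the bonds directly instead of running a reaction. It is only an illustration, not something RDKit provides: the helper names has_two_plain_hs and remove_hs_ignoring_ch2_stereo are made up, and it assumes that RemoveHs() keeps the hydrogen only because of the double bond's stereo flag or the direction mark on the H bond. Clearing both on any double bond that ends in =CH2 should then let RemoveHs() do the rest.

```python
from rdkit import Chem


def has_two_plain_hs(atom):
    """True if the atom carries exactly two non-isotopic hydrogens,
    counting implicit Hs as well as explicit [H] neighbor atoms."""
    explicit_h_atoms = sum(
        1
        for nbr in atom.GetNeighbors()
        if nbr.GetAtomicNum() == 1 and nbr.GetIsotope() == 0
    )
    # GetTotalNumHs() without arguments does not count neighbor H atoms,
    # so the two numbers can be added without double counting.
    return atom.GetTotalNumHs() + explicit_h_atoms == 2


def remove_hs_ignoring_ch2_stereo(mol):
    """Hypothetical helper: drop E/Z marks on double bonds ending in =CH2,
    then call RemoveHs() on a copy of the molecule."""
    mol = Chem.Mol(mol)  # work on a copy, leave the input untouched
    for bond in mol.GetBonds():
        if bond.GetBondType() != Chem.BondType.DOUBLE:
            continue
        for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
            if not has_two_plain_hs(atom):
                continue
            # A =CH2 end cannot give geometric isomers, so clear the flag.
            bond.SetStereo(Chem.BondStereo.STEREONONE)
            # Also clear the '/' or '\' direction on bonds to plain H atoms.
            for nbr in atom.GetNeighbors():
                if nbr.GetAtomicNum() == 1 and nbr.GetIsotope() == 0:
                    h_bond = mol.GetBondBetweenAtoms(atom.GetIdx(), nbr.GetIdx())
                    h_bond.SetBondDir(Chem.BondDir.NONE)
    return Chem.RemoveHs(mol)


mol = Chem.MolFromSmiles('[H]/C=C/F')
print(Chem.MolToSmiles(remove_hs_ignoring_ch2_stereo(mol)))
```

The isotope check is deliberate: something like [2H]/C=C/F does have geometric isomers, so labeled hydrogens are not counted toward the =CH2 test and that stereo is left alone.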

Best,
Ivan
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
