Hi,

For reasons too complicated to get into here, I ended up with a molecule containing a =CH2 in which one of the hydrogens was explicit and had E/Z stereo info. For example, consider [H]/C=C/F.
I was surprised that RemoveHs() refused to remove the hydrogen, although later I found that that's the documented behavior, and generally it makes sense as a way to prevent the loss of stereochemical information. For example, compare these two:

In [7]: Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]/C=C/F')))
Out[7]: '[H]/C=C/F'

In [8]: Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]C=C/F')))
Out[8]: 'C=CF'

A chemist would say that these two are obviously the same molecule, and arguably the second representation is better, because a double bond ending in =CH2 can't have geometric isomers. Maybe it's unreasonable to expect RDKit to make that kind of inference, but still I wonder, what would be a good automated way to get from [H]/C=C/F to C=CF?

One idea is to add a "=CH2 cleanup" step, perhaps implemented by applying this reaction:

[H][C&h1:1]=[*:2]>>[CH2:1]=[*:2]

but perhaps there's a better way?

Best,
Ivan
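For illustration, a "=CH2 cleanup" step could also be sketched without a reaction, by clearing the stereo flags that make RemoveHs() keep the hydrogen. This is only a minimal sketch under assumptions: the helper name strip_ch2_stereo_hs is hypothetical, and it assumes that RemoveHs() keeps the hydrogen because of the double bond's stereo label and the direction on the H-C bond, so resetting both should let the hydrogen be removed. It has not been checked across RDKit versions.

```python
from rdkit import Chem


def strip_ch2_stereo_hs(mol):
    """Hypothetical helper: drop E/Z info on double bonds ending in =CH2,
    then remove hydrogens as usual."""
    mol = Chem.Mol(mol)  # work on a copy, leave the input untouched
    for bond in mol.GetBonds():
        if bond.GetBondType() != Chem.BondType.DOUBLE:
            continue
        for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
            # includeNeighbors=True also counts explicit [H] atoms
            if atom.GetAtomicNum() != 6:
                continue
            if atom.GetTotalNumHs(includeNeighbors=True) != 2:
                continue
            # A =CH2 end cannot carry E/Z isomerism: clear the bond label
            bond.SetStereo(Chem.BondStereo.STEREONONE)
            # Clear the '/' or '\' direction only on bonds to explicit Hs,
            # so directions shared with other double bonds are left alone
            for nbr in atom.GetNeighbors():
                if nbr.GetAtomicNum() == 1:
                    h_bond = mol.GetBondBetweenAtoms(atom.GetIdx(), nbr.GetIdx())
                    h_bond.SetBondDir(Chem.BondDir.NONE)
    return Chem.RemoveHs(mol)


if __name__ == "__main__":
    m = Chem.MolFromSmiles("[H]/C=C/F")
    print(Chem.MolToSmiles(strip_ch2_stereo_hs(m)))
```

If the assumption holds, the example should print C=CF, the same result as Out[8]. Double bonds where neither end is a =CH2 are not touched, so genuine E/Z information elsewhere in the molecule should survive.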