Thank you, Greg and Andrew, for your replies, and I'm glad to hear that this is something that can be fixed within RDKit. I had almost forgotten I had sent this email... :-)
Best,
Ivan

On Wed, Nov 20, 2019 at 12:17 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Ivan,
>
> I agree that there is a bug here, but I think the problem is actually that
> the double bond is being assigned stereochemistry at all in this case.
>
> In [2]: m = Chem.MolFromSmiles('[H]/C=C/F')
>
> In [3]: m.Debug()
> Atoms:
> 0 1 H chg: 0 deg: 1 exp: 1 imp: 0 hyb: 1 arom?: 0 chi: 0
> 1 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 0 chi: 0
> 2 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 0 chi: 0
> 3 9 F chg: 0 deg: 1 exp: 1 imp: 0 hyb: 4 arom?: 0 chi: 0
> Bonds:
> 0 0->1 order: 1 dir: 4 conj?: 0 aromatic?: 0
> 1 1->2 order: 2 stereo: 3 stereoAts: (0 3) conj?: 0 aromatic?: 0
> 2 2->3 order: 1 dir: 4 conj?: 0 aromatic?: 0
>
> Given that the two substituents on the first C are the same, the double
> bond shouldn't be marked as STEREOE at all.
>
> I'll get this fixed.
> -greg
>
> On Wed, Nov 6, 2019 at 4:34 PM Ivan Tubert-Brohman <
> ivan.tubert-broh...@schrodinger.com> wrote:
>
>> Hi,
>>
>> For reasons too complicated to get into here, I ended up with a molecule
>> containing a =CH2 in which one of the hydrogens was explicit and had E/Z
>> stereo info. For example, consider [H]/C=C/F.
>>
>> I was surprised that RemoveHs() refused to remove the hydrogen, although
>> I later found that this is the documented behavior, and generally it makes
>> sense as a way to prevent the loss of stereochemical information.
>>
>> For example, compare these two:
>>
>> In [7]: Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]/C=C/F')))
>> Out[7]: '[H]/C=C/F'
>>
>> In [8]: Chem.MolToSmiles(Chem.RemoveHs(Chem.MolFromSmiles('[H]C=C/F')))
>> Out[8]: 'C=CF'
>>
>> A chemist would say that these two are obviously the same molecule, and
>> arguably the second representation is better, because a double bond ending
>> in =CH2 can't have geometric isomers. Maybe it's unreasonable to expect
>> RDKit to make that kind of inference, but still I wonder, what would be a
>> good automated way to get from [H]/C=C/F to C=CF?
>>
>> One idea is to add a "=CH2 cleanup" step, perhaps implemented by applying
>> this reaction:
>>
>> [H][C&h1:1]=[*:2]>>[CH2:1]=[*:2]
>>
>> but perhaps there's a better way?
>>
>> Best,
>> Ivan
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
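For anyone who needs a stopgap on a release without the fix, here is a rough sketch of the "=CH2 cleanup" idea discussed above, done by editing the bonds directly rather than by applying a reaction. The helper names (_plain_h_count, drop_ch2_stereo) are made up for this example and are not part of RDKit. The sketch assumes that an atom carrying two or more ordinary (non-isotopic) hydrogens cannot be a stereogenic double-bond end, and it parses with removeHs=False only so that the explicit H is still present on releases that would otherwise strip it while parsing. Behavior may differ between RDKit versions, so treat this as a starting point rather than a tested recipe.

```python
from rdkit import Chem


def _plain_h_count(atom):
    """Count ordinary hydrogens on an atom (hypothetical helper).

    Includes Hs that are not graph atoms (implicit/explicit counts) plus
    neighboring H atoms with no isotope label, so [2H] is not counted.
    """
    count = atom.GetTotalNumHs()
    for nbr in atom.GetNeighbors():
        if nbr.GetAtomicNum() == 1 and nbr.GetIsotope() == 0:
            count += 1
    return count


def drop_ch2_stereo(mol):
    """Return a copy with E/Z info cleared on =XH2 double bonds and Hs removed."""
    mol = Chem.Mol(mol)  # work on a copy, leave the input untouched
    for bond in mol.GetBonds():
        if bond.GetBondType() != Chem.BondType.DOUBLE:
            continue
        for atom in (bond.GetBeginAtom(), bond.GetEndAtom()):
            if _plain_h_count(atom) < 2:
                continue
            # Two identical H substituents: no geometric isomers possible.
            bond.SetStereo(Chem.BondStereo.STEREONONE)
            # Also drop the / and \ marks on the single bonds at this end.
            for nbr_bond in atom.GetBonds():
                if nbr_bond.GetBondType() == Chem.BondType.SINGLE:
                    nbr_bond.SetBondDir(Chem.BondDir.NONE)
    return Chem.RemoveHs(mol)


# Keep the explicit H at parse time so the cleanup has something to do.
params = Chem.SmilesParserParams()
params.removeHs = False
mol = Chem.MolFromSmiles('[H]/C=C/F', params)
print(Chem.MolToSmiles(drop_ch2_stereo(mol)))
```

Double bonds whose ends each carry at most one hydrogen are left alone, so a molecule such as C/C=C/F should keep its E/Z specification.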