That does seem like a bug. You can also see it without involving DeleteSubstructs, by starting from different SMILES representations of the same molecule:

>>> from rdkit import Chem
>>> m1 = Chem.MolFromSmiles('FC12C3CCCC1C32F')
>>> m2 = Chem.MolFromSmiles('C12C3CCCC1C32')
>>> m3 = Chem.MolFromSmiles('C1CC2C3C(C1)C23')
>>> Chem.MolToSmiles(m2) == Chem.MolToSmiles(m3)
True
>>> m1.GetSubstructMatch(m2)
(1, 2, 3, 4, 5, 6, 7)
>>> m1.GetSubstructMatch(m3)
()

Note that if you parse the problem SMILES as a SMARTS, you do get a match:

>>> m4 = Chem.MolFromSmarts('C1CC2C3C(C1)C23')
>>> m1.GetSubstructMatch(m4)
(4, 3, 2, 1, 6, 5, 7)

Another interesting bit is that while the InChIs of m2 and m3 are also the same, the conversion produces a warning about stereochemistry:

>>> Chem.MolToInchi(m2) == Chem.MolToInchi(m3)
[18:26:48] WARNING: Omitted undefined stereo
[18:26:48] WARNING: Omitted undefined stereo
True

Ivan

On Wed, Nov 3, 2021 at 3:59 PM Ling Chan <lingtrek...@gmail.com> wrote:

> Dear colleagues,
>
> I have a molecule "FC12C3CCCC1C32F". I stripped it of the F's, and tried
> to do a GetSubstructMatch. It worked. But if I reconstruct the stripped
> molecule from a smiles string, it does not. Please see attached.
>
> I suppose some info is lost when you reconstruct the stripped core from a
> smiles string. But still, I would think it should match anyway.
>
> Another issue is that the 2D depiction has the leftmost carbons lying
> exactly on top of each other, creating a false impression. A better
> depiction would be like the second attached image. (Not sure if this is
> easy to fix though.)
>
> Thank you for your attention.
>
> Ling
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
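
P.S. Until this is sorted out, the SMARTS observation above can be turned into a stopgap. What follows is only a minimal sketch: the helper name match_with_smarts_fallback is hypothetical (it is not part of RDKit), and it assumes that re-parsing the query SMILES as SMARTS is acceptable for your use case. SMARTS semantics differ from SMILES (for example, no implicit hydrogen counts are imposed), so the fallback may match more loosely than a SMILES-derived query would. I have not checked how this behaves across RDKit versions.

```python
from rdkit import Chem


def match_with_smarts_fallback(mol, query_smiles):
    """Hypothetical helper, not an RDKit API.

    Try the query parsed as SMILES first; if that gives no match,
    re-parse the same string as SMARTS and try again.
    Returns a tuple of atom indices, or () if nothing matches.
    """
    query = Chem.MolFromSmiles(query_smiles)
    if query is not None:
        match = mol.GetSubstructMatch(query)
        if match:
            return tuple(match)

    # Fallback: SMARTS parsing. Assumption: the looser SMARTS
    # semantics are acceptable for this query.
    pattern = Chem.MolFromSmarts(query_smiles)
    if pattern is None:
        return ()
    return tuple(mol.GetSubstructMatch(pattern))


m1 = Chem.MolFromSmiles('FC12C3CCCC1C32F')
for smi in ('C12C3CCCC1C32', 'C1CC2C3C(C1)C23'):
    # Print the query and the atom indices it matched in m1.
    print(smi, match_with_smarts_fallback(m1, smi))
```

The atom ordering in the returned tuple depends on which parse produced the match, so compare matches as sets of indices if you only care about which atoms were hit.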