That does seem like a bug. You can also see it without involving
DeleteSubstructs, by starting from different SMILES representations of the
same molecule:

>>> from rdkit import Chem
>>> m1 = Chem.MolFromSmiles('FC12C3CCCC1C32F')
>>> m2 = Chem.MolFromSmiles('C12C3CCCC1C32')
>>> m3 = Chem.MolFromSmiles('C1CC2C3C(C1)C23')
>>> Chem.MolToSmiles(m2) == Chem.MolToSmiles(m3)
True
>>> m1.GetSubstructMatch(m2)
(1, 2, 3, 4, 5, 6, 7)
>>> m1.GetSubstructMatch(m3)
()

Note that if you parse the problem SMILES as a SMARTS, you do get a match:

>>> m4 = Chem.MolFromSmarts('C1CC2C3C(C1)C23')
>>> m1.GetSubstructMatch(m4)
(4, 3, 2, 1, 6, 5, 7)
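
If anyone wants to re-run these checks on their own RDKit build, they can be collected into one small script. This is only a sketch: the helper name match_report is hypothetical (it is not part of RDKit), and the matches it prints will depend on the RDKit version, so no expected output is shown.

```python
from rdkit import Chem


def match_report(target_smiles, query_strings):
    """Hypothetical helper: match each query against the target twice,
    once parsed as a SMILES and once parsed as a SMARTS."""
    target = Chem.MolFromSmiles(target_smiles)
    report = {}
    for q in query_strings:
        as_mol = Chem.MolFromSmiles(q)    # query built from SMILES
        as_patt = Chem.MolFromSmarts(q)   # same string read as a SMARTS pattern
        report[q] = {
            "canonical": Chem.MolToSmiles(as_mol),
            "smiles_match": target.GetSubstructMatch(as_mol),
            "smarts_match": target.GetSubstructMatch(as_patt),
        }
    return report


if __name__ == "__main__":
    queries = ['C12C3CCCC1C32', 'C1CC2C3C(C1)C23']
    for q, row in match_report('FC12C3CCCC1C32F', queries).items():
        print(q, row)
```

An empty tuple under "smiles_match" next to a non-empty "smarts_match" for the same string would reproduce the discrepancy described above.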

Another interesting point is that while the InChIs of m2 and m3 are also
identical, the conversion emits a warning about undefined stereochemistry:

>>> Chem.MolToInchi(m2) == Chem.MolToInchi(m3)
[18:26:48] WARNING: Omitted undefined stereo
[18:26:48] WARNING: Omitted undefined stereo
True

Ivan

On Wed, Nov 3, 2021 at 3:59 PM Ling Chan <lingtrek...@gmail.com> wrote:

> Dear colleagues,
>
> I have a molecule "FC12C3CCCC1C32F". I stripped it of the F's, and tried
> to do a GetSubstructMatch. It worked. But if I reconstruct the stripped
> molecule from a smiles string, it does not. Please see attached.
>
> I suppose some info is lost when you reconstruct the stripped core from a
> smiles string. But still, I would think it should match anyway.
>
> Another issue is that the 2D depiction has the leftmost carbons lying
> exactly on top of each other, creating a false impression. A better
> depiction would be like the second attached image. (Not sure if this is
> easy to fix though.)
>
> Thank you for your attention.
>
> Ling
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>