Hi Susan,

I see why that happens, and I'll let Greg comment on whether this is a bug
or the intended behavior.
In the meantime, I can propose a workaround.

The reason is that aromatization, which is part of the sanitization
operations, converts the aromatic bond in your query into a single bond,
probably on the assumption that it was labelled as aromatic by mistake (it
is indeed an exocyclic bond and is not part of a ring in the q_aromatic
molecule). You can see this clearly if you carry out sanitization as a
separate step:

q_aromatic = Chem.MolFromMolBlock(qb_aromatic, sanitize=False)
q_aromatic
for b in q_aromatic.GetBonds():
    print(b.GetIdx(), b.GetBondType(), b.DescribeQuery())
0 DOUBLE
1 SINGLE
2 DOUBLE
3 SINGLE
4 SINGLE
5 DOUBLE
6 AROMATIC

Bond 6 is AROMATIC, but it carries no query (DescribeQuery() prints nothing).
Let's store the indices of the currently aromatic bonds in a list:

are_aromatic = [b.GetIdx() for b in q_aromatic.GetBonds() if b.GetIsAromatic()]
are_aromatic
[6]

After sanitization, the aromatic bond turns into a single bond:

Chem.SanitizeMol(q_aromatic)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
for b in q_aromatic.GetBonds():
    print(b.GetIdx(), b.GetBondType(), b.DescribeQuery())
0 AROMATIC
1 AROMATIC
2 AROMATIC
3 AROMATIC
4 AROMATIC
5 AROMATIC
6 SINGLE

We know that it was aromatic before sanitization, so let's make it an
aromatic query bond:

aromatic_query_bond = Chem.MolFromSmarts("*:*").GetBondWithIdx(0)
aromatic_query_bond.GetBondType(), aromatic_query_bond.DescribeQuery()
(rdkit.Chem.rdchem.BondType.AROMATIC, 'BondOrder 12 = val\n')

q_aromatic = Chem.RWMol(q_aromatic)
for b_idx in are_aromatic:
    q_aromatic.ReplaceBond(b_idx, aromatic_query_bond)
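
The difference between a plain aromatic bond and an aromatic query bond can be checked with Bond.HasQuery(). The following is only a minimal sketch on stand-in molecules (benzene from SMILES and the "*:*" pattern from SMARTS), not on the mol blocks from this thread:

```python
from rdkit import Chem

# Keep references to the parent molecules while their bonds are in use.
benzene = Chem.MolFromSmiles("c1ccccc1")   # stand-in molecule (assumption)
pattern = Chem.MolFromSmarts("*:*")        # same pattern as used above

# A bond read from SMILES is a plain bond: it has a type but no query.
plain_bond = benzene.GetBondWithIdx(0)

# A bond read from SMARTS is a query bond: it has a type and a query.
query_bond = pattern.GetBondWithIdx(0)

for label, bond in (("plain", plain_bond), ("query", query_bond)):
    print(label, bond.GetBondType(), bond.HasQuery(), repr(bond.DescribeQuery()))
```

Both bonds report the AROMATIC bond type; only the one coming from SMARTS reports a query.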

Now q_aromatic matches as expected:

print(m.HasSubstructMatch(q_aromatic))
True
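
If this is needed more than once, the replacement step could be wrapped in a small function. The sketch below is only an illustration: as_aromatic_query_bonds is a hypothetical helper name (not part of RDKit), and to keep it self-contained the molecules are built from SMILES rather than from the mol blocks in this thread. The target SMILES is my reading of the mb connection table, and phenol stands in for q_aromatic after sanitization (aromatic ring, single exocyclic C-O bond).

```python
from rdkit import Chem


def as_aromatic_query_bonds(mol, bond_indices):
    """Return an editable copy of mol in which the listed bonds are
    replaced by aromatic query bonds.

    Hypothetical helper for illustration; not an RDKit function.
    """
    template_mol = Chem.MolFromSmarts("*:*")
    template = template_mol.GetBondWithIdx(0)
    rwmol = Chem.RWMol(mol)
    for idx in bond_indices:
        # ReplaceBond keeps the begin/end atoms of the original bond.
        rwmol.ReplaceBond(idx, template)
    return rwmol


# Assumption: SMILES equivalent of the mb mol block (benzoxazolone).
target = Chem.MolFromSmiles("O=C1Nc2ccccc2O1")

# Assumption: phenol as a stand-in for the sanitized q_aromatic.
query = Chem.MolFromSmiles("Oc1ccccc1")
exocyclic_idx = query.GetBondBetweenAtoms(0, 1).GetIdx()

before = target.HasSubstructMatch(query)
fixed_query = as_aromatic_query_bonds(query, [exocyclic_idx])
after = target.HasSubstructMatch(fixed_query)
print(before, after)
```

With the mol blocks from the thread, the list of bond indices would be the are_aromatic list collected before sanitization.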

Cheers,
p.


On Tue, Jul 26, 2022 at 6:17 PM Susan Leung <susanhle...@gmail.com> wrote:

> Hi all,
>
>
>
> Sorry it's me with another substructure query question...
>
>
>
> Please can anyone explain the following behaviour to me? I have 4 queries
> that differ by just one query bond. To me, it should match an aromatic bond
> type (4) but it doesn't. However, it matches single_or_aromatic and
> double_or_aromatic query bond types but not single_or_double...
>
>
> Best wishes,
>
>
> Susan
>
>
> import rdkit
> print(rdkit.__version__)
> from rdkit import Chem
>
> qb_double_or_aromatic = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 7 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> qb_single_or_aromatic = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 6 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> qb_aromatic = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 4 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> qb_single_or_double = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 5 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> mb = """
>   ACCLDraw07262216212D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 10 11 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 O 4.9598 -34.3327 0 0
> M  V30 2 O 3.4666 -32.8272 0 0
> M  V30 3 C 6.9426 -35.2057 0 0
> M  V30 4 C 4.5926 -33.1985 0 0
> M  V30 5 C 6.143 -34.3327 0 0
> M  V30 6 C 8.0972 -34.9529 0 0
> M  V30 7 C 6.5101 -33.1985 0 0
> M  V30 8 C 8.4562 -33.8227 0 0
> M  V30 9 N 5.5514 -32.509 0 0 CFG=3
> M  V30 10 C 7.6606 -32.9537 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 1 1 4
> M  V30 2 2 4 2
> M  V30 3 1 5 3
> M  V30 4 1 5 1
> M  V30 5 2 3 6
> M  V30 6 2 5 7
> M  V30 7 1 6 8
> M  V30 8 1 7 9
> M  V30 9 1 9 4
> M  V30 10 1 7 10
> M  V30 11 2 10 8
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> m = Chem.MolFromMolBlock(mb)
>
> q_double_or_aromatic = Chem.MolFromMolBlock(qb_double_or_aromatic)
> print(m.HasSubstructMatch(q_double_or_aromatic))
>
> q_single_or_aromatic = Chem.MolFromMolBlock(qb_single_or_aromatic)
> print(m.HasSubstructMatch(q_single_or_aromatic))
>
> q_aromatic = Chem.MolFromMolBlock(qb_aromatic)
> print(m.HasSubstructMatch(q_aromatic))
>
> q_single_or_double = Chem.MolFromMolBlock(qb_single_or_double)
> print(m.HasSubstructMatch(q_single_or_double))
>
>
> >>> 2022.03.2
> >>> True
> >>> True
> >>> False
> >>> False
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>