Hi Susan,

I see why that happens, and I'll let Greg comment on whether this is a bug or the intended behavior. In the meantime, I can propose a workaround.
The reason it happens is that aromatization, which is part of the sanitization operations, converts the aromatic bond in your query into a single bond, probably on the assumption that it was labelled as aromatic by mistake (it is indeed an exocyclic bond, and it is not part of a ring in the q_aromatic molecule).

You can clearly see that if you carry out sanitization as a separate step:

q_aromatic = Chem.MolFromMolBlock(qb_aromatic, sanitize=False)
for b in q_aromatic.GetBonds():
    print(b.GetIdx(), b.GetBondType(), b.DescribeQuery())

0 DOUBLE
1 SINGLE
2 DOUBLE
3 SINGLE
4 SINGLE
5 DOUBLE
6 AROMATIC

Bond 6 is AROMATIC, but bears no query. Let's store a list of the currently aromatic bonds:

are_aromatic = [b.GetIdx() for b in q_aromatic.GetBonds() if b.GetIsAromatic()]
are_aromatic

[6]

After sanitization, the aromatic bond turns into a single bond:

Chem.SanitizeMol(q_aromatic)

rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

for b in q_aromatic.GetBonds():
    print(b.GetIdx(), b.GetBondType(), b.DescribeQuery())

0 AROMATIC
1 AROMATIC
2 AROMATIC
3 AROMATIC
4 AROMATIC
5 AROMATIC
6 SINGLE

We know that it was aromatic before sanitization, so let's make it an aromatic query bond:

aromatic_query_bond = Chem.MolFromSmarts("*:*").GetBondWithIdx(0)
aromatic_query_bond.GetBondType(), aromatic_query_bond.DescribeQuery()

(rdkit.Chem.rdchem.BondType.AROMATIC, 'BondOrder 12 = val\n')

q_aromatic = Chem.RWMol(q_aromatic)
for b_idx in are_aromatic:
    q_aromatic.ReplaceBond(b_idx, aromatic_query_bond)

Now q_aromatic matches as expected:

print(m.HasSubstructMatch(q_aromatic))

True

Cheers,
p.

On Tue, Jul 26, 2022 at 6:17 PM Susan Leung <susanhle...@gmail.com> wrote:

> Hi all,
>
> Sorry it's me with another substructure query question...
>
> Please can anyone explain the following behaviour to me? I have 4 queries
> that differ by just one query bond. To me, it should match an aromatic bond
> type (4) but it doesn't.
> However, it matches single_or_aromatic and
> double_or_aromatic query bond types but not single_or_double...
>
> Best wishes,
>
> Susan
>
> import rdkit
> print(rdkit.__version__)
> from rdkit import Chem
>
> qb_double_or_aromatic = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 7 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> qb_single_or_aromatic = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 6 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> qb_aromatic = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 4 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> qb_single_or_double = """
>   ACCLDraw07262216372D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 7 7 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 C 45.7538 -37.6779 0 0
> M  V30 2 C 44.7323 -37.0872 0 0
> M  V30 3 C 44.7323 -35.9099 0 0
> M  V30 4 C 45.7553 -35.3193 0 0
> M  V30 5 C 46.7807 -35.9049 0 0
> M  V30 6 C 46.7807 -37.085 0 0
> M  V30 7 O 45.7553 -34.1382 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 2 2 1
> M  V30 2 1 3 2
> M  V30 3 2 4 3
> M  V30 4 1 5 4
> M  V30 5 1 1 6
> M  V30 6 2 6 5
> M  V30 7 5 4 7
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> mb = """
>   ACCLDraw07262216212D
>
>   0  0  0     0  0            999 V3000
> M  V30 BEGIN CTAB
> M  V30 COUNTS 10 11 0 0 0
> M  V30 BEGIN ATOM
> M  V30 1 O 4.9598 -34.3327 0 0
> M  V30 2 O 3.4666 -32.8272 0 0
> M  V30 3 C 6.9426 -35.2057 0 0
> M  V30 4 C 4.5926 -33.1985 0 0
> M  V30 5 C 6.143 -34.3327 0 0
> M  V30 6 C 8.0972 -34.9529 0 0
> M  V30 7 C 6.5101 -33.1985 0 0
> M  V30 8 C 8.4562 -33.8227 0 0
> M  V30 9 N 5.5514 -32.509 0 0 CFG=3
> M  V30 10 C 7.6606 -32.9537 0 0
> M  V30 END ATOM
> M  V30 BEGIN BOND
> M  V30 1 1 1 4
> M  V30 2 2 4 2
> M  V30 3 1 5 3
> M  V30 4 1 5 1
> M  V30 5 2 3 6
> M  V30 6 2 5 7
> M  V30 7 1 6 8
> M  V30 8 1 7 9
> M  V30 9 1 9 4
> M  V30 10 1 7 10
> M  V30 11 2 10 8
> M  V30 END BOND
> M  V30 END CTAB
> M  END
> """
> m = Chem.MolFromMolBlock(mb)
>
> q_double_or_aromatic = Chem.MolFromMolBlock(qb_double_or_aromatic)
> print(m.HasSubstructMatch(q_double_or_aromatic))
>
> q_single_or_aromatic = Chem.MolFromMolBlock(qb_single_or_aromatic)
> print(m.HasSubstructMatch(q_single_or_aromatic))
>
> q_aromatic = Chem.MolFromMolBlock(qb_aromatic)
> print(m.HasSubstructMatch(q_aromatic))
>
> q_single_or_double = Chem.MolFromMolBlock(qb_single_or_double)
> print(m.HasSubstructMatch(q_single_or_double))
>
> >>> 2022.03.2
> >>> True
> >>> True
> >>> False
> >>> False
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss