Hi Paolo,

Great, thanks for this; that's very helpful.

I am actually using this query in the PostgreSQL cartridge and am seeing
similar behaviour with version 2021.09.2. I think this might be fixed in
2021.09.3 by #4787 <https://github.com/rdkit/rdkit/issues/4787>. However,
I think the conda package (https://anaconda.org/rdkit/rdkit-postgresql)
is still on version 2021.09.2. Would it be possible to update it?

I also wanted to ask about the qmol type in the cartridge, which doesn't
undergo sanitization, versus the corresponding behaviour in Python.
Please correct me if I'm wrong, but is there no equivalent of a qmol in
Python?
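
To make the question concrete, here is a minimal sketch of what I assume is
the closest Python analogue: a molecule built with Chem.MolFromSmarts, which
is parsed as a query and is not sanitized. Whether this really corresponds to
a cartridge qmol is my assumption, not something confirmed in this thread.
The SMILES and SMARTS strings are my own shorthand for the target and for the
aromatic and single C-O queries; they are not the mol blocks from my original
message.

```python
from rdkit import Chem

# Target from this thread, written as SMILES for brevity (my transcription
# of the mol block "mb"; sanitized on parsing, as in the original script).
m = Chem.MolFromSmiles("O=C1Nc2ccccc2O1")

# SMARTS patterns are parsed as query molecules and are not sanitized,
# so the explicit bond queries below are kept exactly as written.
q_aromatic = Chem.MolFromSmarts("c1ccccc1:[#8]")  # aromatic C-O query bond
q_single = Chem.MolFromSmarts("c1ccccc1-[#8]")    # single C-O query bond

aromatic_hit = m.HasSubstructMatch(q_aromatic)
single_hit = m.HasSubstructMatch(q_single)
print(aromatic_hit, single_hit)
```

If this is a fair comparison, the aromatic query should hit and the single
one should not, because the ring C-O bond in the sanitized target is
perceived as aromatic.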

Many thanks!

Susan

On Tue, Jul 26, 2022 at 9:04 PM Paolo Tosco <paolo.tosco.m...@gmail.com>
wrote:

> Hi Susan,
>
> I see why that happens, and I'll let Greg comment if this is a bug or the
> intended behavior.
> In the meantime, I can propose a workaround.
>
> The reason why it happens is that aromatization, which is part of the
> sanitization operations, converts your aromatic query bond into a single
> bond, probably on the assumption that it was labelled as aromatic by
> mistake (it is indeed an exocyclic bond and it is not part of a ring in the
> q_aromatic molecule). You can clearly see that if you carry out
> sanitization as a separate step:
>
> q_aromatic = Chem.MolFromMolBlock(qb_aromatic, sanitize=False)
> q_aromatic
> [image: 16cd080c-d76f-457e-a210-1b5f9a347b77.png]
> for b in q_aromatic.GetBonds():
>     print(b.GetIdx(), b.GetBondType(), b.DescribeQuery())
> 0 DOUBLE
> 1 SINGLE
> 2 DOUBLE
> 3 SINGLE
> 4 SINGLE
> 5 DOUBLE
> 6 AROMATIC
>
> Bond 6 is AROMATIC, but bears no query.
> Let's store an array of the currently aromatic bonds:
>
> are_aromatic = [b.GetIdx() for b in q_aromatic.GetBonds() if b.GetIsAromatic()]
> are_aromatic
> [6]
>
> After sanitization, the aromatic bond turns into a single bond:
>
> Chem.SanitizeMol(q_aromatic)
> rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
> for b in q_aromatic.GetBonds():
>     print(b.GetIdx(), b.GetBondType(), b.DescribeQuery())
> 0 AROMATIC
> 1 AROMATIC
> 2 AROMATIC
> 3 AROMATIC
> 4 AROMATIC
> 5 AROMATIC
> 6 SINGLE
>
> We know that it was aromatic before sanitization, so let's make it an
> aromatic query bond:
>
> aromatic_query_bond = Chem.MolFromSmarts("*:*").GetBondWithIdx(0)
> aromatic_query_bond.GetBondType(), aromatic_query_bond.DescribeQuery()
> (rdkit.Chem.rdchem.BondType.AROMATIC, 'BondOrder 12 = val\n')
>
> q_aromatic = Chem.RWMol(q_aromatic)
> for b_idx in are_aromatic:
>     q_aromatic.ReplaceBond(b_idx, aromatic_query_bond)
>
> Now q_aromatic matches as expected:
>
> print(m.HasSubstructMatch(q_aromatic))
> True
>
> Cheers,
> p.
>
>
> On Tue, Jul 26, 2022 at 6:17 PM Susan Leung <susanhle...@gmail.com> wrote:
>
>> Hi all,
>>
>>
>>
>> Sorry it's me with another substructure query question...
>>
>>
>>
>> Please can anyone explain the following behaviour to me? I have 4 queries
>> that differ by just one query bond. I would expect the query with an
>> aromatic bond type (4) to match, but it doesn't. The single_or_aromatic
>> and double_or_aromatic query bond types do match, while single_or_double
>> does not.
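
For reference, the four query blocks differ only in the bond-type field of
bond 7, the exocyclic C-O bond. The sketch below decodes that field using the
bond-type codes of the CTfile format. The dictionary and helper names are
hypothetical, chosen just for this illustration, and the bond lines are
copied from the mol blocks in this message.

```python
# Bond-type codes as defined by the CTfile (molfile) format.
CTFILE_BOND_TYPES = {
    1: "single",
    2: "double",
    3: "triple",
    4: "aromatic",
    5: "single or double",
    6: "single or aromatic",
    7: "double or aromatic",
    8: "any",
}

# Bond 7 (the exocyclic C-O bond) from each of the four query blocks.
BOND_7_LINES = {
    "qb_double_or_aromatic": "M  V30 7 7 4 7",
    "qb_single_or_aromatic": "M  V30 7 6 4 7",
    "qb_aromatic": "M  V30 7 4 4 7",
    "qb_single_or_double": "M  V30 7 5 4 7",
}


def bond_type_name(v3000_bond_line):
    """Return the bond-type name for a V3000 bond line."""
    # Fields are: 'M', 'V30', bond index, bond type, atom 1, atom 2.
    fields = v3000_bond_line.split()
    return CTFILE_BOND_TYPES[int(fields[3])]


bond_types = {name: bond_type_name(line) for name, line in BOND_7_LINES.items()}
for name, bond_type in bond_types.items():
    print(name, "->", bond_type)
```

Only codes 5, 6 and 7 are query bond types; code 4 is a plain aromatic bond,
which is the one that fails to match here.
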
>>
>>
>> Best wishes,
>>
>>
>> Susan
>>
>>
>> import rdkit
>> print(rdkit.__version__)
>> from rdkit import Chem
>>
>> qb_double_or_aromatic = """
>>   ACCLDraw07262216372D
>>
>>   0  0  0     0  0            999 V3000
>> M  V30 BEGIN CTAB
>> M  V30 COUNTS 7 7 0 0 0
>> M  V30 BEGIN ATOM
>> M  V30 1 C 45.7538 -37.6779 0 0
>> M  V30 2 C 44.7323 -37.0872 0 0
>> M  V30 3 C 44.7323 -35.9099 0 0
>> M  V30 4 C 45.7553 -35.3193 0 0
>> M  V30 5 C 46.7807 -35.9049 0 0
>> M  V30 6 C 46.7807 -37.085 0 0
>> M  V30 7 O 45.7553 -34.1382 0 0
>> M  V30 END ATOM
>> M  V30 BEGIN BOND
>> M  V30 1 2 2 1
>> M  V30 2 1 3 2
>> M  V30 3 2 4 3
>> M  V30 4 1 5 4
>> M  V30 5 1 1 6
>> M  V30 6 2 6 5
>> M  V30 7 7 4 7
>> M  V30 END BOND
>> M  V30 END CTAB
>> M  END
>> """
>> qb_single_or_aromatic = """
>>   ACCLDraw07262216372D
>>
>>   0  0  0     0  0            999 V3000
>> M  V30 BEGIN CTAB
>> M  V30 COUNTS 7 7 0 0 0
>> M  V30 BEGIN ATOM
>> M  V30 1 C 45.7538 -37.6779 0 0
>> M  V30 2 C 44.7323 -37.0872 0 0
>> M  V30 3 C 44.7323 -35.9099 0 0
>> M  V30 4 C 45.7553 -35.3193 0 0
>> M  V30 5 C 46.7807 -35.9049 0 0
>> M  V30 6 C 46.7807 -37.085 0 0
>> M  V30 7 O 45.7553 -34.1382 0 0
>> M  V30 END ATOM
>> M  V30 BEGIN BOND
>> M  V30 1 2 2 1
>> M  V30 2 1 3 2
>> M  V30 3 2 4 3
>> M  V30 4 1 5 4
>> M  V30 5 1 1 6
>> M  V30 6 2 6 5
>> M  V30 7 6 4 7
>> M  V30 END BOND
>> M  V30 END CTAB
>> M  END
>> """
>> qb_aromatic = """
>>   ACCLDraw07262216372D
>>
>>   0  0  0     0  0            999 V3000
>> M  V30 BEGIN CTAB
>> M  V30 COUNTS 7 7 0 0 0
>> M  V30 BEGIN ATOM
>> M  V30 1 C 45.7538 -37.6779 0 0
>> M  V30 2 C 44.7323 -37.0872 0 0
>> M  V30 3 C 44.7323 -35.9099 0 0
>> M  V30 4 C 45.7553 -35.3193 0 0
>> M  V30 5 C 46.7807 -35.9049 0 0
>> M  V30 6 C 46.7807 -37.085 0 0
>> M  V30 7 O 45.7553 -34.1382 0 0
>> M  V30 END ATOM
>> M  V30 BEGIN BOND
>> M  V30 1 2 2 1
>> M  V30 2 1 3 2
>> M  V30 3 2 4 3
>> M  V30 4 1 5 4
>> M  V30 5 1 1 6
>> M  V30 6 2 6 5
>> M  V30 7 4 4 7
>> M  V30 END BOND
>> M  V30 END CTAB
>> M  END
>> """
>> qb_single_or_double = """
>>   ACCLDraw07262216372D
>>
>>   0  0  0     0  0            999 V3000
>> M  V30 BEGIN CTAB
>> M  V30 COUNTS 7 7 0 0 0
>> M  V30 BEGIN ATOM
>> M  V30 1 C 45.7538 -37.6779 0 0
>> M  V30 2 C 44.7323 -37.0872 0 0
>> M  V30 3 C 44.7323 -35.9099 0 0
>> M  V30 4 C 45.7553 -35.3193 0 0
>> M  V30 5 C 46.7807 -35.9049 0 0
>> M  V30 6 C 46.7807 -37.085 0 0
>> M  V30 7 O 45.7553 -34.1382 0 0
>> M  V30 END ATOM
>> M  V30 BEGIN BOND
>> M  V30 1 2 2 1
>> M  V30 2 1 3 2
>> M  V30 3 2 4 3
>> M  V30 4 1 5 4
>> M  V30 5 1 1 6
>> M  V30 6 2 6 5
>> M  V30 7 5 4 7
>> M  V30 END BOND
>> M  V30 END CTAB
>> M  END
>> """
>> mb = """
>>   ACCLDraw07262216212D
>>
>>   0  0  0     0  0            999 V3000
>> M  V30 BEGIN CTAB
>> M  V30 COUNTS 10 11 0 0 0
>> M  V30 BEGIN ATOM
>> M  V30 1 O 4.9598 -34.3327 0 0
>> M  V30 2 O 3.4666 -32.8272 0 0
>> M  V30 3 C 6.9426 -35.2057 0 0
>> M  V30 4 C 4.5926 -33.1985 0 0
>> M  V30 5 C 6.143 -34.3327 0 0
>> M  V30 6 C 8.0972 -34.9529 0 0
>> M  V30 7 C 6.5101 -33.1985 0 0
>> M  V30 8 C 8.4562 -33.8227 0 0
>> M  V30 9 N 5.5514 -32.509 0 0 CFG=3
>> M  V30 10 C 7.6606 -32.9537 0 0
>> M  V30 END ATOM
>> M  V30 BEGIN BOND
>> M  V30 1 1 1 4
>> M  V30 2 2 4 2
>> M  V30 3 1 5 3
>> M  V30 4 1 5 1
>> M  V30 5 2 3 6
>> M  V30 6 2 5 7
>> M  V30 7 1 6 8
>> M  V30 8 1 7 9
>> M  V30 9 1 9 4
>> M  V30 10 1 7 10
>> M  V30 11 2 10 8
>> M  V30 END BOND
>> M  V30 END CTAB
>> M  END
>> """
>> m = Chem.MolFromMolBlock(mb)
>>
>> q_double_or_aromatic = Chem.MolFromMolBlock(qb_double_or_aromatic)
>> print(m.HasSubstructMatch(q_double_or_aromatic))
>>
>> q_single_or_aromatic = Chem.MolFromMolBlock(qb_single_or_aromatic)
>> print(m.HasSubstructMatch(q_single_or_aromatic))
>>
>> q_aromatic = Chem.MolFromMolBlock(qb_aromatic)
>> print(m.HasSubstructMatch(q_aromatic))
>>
>> q_single_or_double = Chem.MolFromMolBlock(qb_single_or_double)
>> print(m.HasSubstructMatch(q_single_or_double))
>>
>>
>> Output:
>>
>> 2022.03.2
>> True
>> True
>> False
>> False
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
