Hi,

Stefan found a bug in Orchem, which turns out to be related to 
aromaticity detection in the CDK.

Here's the situation:

         1: C=1N=CNC=1
         2: OCN1C=CN=C1
         3: OC(=O)N1C=CN=C1

The Orchem bug is that 1 and 2 are not considered substructures of 3. 
Reason: the CDK's aromaticity detector considers the rings in 1 and 2 
to be aromatic, but not the ring in 3. Yet the three rings are very 
similar, and Marvin thinks they're all aromatic.

I think the reason that 3 is not considered aromatic has something to do 
with the atom typing of the nitrogen that gets the OC and OC(=O) attached.
For the first 2, the atom type is 'N.planar3' which results in a 
hybridization 'PLANAR3', which in turn results in the correct electron 
count for aromaticity. But for the third, the atom type is instead set 
to 'N.amide', hybridization becomes 'SP2', and then electrons don't add 
up for aromaticity.
Well, that's what I see when debugging the code.
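
To make the electron counting concrete, here is a minimal, self-contained sketch. It is not CDK code: the class HueckelSketch, its enum and its methods are made up for illustration, and the per-atom contributions (2 pi electrons for a PLANAR3 atom, 1 for an SP2 atom) are an assumption about how such a detector might count, chosen to match what I describe above.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: NOT the CDK's implementation. All names are hypothetical.
class HueckelSketch {

    // Simplified stand-in for the hybridization that atom typing assigns.
    enum Hybridization { SP2, PLANAR3 }

    // Assumed contribution per ring atom: a PLANAR3 atom donates its lone
    // pair (2 electrons), an SP2 atom donates 1 electron.
    static int piElectrons(Hybridization h) {
        return h == Hybridization.PLANAR3 ? 2 : 1;
    }

    // Sum the assumed contributions over all atoms of one ring.
    static int countPiElectrons(List<Hybridization> ring) {
        int total = 0;
        for (Hybridization h : ring) {
            total += piElectrons(h);
        }
        return total;
    }

    // Hueckel's rule: the ring needs 4n + 2 pi electrons for some n >= 0.
    static boolean satisfiesHueckel(int electrons) {
        return electrons >= 2 && (electrons - 2) % 4 == 0;
    }

    public static void main(String[] args) {
        // Structures 1 and 2: three sp2 carbons, one pyridine-like N (SP2),
        // and the substituted N typed as N.planar3 (PLANAR3).
        List<Hybridization> planarN = Arrays.asList(
                Hybridization.SP2, Hybridization.SP2, Hybridization.SP2,
                Hybridization.SP2, Hybridization.PLANAR3);

        // Structure 3: same ring, but the substituted N is typed as N.amide (SP2).
        List<Hybridization> amideN = Arrays.asList(
                Hybridization.SP2, Hybridization.SP2, Hybridization.SP2,
                Hybridization.SP2, Hybridization.SP2);

        int a = countPiElectrons(planarN);
        int b = countPiElectrons(amideN);
        System.out.println(a + " " + satisfiesHueckel(a)); // prints "6 true"
        System.out.println(b + " " + satisfiesHueckel(b)); // prints "5 false"
    }
}
```

Under these assumptions, the only difference between the two cases is the hybridization given to the one nitrogen, which is enough to take the ring from 6 pi electrons to 5.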

Marcus, a ChEBI curator, said he really wouldn't regard the N in case 
3 as being part of an amide group.  The main features of the structure 
are the aromatic imidazole ring and the carboxy group. Granted, the 
carboxy group is electron-withdrawing and may tend to disrupt the 
delocalisation of the ring electrons but nevertheless he would still 
regard the system as aromatic.

I filed a bug report: 
https://sourceforge.net/tracker/?func=detail&aid=3001616&group_id=20024&atid=120024
It looks like quite a complex problem to me. Would it perhaps make sense to 
add methods such as isBenzene, isImidazole, etc. to CDKAtomTypeMatcher?


Mark

_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user