Hi,
Stefan found a bug in Orchem, which turns out to be related to aromaticity detection in the CDK.

Here's the situation:

1: C=1N=CNC=1
2: OCN1C=CN=C1
3: OC(=O)N1C=CN=C1

The Orchem bug is that 1 and 2 are not considered substructures of 3. Reason: the rings in 1 and 2 are considered aromatic by the CDK's aromaticity detector, but the ring in 3 isn't. Yet the rings are very similar, and Marvin thinks they're all aromatic.

I think the reason that 3 is not considered aromatic has something to do with the atom typing of the nitrogen that carries the OC and OC(=O) substituents. For 1 and 2, the atom type is 'N.planar3', which results in a hybridization of 'PLANAR3', which in turn results in the correct electron count for aromaticity. But for 3, the atom type is instead set to 'N.amide', the hybridization becomes 'SP2', and then the electrons don't add up for aromaticity. Well, that's what I see when debugging the code.

Marcus, a ChEBI curator, said he really wouldn't regard the N in case 3 as being part of an amide group. The main features of the structure are the aromatic imidazole ring and the carboxy group. Granted, the carboxy group is electron-withdrawing and may tend to disrupt the delocalisation of the ring electrons, but nevertheless he would still regard the system as aromatic.

I filed a bug report:
https://sourceforge.net/tracker/?func=detail&aid=3001616&group_id=20024&atid=120024

It looks like quite a complex problem to me. Would it perhaps make sense to add methods such as isBenzene, isImidazole etc. to CDKAtomTypeMatcher?

Mark
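
To make the electron-count argument concrete, here is a minimal Python sketch of a Hückel (4n + 2) check on the five-membered ring. It is a simplified illustration, not the CDK's actual detector. The names `CONTRIBUTION`, `ring_pi_electrons` and `is_hueckel` are hypothetical, and the per-atom contributions (1 for 'SP2', 2 for 'PLANAR3') are assumptions chosen to mirror the behaviour described above.

```python
# Simplified Hueckel check for a single ring.
# NOT the CDK implementation: the contribution table is an assumption.

# Assumed pi-electron contribution of a ring atom, keyed by hybridization label.
CONTRIBUTION = {
    "SP2": 1,      # one electron from the atom's double bond in the ring
    "PLANAR3": 2,  # lone pair donated into the ring
}


def ring_pi_electrons(hybridizations):
    """Sum the assumed pi-electron contributions of the ring atoms."""
    return sum(CONTRIBUTION[h] for h in hybridizations)


def is_hueckel(pi_electrons):
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Ring atoms of N1C=CN=C1, starting at the substituted nitrogen.
# Structure 2: substituted N typed 'N.planar3' -> PLANAR3.
ring_planar3 = ["PLANAR3", "SP2", "SP2", "SP2", "SP2"]
# Structure 3: substituted N typed 'N.amide' -> SP2 (as seen while debugging).
ring_amide = ["SP2", "SP2", "SP2", "SP2", "SP2"]

for label, ring in (("N.planar3", ring_planar3), ("N.amide", ring_amide)):
    count = ring_pi_electrons(ring)
    print(label, count, is_hueckel(count))
```

Under these assumptions the 'N.planar3' ring reaches six pi electrons and passes the check, while the 'N.amide' ring does not, which is consistent with 3 being rejected as aromatic.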