Hi all, I am having some issues with the CDK library.
I have the molecule "glycitein" in the attached file (glycitein.sdf). I am running the SMARTSQueryTool to perform a structure search. The SMARTS patterns are the following:

P1: [O;X1]=[#6;R1]-1-[#6;R1](=[#6;R1]-[#8]-c2ccccc-12)-[c;R1]1[c;R1][c;R1][c;R1][c;R1][c;R1]1

P2: [O;X1]=[#6;R1]-1-[#6;R1](=[#6;R1]-[#8]-[#6]-2=[#6]-[#6]=[#6]-[#6]=[#6]-1-2)-[#6;R1]-1=[#6;R1]-[#6;R1]=[#6;R1]-[#6;R1]=[#6;R1]-1

For each of those, the query tool returns false, which is really surprising, since P1 is written in aromatic form and P2 in Kekulé form. I imagine it has to do with the aromaticity detection or a related issue. I have tried several approaches, and none of them gave a match.

1) I preprocessed the molecule using the code below (from a previous chat I had on a forum):

    SMSDNormalizer.percieveAtomTypesAndConfigureAtoms(molecule);
    CDKHydrogenAdder.getInstance(molecule.getBuilder())
        .addImplicitHydrogens(molecule);
    for (IBond bond : molecule.bonds()) {
        if (bond.getFlag(CDKConstants.SINGLE_OR_DOUBLE)) {
            bond.setFlag(CDKConstants.ISAROMATIC, true);
            bond.getAtom(0).setFlag(CDKConstants.ISAROMATIC, true);
            bond.getAtom(1).setFlag(CDKConstants.ISAROMATIC, true);
        }
    }
    SMSDNormalizer.aromatizeMolecule(molecule);

I attached the resulting structure in SDF format as returned by CDK (glycitein_processed.sdf), which in most editors is shown as in the attached picture. It seems that all the aromatic bonds (marked as 4 in the SDF) are perceived as single bonds. Therefore, the result of the structure search is still false.

By the way, trying a combination of AtomContainerManipulator (to perceive atom types) and Aromaticity <http://cdk.github.io/cdk/1.5/docs/api/org/openscience/cdk/aromaticity/Aromaticity.html> did not help either.

2) Instead of aromatizing, I removed the SMSDNormalizer lines and added the following:

    AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(molecule);
    Kekulization.kekulize(molecule);

The SDF of the resulting molecule is the same as in 1), and the structure search still returns false.
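To double-check which bond types actually end up in an SDF, independently of CDK or any editor, the bond block can be tallied directly. Below is a minimal sketch in plain Java, assuming a single V2000 record with fixed-width columns (atom and bond counts in columns 1-3 and 4-6 of the fourth line, bond type in columns 7-9 of each bond line, type 4 meaning aromatic). The class name BondTypeCounter and the three-atom record in main are hypothetical and only for illustration; this is not part of CDK.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: tallies bond-type codes in one V2000 molfile record. */
final class BondTypeCounter {

    /** Returns a map from bond-type code (1, 2, 3, 4 = aromatic) to its count. */
    static Map<Integer, Integer> countBondTypes(String molfile) {
        String[] lines = molfile.split("\r?\n", -1);
        if (lines.length < 4) {
            throw new IllegalArgumentException("missing header or counts line");
        }
        // Line 4 of a V2000 record is the counts line: aaabbb...
        int atomCount = parseField(lines[3], 0, 3);
        int bondCount = parseField(lines[3], 3, 6);
        // The bond block follows the 3 header lines, the counts line and the atom block.
        int firstBond = 4 + atomCount;
        if (lines.length < firstBond + bondCount) {
            throw new IllegalArgumentException("record is shorter than its counts line claims");
        }
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int i = 0; i < bondCount; i++) {
            // Bond line layout: 111222ttt... where ttt is the bond type.
            int type = parseField(lines[firstBond + i], 6, 9);
            counts.merge(type, 1, Integer::sum);
        }
        return counts;
    }

    /** Parses a fixed-width integer field, tolerating short lines. */
    private static int parseField(String line, int from, int to) {
        return Integer.parseInt(line.substring(from, Math.min(to, line.length())).trim());
    }

    public static void main(String[] args) {
        // Hypothetical three-atom record: one aromatic bond, one single bond.
        String mol = String.join("\n",
            "example",
            "  hypothetical",
            "",
            "  3  2  0  0  0  0  0  0  0  0999 V2000",
            "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
            "    1.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
            "    2.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
            "  1  2  4  0",
            "  2  3  1  0",
            "M  END");
        System.out.println(countBondTypes(mol));
    }
}
```

If the count for type 4 is non-zero in glycitein_processed.sdf, the file stores aromatic bonds rather than explicit single and double bonds, which would be consistent with editors drawing those bonds as single.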
How can I process these molecules efficiently? I am writing a function that will take SDF files and run the SMARTSQueryTool to match certain patterns. Therefore, I need an efficient way to preprocess these molecules. Can someone help me out here?

Thank you in advance.

Best,
glycitein_processed.sdf
Description: Binary data
glycitein.sdf
Description: Binary data
_______________________________________________ Cdk-user mailing list Cdk-user@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/cdk-user