Hi Rajarshi,

The original data is from the CIF file. This data is in the Java code too. I attach the CIF file here.

Best regards,
Adel.

Rajarshi Guha wrote:
Thanks. Interestingly, both molecules get canonicalized to the same SMILES, but the fingerprinter gives different numbers of paths for the two molecules. Can you send me the original SD files?

On Mon, Oct 4, 2010 at 4:04 PM, Adel Golovin <[email protected]> wrote:

Hi Rajarshi,

I attach here the Java file with the structures BF5_MDL2 and BF5_MDL3. This Java file contains other tests too, which you do not need. To compile it, remove everything except cdktestBF5.

Best regards,
Adel.

Rajarshi Guha wrote:

Could you send the structures as attachments? Gmail is mangling the lines.

On Mon, Oct 4, 2010 at 11:59 AM, Adel Golovin <[email protected]> wrote:

Hi Rajarshi,

In the test example both structures have only single or double bonds. No aromaticity is given, and this particular structure (BF5) does not have aromatic rings. The problem is that the isomorphism test gives a positive result while the fingerprints are different. The difference between the two data sets is that in the first one the bonds are sorted by bond order (double bonds first, then single), and in the second they are not sorted.

Best regards,
Adel.

Rajarshi Guha wrote:

Hmm, does this problem get resolved if you explicitly perceive aromaticity? The hashed FP considers aromatic bonds, so bond ordering shouldn't affect the FP (but it will if it only sees single/double bonds).

On Mon, Oct 4, 2010 at 10:46 AM, Adel Golovin <[email protected]> wrote:

Hi Egon,

Thanks for looking into it. The example with 1FH concerns the aromaticity test, and I'm going to investigate it further.

Consider different fingerprints for isomorphic structures. The structure is BF5: http://www.ebi.ac.uk/pdbe-site/pdbemotif/chem/detail.jsp?code=BF5

The code below has two different MDL files representing the same structure. Using the following code, I tested whether these structures are isomorphic and whether they generate the same fingerprints. The result is that they are isomorphic but generate different fingerprints.
The only difference between them is the order of the bonds. The code:

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;
import java.util.logging.Logger;
import org.junit.Test;
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.fingerprint.Fingerprinter;
import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;

public class FingerprintsTest {
    static final Logger log = Logger.getAnonymousLogger();

    // BF5 with the bond block sorted by bond order (double bonds first, then single).
    String BF5_MDL2 = "BF5\n" +
        " -ISIS- 3D\n" +
        "\n" +
        " 26 29 0 0 0 0 0 0 0 0 1 V2000\n" +
        " 1.3870 -0.9840 -0.0020 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 1.4580 3.3830 -1.1290 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 1.0220 0.3780 0.1280 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 0.3730 -2.0890 -0.0500 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -0.2940 0.7270 0.2360 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 1.6240 2.7830 0.2690 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 0.2990 2.8850 1.0310 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 5.4950 -1.7360 -0.3370 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 5.9860 0.2980 0.0110 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 3.0790 -2.6160 -0.2080 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -0.6550 2.0360 0.3820 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -1.0310 -1.5910 0.0620 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -1.3360 -0.2980 0.1960 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -3.2600 1.0700 -0.6250 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 5.1450 -0.5810 -0.1270 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -2.6810 0.0840 0.3000 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 3.3150 1.0580 0.0750 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 3.7210 -0.2520 -0.0410 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -5.4360 0.0600 -0.1210 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 2.7060 -1.3170 -0.0860 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 2.0030 1.3690 0.1540 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -3.5490 -0.4860 1.3420 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -4.8260 -1.0120 0.6780 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -6.7360 -0.3450 -0.6290 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -4.5480 0.4860 -1.2160 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -2.0390 -2.4900 0.0320 F 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 12 13 2 0 0 0 0\n" +
        " 5 3 2 0 0 0 0\n" +
        " 1 20 2 0 0 0 0\n" +
        " 17 18 2 0 0 0 0\n" +
        " 15 9 2 0 0 0 0\n" +
        " 12 4 1 0 0 0 0\n" +
        " 13 5 1 0 0 0 0\n" +
        " 4 1 1 0 0 0 0\n" +
        " 7 6 1 0 0 0 0\n" +
        " 6 2 1 0 0 0 0\n" +
        " 1 3 1 0 0 0 0\n" +
        " 18 20 1 0 0 0 0\n" +
        " 18 15 1 0 0 0 0\n" +
        " 22 23 1 0 0 0 0\n" +
        " 25 14 1 0 0 0 0\n" +
        " 12 26 1 0 0 0 0\n" +
        " 13 16 1 0 0 0 0\n" +
        " 22 16 1 0 0 0 0\n" +
        " 14 16 1 0 0 0 0\n" +
        " 6 21 1 0 0 0 0\n" +
        " 3 21 1 0 0 0 0\n" +
        " 17 21 1 0 0 0 0\n" +
        " 23 19 1 0 0 0 0\n" +
        " 25 19 1 0 0 0 0\n" +
        " 5 11 1 0 0 0 0\n" +
        " 7 11 1 0 0 0 0\n" +
        " 20 10 1 0 0 0 0\n" +
        " 15 8 1 0 0 0 0\n" +
        " 19 24 1 0 0 0 0\n" +
        "M CHG 1 8 -1\n" +
        "\n" +
        "M END\n";

    // The same BF5 atoms and bonds, with the bond block in no particular order.
    String BF5_MDL3 = "BF5\n" +
        " -ISIS- 3D\n" +
        "\n" +
        " 26 29 0 0 0 0 0 0 0 0 1 V2000\n" +
        " 1.3870 -0.9840 -0.0020 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 1.4580 3.3830 -1.1290 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 1.0220 0.3780 0.1280 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 0.3730 -2.0890 -0.0500 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -0.2940 0.7270 0.2360 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 1.6240 2.7830 0.2690 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 0.2990 2.8850 1.0310 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 5.4950 -1.7360 -0.3370 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 5.9860 0.2980 0.0110 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 3.0790 -2.6160 -0.2080 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -0.6550 2.0360 0.3820 O 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -1.0310 -1.5910 0.0620 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -1.3360 -0.2980 0.1960 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -3.2600 1.0700 -0.6250 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 5.1450 -0.5810 -0.1270 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -2.6810 0.0840 0.3000 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 3.3150 1.0580 0.0750 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 3.7210 -0.2520 -0.0410 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -5.4360 0.0600 -0.1210 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 2.7060 -1.3170 -0.0860 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 2.0030 1.3690 0.1540 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -3.5490 -0.4860 1.3420 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -4.8260 -1.0120 0.6780 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -6.7360 -0.3450 -0.6290 N 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -4.5480 0.4860 -1.2160 C 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " -2.0390 -2.4900 0.0320 F 0 0 0 0 0 0 0 0 0 0 0 0\n" +
        " 12 26 1 0 0 0 0\n" +
        " 12 13 2 0 0 0 0\n" +
        " 12 4 1 0 0 0 0\n" +
        " 13 16 1 0 0 0 0\n" +
        " 22 16 1 0 0 0 0\n" +
        " 14 16 1 0 0 0 0\n" +
        " 5 11 1 0 0 0 0\n" +
        " 7 11 1 0 0 0 0\n" +
        " 13 5 1 0 0 0 0\n" +
        " 6 21 1 0 0 0 0\n" +
        " 3 21 1 0 0 0 0\n" +
        " 17 21 1 0 0 0 0\n" +
        " 20 10 1 0 0 0 0\n" +
        " 5 3 2 0 0 0 0\n" +
        " 19 24 1 0 0 0 0\n" +
        " 23 19 1 0 0 0 0\n" +
        " 25 19 1 0 0 0 0\n" +
        " 15 9 2 0 0 0 0\n" +
        " 4 1 1 0 0 0 0\n" +
        " 15 8 1 0 0 0 0\n" +
        " 7 6 1 0 0 0 0\n" +
        " 6 2 1 0 0 0 0\n" +
        " 1 3 1 0 0 0 0\n" +
        " 1 20 2 0 0 0 0\n" +
        " 17 18 2 0 0 0 0\n" +
        " 18 20 1 0 0 0 0\n" +
        " 18 15 1 0 0 0 0\n" +
        " 22 23 1 0 0 0 0\n" +
        " 25 14 1 0 0 0 0\n" +
        "M CHG 1 8 -1\n" +
        "\n" +
        "M END\n";

    @Test
    public void cdktestBF5() throws IOException, CDKException {
        // Read both MDL representations of BF5.
        StringReader mdl1 = new StringReader(BF5_MDL3);
        org.openscience.cdk.io.MDLReader mdl1Reader = new org.openscience.cdk.io.MDLReader(mdl1);
        StringReader mdl2 = new StringReader(BF5_MDL2);
        org.openscience.cdk.io.MDLReader mdl2Reader = new org.openscience.cdk.io.MDLReader(mdl2);
        org.openscience.cdk.Molecule molMdl1 = new org.openscience.cdk.Molecule();
        mdl1Reader.read(molMdl1);
        org.openscience.cdk.Molecule molMdl2 = new org.openscience.cdk.Molecule();
        mdl2Reader.read(molMdl2);

        // The two structures are isomorphic: this assertion passes.
        assertTrue(UniversalIsomorphismTester.isIsomorph(molMdl1, molMdl2));

        // The fingerprints should therefore be equal: this assertion fails.
        Fingerprinter finger = new Fingerprinter();
        BitSet fingerprint1 = finger.getFingerprint(molMdl1);
        finger = new Fingerprinter();
        BitSet fingerprint2 = finger.getFingerprint(molMdl2);
        assertEquals(fingerprint1, fingerprint2);
    }
}

Regards,
Adel.

Egon Willighagen wrote:

On Thu, Sep 30, 2010 at 5:08 PM, Adel Golovin <[email protected]> wrote:

cdk-1.3.4

Good.
On the bus home I wrote four new unit tests for the fingerprinter, using the AtomContainerAtomPermutor and AtomContainerBondPermutor on two molecules, and they showed no problems. If you send me your file I can use that for a test too (which I would upload to the repository, if you are OK with that...)

Egon
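For readers following the thread, the property being tested (a path-based hashed fingerprint should not depend on the order in which bonds are listed) can be illustrated with a small self-contained sketch. This is not the CDK Fingerprinter and makes no claim about how it works internally; the class ToyPathFingerprint, its methods, and the five-atom example molecule are hypothetical and exist only for illustration. The sketch assumes an exhaustive enumeration of simple paths, with each path written in a direction-independent form before hashing.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy illustration only: NOT the CDK implementation.
public class ToyPathFingerprint {
    static final int SIZE = 1024; // number of bits in the toy fingerprint

    // bonds[i] = {atomA, atomB, order}, with 0-based atom indices.
    public static BitSet fingerprint(String[] symbols, int[][] bonds, int maxDepth) {
        int n = symbols.length;
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] b : bonds) {
            adj.get(b[0]).add(new int[] {b[1], b[2]});
            adj.get(b[1]).add(new int[] {b[0], b[2]});
        }
        BitSet bits = new BitSet(SIZE);
        boolean[] visited = new boolean[n];
        // Enumerate every simple path of up to maxDepth bonds from every atom.
        for (int start = 0; start < n; start++) {
            visited[start] = true;
            walk(start, symbols[start], symbols[start], 0, maxDepth, symbols, adj, visited, bits);
            visited[start] = false;
        }
        return bits;
    }

    private static void walk(int atom, String fwd, String rev, int depth, int maxDepth,
                             String[] symbols, List<List<int[]>> adj,
                             boolean[] visited, BitSet bits) {
        // Use the lexicographically smaller of the two reading directions,
        // so a path and its reverse set the same bit.
        String canonical = fwd.compareTo(rev) <= 0 ? fwd : rev;
        bits.set(Math.floorMod(canonical.hashCode(), SIZE));
        if (depth == maxDepth) {
            return;
        }
        for (int[] nb : adj.get(atom)) {
            int next = nb[0];
            if (visited[next]) {
                continue;
            }
            String bond = nb[1] == 2 ? "=" : nb[1] == 3 ? "#" : "-";
            visited[next] = true;
            walk(next, fwd + bond + symbols[next], symbols[next] + bond + rev,
                 depth + 1, maxDepth, symbols, adj, visited, bits);
            visited[next] = false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical five-atom fragment, C=C-C(=O)-O, used only as test data.
        String[] symbols = {"C", "C", "C", "O", "O"};
        int[][] sortedByOrder = {{0, 1, 2}, {2, 3, 2}, {1, 2, 1}, {2, 4, 1}};
        int[][] unsorted = {{2, 4, 1}, {0, 1, 2}, {1, 2, 1}, {2, 3, 2}};
        BitSet a = fingerprint(symbols, sortedByOrder, 3);
        BitSet b = fingerprint(symbols, unsorted, 3);
        System.out.println(a.equals(b)); // prints "true"
    }
}
```

Because the enumeration is exhaustive and the bits form a set, reordering the bond list changes only the order in which paths are visited, not which bits end up set. A bond-permutation test of the kind Egon describes checks the same invariant against the real fingerprinter.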
BF5.cif
Description: application/vnd.multiad.creator.cif
_______________________________________________ Cdk-user mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/cdk-user

