Thanks. Interestingly, both molecules get canonicalized to the same SMILES.
But the fingerprinter gives different numbers of paths for the two molecules.

Can you send me the original SD files?

On Mon, Oct 4, 2010 at 4:04 PM, Adel Golovin <[email protected]> wrote:
> Hi Rajarshi,
> I attach here the Java file with the structures: BF5_MDL2 and BF5_MDL3.
>
> This Java file contains other tests too, which you do not need.
> To compile it, remove everything except cdktestBF5.
>
> Best regards,
> Adel.
>
> Rajarshi Guha wrote:
>>
>> Could you send the structures as attachments? Gmail is mangling the lines.
>>
>> On Mon, Oct 4, 2010 at 11:59 AM, Adel Golovin <[email protected]> wrote:
>>>
>>> Hi Rajarshi,
>>>
>>> In the test example both structures have single or double bonds.
>>> There is no aromaticity given, and this particular structure (BF5)
>>> does not have aromatic rings.
>>> The problem is that the isomorphism test gives a positive result
>>> whereas the fingerprints are different.
>>> The difference between the data is that in the first one the bonds are
>>> sorted by bond order (double bonds first, then single), and in the
>>> second there is no particular order.
>>>
>>> Best regards,
>>> Adel.
>>>
>>> Rajarshi Guha wrote:
>>>>
>>>> Hmm, does this problem get resolved if you explicitly perceive
>>>> aromaticity? The hashed FP considers aromatic bonds and so bond
>>>> ordering shouldn't affect the fp (but will if it only sees
>>>> single/double bonds)
>>>>
>>>> On Mon, Oct 4, 2010 at 10:46 AM, Adel Golovin <[email protected]> wrote:
>>>>>
>>>>> Hi Egon,
>>>>> thanks for looking into it.
>>>>> The example with 1FH concerns the aromaticity test, and I'm going to
>>>>> investigate it further.
>>>>>
>>>>> Consider different fingerprints for isomorphic structures:
>>>>> The structure is BF5:
>>>>>
>>>>> http://www.ebi.ac.uk/pdbe-site/pdbemotif/chem/detail.jsp?code=BF5
>>>>>
>>>>> The code below has two different MDL files representing the same
>>>>> structure.
>>>>> Using the following code, I tested whether these structures are
>>>>> isomorphic and whether they generate the same fingerprints.
>>>>> The result is that they are isomorphic but generate different
>>>>> fingerprints.
>>>>> The only difference between them is the order of bonds.
>>>>>
>>>>> the code:
>>>>> import static org.junit.Assert.assertEquals;
>>>>> import static org.junit.Assert.assertTrue;
>>>>>
>>>>> import java.io.IOException;
>>>>> import java.io.Reader;
>>>>> import java.io.StringReader;
>>>>> import java.util.Arrays;
>>>>> import java.util.BitSet;
>>>>> import java.util.Comparator;
>>>>> import java.util.logging.Logger;
>>>>>
>>>>> import org.junit.Test;
>>>>> import org.openscience.cdk.exception.CDKException;
>>>>> import org.openscience.cdk.fingerprint.Fingerprinter;
>>>>> import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;
>>>>>
>>>>> public class FingerprintsTest {
>>>>>     static final Logger log = Logger.getAnonymousLogger();
>>>>>
>>>>>     // BF5 with the bond block sorted by bond order (double bonds first).
>>>>>     String BF5_MDL2 =
>>>>>         "BF5\n" +
>>>>>         " -ISIS- 3D\n" +
>>>>>         "\n" +
>>>>>         " 26 29  0  0  0  0  0  0  0  0  1 V2000\n" +
>>>>>         "    1.3870   -0.9840   -0.0020 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    1.4580    3.3830   -1.1290 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    1.0220    0.3780    0.1280 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    0.3730   -2.0890   -0.0500 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -0.2940    0.7270    0.2360 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    1.6240    2.7830    0.2690 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    0.2990    2.8850    1.0310 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    5.4950   -1.7360   -0.3370 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    5.9860    0.2980    0.0110 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    3.0790   -2.6160   -0.2080 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -0.6550    2.0360    0.3820 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -1.0310   -1.5910    0.0620 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -1.3360   -0.2980    0.1960 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -3.2600    1.0700   -0.6250 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    5.1450   -0.5810   -0.1270 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -2.6810    0.0840    0.3000 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    3.3150    1.0580    0.0750 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    3.7210   -0.2520   -0.0410 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -5.4360    0.0600   -0.1210 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    2.7060   -1.3170   -0.0860 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    2.0030    1.3690    0.1540 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -3.5490   -0.4860    1.3420 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -4.8260   -1.0120    0.6780 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -6.7360   -0.3450   -0.6290 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -4.5480    0.4860   -1.2160 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -2.0390   -2.4900    0.0320 F   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         " 12 13  2  0  0  0  0\n" +
>>>>>         "  5  3  2  0  0  0  0\n" +
>>>>>         "  1 20  2  0  0  0  0\n" +
>>>>>         " 17 18  2  0  0  0  0\n" +
>>>>>         " 15  9  2  0  0  0  0\n" +
>>>>>         " 12  4  1  0  0  0  0\n" +
>>>>>         " 13  5  1  0  0  0  0\n" +
>>>>>         "  4  1  1  0  0  0  0\n" +
>>>>>         "  7  6  1  0  0  0  0\n" +
>>>>>         "  6  2  1  0  0  0  0\n" +
>>>>>         "  1  3  1  0  0  0  0\n" +
>>>>>         " 18 20  1  0  0  0  0\n" +
>>>>>         " 18 15  1  0  0  0  0\n" +
>>>>>         " 22 23  1  0  0  0  0\n" +
>>>>>         " 25 14  1  0  0  0  0\n" +
>>>>>         " 12 26  1  0  0  0  0\n" +
>>>>>         " 13 16  1  0  0  0  0\n" +
>>>>>         " 22 16  1  0  0  0  0\n" +
>>>>>         " 14 16  1  0  0  0  0\n" +
>>>>>         "  6 21  1  0  0  0  0\n" +
>>>>>         "  3 21  1  0  0  0  0\n" +
>>>>>         " 17 21  1  0  0  0  0\n" +
>>>>>         " 23 19  1  0  0  0  0\n" +
>>>>>         " 25 19  1  0  0  0  0\n" +
>>>>>         "  5 11  1  0  0  0  0\n" +
>>>>>         "  7 11  1  0  0  0  0\n" +
>>>>>         " 20 10  1  0  0  0  0\n" +
>>>>>         " 15  8  1  0  0  0  0\n" +
>>>>>         " 19 24  1  0  0  0  0\n" +
>>>>>         "M  CHG  1   8  -1\n" +
>>>>>         "\n" +
>>>>>         "M  END\n";
>>>>>
>>>>>     // The same BF5 atoms and bonds, with the bond block in no particular order.
>>>>>     String BF5_MDL3 =
>>>>>         "BF5\n" +
>>>>>         " -ISIS- 3D\n" +
>>>>>         "\n" +
>>>>>         " 26 29  0  0  0  0  0  0  0  0  1 V2000\n" +
>>>>>         "    1.3870   -0.9840   -0.0020 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    1.4580    3.3830   -1.1290 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    1.0220    0.3780    0.1280 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    0.3730   -2.0890   -0.0500 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -0.2940    0.7270    0.2360 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    1.6240    2.7830    0.2690 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    0.2990    2.8850    1.0310 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    5.4950   -1.7360   -0.3370 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    5.9860    0.2980    0.0110 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    3.0790   -2.6160   -0.2080 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -0.6550    2.0360    0.3820 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -1.0310   -1.5910    0.0620 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -1.3360   -0.2980    0.1960 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -3.2600    1.0700   -0.6250 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    5.1450   -0.5810   -0.1270 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -2.6810    0.0840    0.3000 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    3.3150    1.0580    0.0750 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    3.7210   -0.2520   -0.0410 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -5.4360    0.0600   -0.1210 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    2.7060   -1.3170   -0.0860 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "    2.0030    1.3690    0.1540 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -3.5490   -0.4860    1.3420 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -4.8260   -1.0120    0.6780 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -6.7360   -0.3450   -0.6290 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -4.5480    0.4860   -1.2160 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         "   -2.0390   -2.4900    0.0320 F   0  0  0  0  0  0  0  0  0  0  0  0\n" +
>>>>>         " 12 26  1  0  0  0  0\n" +
>>>>>         " 12 13  2  0  0  0  0\n" +
>>>>>         " 12  4  1  0  0  0  0\n" +
>>>>>         " 13 16  1  0  0  0  0\n" +
>>>>>         " 22 16  1  0  0  0  0\n" +
>>>>>         " 14 16  1  0  0  0  0\n" +
>>>>>         "  5 11  1  0  0  0  0\n" +
>>>>>         "  7 11  1  0  0  0  0\n" +
>>>>>         " 13  5  1  0  0  0  0\n" +
>>>>>         "  6 21  1  0  0  0  0\n" +
>>>>>         "  3 21  1  0  0  0  0\n" +
>>>>>         " 17 21  1  0  0  0  0\n" +
>>>>>         " 20 10  1  0  0  0  0\n" +
>>>>>         "  5  3  2  0  0  0  0\n" +
>>>>>         " 19 24  1  0  0  0  0\n" +
>>>>>         " 23 19  1  0  0  0  0\n" +
>>>>>         " 25 19  1  0  0  0  0\n" +
>>>>>         " 15  9  2  0  0  0  0\n" +
>>>>>         "  4  1  1  0  0  0  0\n" +
>>>>>         " 15  8  1  0  0  0  0\n" +
>>>>>         "  7  6  1  0  0  0  0\n" +
>>>>>         "  6  2  1  0  0  0  0\n" +
>>>>>         "  1  3  1  0  0  0  0\n" +
>>>>>         "  1 20  2  0  0  0  0\n" +
>>>>>         " 17 18  2  0  0  0  0\n" +
>>>>>         " 18 20  1  0  0  0  0\n" +
>>>>>         " 18 15  1  0  0  0  0\n" +
>>>>>         " 22 23  1  0  0  0  0\n" +
>>>>>         " 25 14  1  0  0  0  0\n" +
>>>>>         "M  CHG  1   8  -1\n" +
>>>>>         "\n" +
>>>>>         "M  END\n";
>>>>>
>>>>>     @Test
>>>>>     public void cdktestBF5() throws IOException, CDKException {
>>>>>         // Read both MOL blocks into separate molecules.
>>>>>         StringReader mdl1 = new StringReader(BF5_MDL3);
>>>>>         org.openscience.cdk.io.MDLReader mdl1Reader =
>>>>>             new org.openscience.cdk.io.MDLReader(mdl1);
>>>>>         StringReader mdl2 = new StringReader(BF5_MDL2);
>>>>>         org.openscience.cdk.io.MDLReader mdl2Reader =
>>>>>             new org.openscience.cdk.io.MDLReader(mdl2);
>>>>>         org.openscience.cdk.Molecule molMdl1 = new org.openscience.cdk.Molecule();
>>>>>         mdl1Reader.read(molMdl1);
>>>>>         org.openscience.cdk.Molecule molMdl2 = new org.openscience.cdk.Molecule();
>>>>>         mdl2Reader.read(molMdl2);
>>>>>
>>>>>         // This assertion passes: the two molecules are isomorphic.
>>>>>         assertTrue(UniversalIsomorphismTester.isIsomorph(molMdl1, molMdl2));
>>>>>
>>>>>         Fingerprinter finger = new Fingerprinter();
>>>>>         BitSet fingerprint1 = finger.getFingerprint(molMdl1);
>>>>>         finger = new Fingerprinter();
>>>>>         BitSet fingerprint2 = finger.getFingerprint(molMdl2);
>>>>>
>>>>>         // This assertion fails: the fingerprints differ.
>>>>>         assertEquals(fingerprint1, fingerprint2);
>>>>>     }
>>>>> }
>>>>>
>>>>> Regards,
>>>>> Adel.
>>>>>
>>>>> Egon Willighagen wrote:
>>>>>>
>>>>>> On Thu, Sep 30, 2010 at 5:08 PM, Adel Golovin <[email protected]> wrote:
>>>>>>>
>>>>>>> cdk-1.3.4
>>>>>>>
>>>>>>
>>>>>> Good. On the bus home I wrote four new unit tests for the
>>>>>> fingerprinter, using the AtomContainerAtomPermutor and
>>>>>> AtomContainerBondPermutor on two molecules, and they showed no
>>>>>> problems. If you send me your file I can use that for a test too
>>>>>> (which I would upload to the repository, if you are OK with that...)
>>>>>>
>>>>>> Egon
>>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Virtualization is moving to the mainstream and overtaking non-virtualized
>>>>> environment for deploying applications. Does it make network security
>>>>> easier or more difficult to achieve? Read this whitepaper to separate the
>>>>> two and get a better understanding.
>>>>> http://p.sf.net/sfu/hp-phase2-d2d
>>>>> _______________________________________________
>>>>> Cdk-user mailing list
>>>>> [email protected]
>>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>>

--
Rajarshi Guha
NIH Chemical Genomics Center
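
The quoted report states that the two MOL blocks differ only in the order of their bond lines. One way to check that claim independently of CDK is to compare the bond blocks as unordered collections. The following is a minimal sketch in plain Java; the class and method names (BondBlockCompare, bondKey, canonicalBonds, sameBonds) are hypothetical and not part of CDK, and it assumes whitespace-separated bond fields, which holds for a 26-atom structure but not for V2000 files whose atom indices fill all three columns.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of CDK: compares V2000 bond blocks
// while ignoring the order in which the bond lines appear.
class BondBlockCompare {

    // Turns one bond line such as " 12 13  2  0  0  0  0" into the key "12-13:2".
    // The two atom indices are sorted, so "13 12 2" yields the same key.
    // Assumption: fields are separated by whitespace (atom indices below 100).
    static String bondKey(String line) {
        String[] fields = line.trim().split("\\s+");
        int a = Integer.parseInt(fields[0]);
        int b = Integer.parseInt(fields[1]);
        int order = Integer.parseInt(fields[2]);
        return Math.min(a, b) + "-" + Math.max(a, b) + ":" + order;
    }

    // Returns the sorted list of bond keys for a block of bond lines.
    static List<String> canonicalBonds(String bondBlock) {
        List<String> keys = new ArrayList<>();
        for (String line : bondBlock.split("\n")) {
            if (!line.trim().isEmpty()) {
                keys.add(bondKey(line));
            }
        }
        Collections.sort(keys);
        return keys;
    }

    // True when both blocks contain the same bonds, in any order.
    static boolean sameBonds(String blockA, String blockB) {
        return canonicalBonds(blockA).equals(canonicalBonds(blockB));
    }

    public static void main(String[] args) {
        // Three bonds excerpted from the quoted test, in two different orders.
        String sortedByOrder =
            " 12 13  2  0  0  0  0\n" +
            "  5  3  2  0  0  0  0\n" +
            " 12  4  1  0  0  0  0\n";
        String unsorted =
            " 12  4  1  0  0  0  0\n" +
            " 12 13  2  0  0  0  0\n" +
            "  5  3  2  0  0  0  0\n";
        System.out.println(sameBonds(sortedByOrder, unsorted)); // prints: true
    }
}
```

Passing the full 29-line bond blocks of BF5_MDL2 and BF5_MDL3 to sameBonds would confirm whether the inputs really differ only in bond order before looking for the cause inside the fingerprinter.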

