Hi Rajarshi,
The original data comes from the CIF file. The same data is in the Java code too.
I attach the CIF file here.

Best regards,
Adel.

Rajarshi Guha wrote:
Thanks. Interestingly, both molecules get canonicalized to the same SMILES.

But the fingerprinter gives different numbers of paths for the two molecules.

Can you send me the original SD files?

On Mon, Oct 4, 2010 at 4:04 PM, Adel Golovin <[email protected]> wrote:
Hi Rajarshi,
I attach the Java file with the structures BF5_MDL2 and BF5_MDL3.

This Java file contains other tests too, which you do not need.
To compile it, remove everything except cdktestBF5.

Best regards,
Adel.

Rajarshi Guha wrote:
Could you send the structures as attachments? Gmail is mangling the lines

On Mon, Oct 4, 2010 at 11:59 AM, Adel Golovin <[email protected]> wrote:

Hi Rajarshi,

In the test example both structures have only single or double bonds.
No aromaticity is given, and this particular structure (BF5) does not
have aromatic rings.
The problem is that the isomorphism test gives a positive result while
the fingerprints are different.
The difference between the two records is that in the first one the bonds
are sorted by bond order (double bonds first, then single) and in the
second they are in no particular order.
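
The claim that the two bond blocks differ only in ordering can be checked
without CDK. The following is a minimal sketch, assuming the fixed-width
V2000 bond line layout (three characters each for the first atom, second
atom and bond type). The class and method names are hypothetical, and only
a few bond lines from the two records are used for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BondBlockCompare {

    /** Parses one V2000 bond line into a normalized "low-high:order" key. */
    public static String normalize(String bondLine) {
        int a = Integer.parseInt(bondLine.substring(0, 3).trim());
        int b = Integer.parseInt(bondLine.substring(3, 6).trim());
        int order = Integer.parseInt(bondLine.substring(6, 9).trim());
        // Put the smaller atom index first so "5 3" and "3 5" give the same key.
        return Math.min(a, b) + "-" + Math.max(a, b) + ":" + order;
    }

    /** Returns the sorted list of normalized keys, so input order no longer matters. */
    public static List<String> canonicalBonds(String[] bondLines) {
        List<String> keys = new ArrayList<>();
        for (String line : bondLines) {
            keys.add(normalize(line));
        }
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        // A few bonds as they appear in BF5_MDL2 (double bonds first).
        String[] sortedByOrder = {
            " 12 13  2  0  0  0  0",
            "  5  3  2  0  0  0  0",
            " 12  4  1  0  0  0  0",
            " 12 26  1  0  0  0  0"
        };
        // The same bonds in the order they appear in BF5_MDL3.
        String[] unsorted = {
            " 12 26  1  0  0  0  0",
            " 12 13  2  0  0  0  0",
            " 12  4  1  0  0  0  0",
            "  5  3  2  0  0  0  0"
        };
        // prints "true"
        System.out.println(canonicalBonds(sortedByOrder).equals(canonicalBonds(unsorted)));
    }
}
```

Running the same comparison over all 29 bond lines of each record would
confirm whether the records really contain identical bonds.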

Best regards,
Adel.

Rajarshi Guha wrote:

Hmm, does this problem get resolved if you explicitly perceive
aromaticity? The hashed FP considers aromatic bonds and so bond
ordering shouldn't affect the fp (but will if it only sees
single/double bonds)

On Mon, Oct 4, 2010 at 10:46 AM, Adel Golovin <[email protected]> wrote:


Hi Egon,
thanks for looking into it.
The example with 1FH concerns the aromaticity test, and I'm going to
investigate it further.

Consider different fingerprints for isomorphic structures:
The structure is BF5:

http://www.ebi.ac.uk/pdbe-site/pdbemotif/chem/detail.jsp?code=BF5

The code below has two different MDL files representing the same
structure.
Using this code, I tested whether the structures are isomorphic and
whether they generate the same fingerprints.
The result is that they are isomorphic but generate different
fingerprints.
The only difference between them is the order of the bonds.

The code:
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.io.IOException;
import java.io.StringReader;
import java.util.BitSet;
import java.util.logging.Logger;

import org.junit.Test;
import org.openscience.cdk.exception.CDKException;
import org.openscience.cdk.fingerprint.Fingerprinter;
import org.openscience.cdk.isomorphism.UniversalIsomorphismTester;

public class FingerprintsTest {
 static final Logger log = Logger.getAnonymousLogger();
String BF5_MDL2 =
 "BF5\n" +
 "  -ISIS-            3D\n" +
 "\n" +
 " 26 29  0  0  0  0  0  0  0  0  1 V2000\n" +
 "    1.3870   -0.9840   -0.0020 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    1.4580    3.3830   -1.1290 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    1.0220    0.3780    0.1280 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    0.3730   -2.0890   -0.0500 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -0.2940    0.7270    0.2360 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    1.6240    2.7830    0.2690 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    0.2990    2.8850    1.0310 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    5.4950   -1.7360   -0.3370 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    5.9860    0.2980    0.0110 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    3.0790   -2.6160   -0.2080 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -0.6550    2.0360    0.3820 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -1.0310   -1.5910    0.0620 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -1.3360   -0.2980    0.1960 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -3.2600    1.0700   -0.6250 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    5.1450   -0.5810   -0.1270 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -2.6810    0.0840    0.3000 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    3.3150    1.0580    0.0750 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    3.7210   -0.2520   -0.0410 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -5.4360    0.0600   -0.1210 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    2.7060   -1.3170   -0.0860 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    2.0030    1.3690    0.1540 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -3.5490   -0.4860    1.3420 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -4.8260   -1.0120    0.6780 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -6.7360   -0.3450   -0.6290 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -4.5480    0.4860   -1.2160 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -2.0390   -2.4900    0.0320 F   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 " 12 13  2  0  0  0  0\n" +
 "  5  3  2  0  0  0  0\n" +
 "  1 20  2  0  0  0  0\n" +
 " 17 18  2  0  0  0  0\n" +
 " 15  9  2  0  0  0  0\n" +
 " 12  4  1  0  0  0  0\n" +
 " 13  5  1  0  0  0  0\n" +
 "  4  1  1  0  0  0  0\n" +
 "  7  6  1  0  0  0  0\n" +
 "  6  2  1  0  0  0  0\n" +
 "  1  3  1  0  0  0  0\n" +
 " 18 20  1  0  0  0  0\n" +
 " 18 15  1  0  0  0  0\n" +
 " 22 23  1  0  0  0  0\n" +
 " 25 14  1  0  0  0  0\n" +
 " 12 26  1  0  0  0  0\n" +
 " 13 16  1  0  0  0  0\n" +
 " 22 16  1  0  0  0  0\n" +
 " 14 16  1  0  0  0  0\n" +
 "  6 21  1  0  0  0  0\n" +
 "  3 21  1  0  0  0  0\n" +
 " 17 21  1  0  0  0  0\n" +
 " 23 19  1  0  0  0  0\n" +
 " 25 19  1  0  0  0  0\n" +
 "  5 11  1  0  0  0  0\n" +
 "  7 11  1  0  0  0  0\n" +
 " 20 10  1  0  0  0  0\n" +
 " 15  8  1  0  0  0  0\n" +
 " 19 24  1  0  0  0  0\n" +
 "M  CHG  1   8  -1\n" +
 "\n" +
 "M  END\n";


String BF5_MDL3 =
 "BF5\n" +
 "  -ISIS-            3D\n" +
 "\n" +
 " 26 29  0  0  0  0  0  0  0  0  1 V2000\n" +
 "    1.3870   -0.9840   -0.0020 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    1.4580    3.3830   -1.1290 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    1.0220    0.3780    0.1280 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    0.3730   -2.0890   -0.0500 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -0.2940    0.7270    0.2360 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    1.6240    2.7830    0.2690 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    0.2990    2.8850    1.0310 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    5.4950   -1.7360   -0.3370 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    5.9860    0.2980    0.0110 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    3.0790   -2.6160   -0.2080 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -0.6550    2.0360    0.3820 O   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -1.0310   -1.5910    0.0620 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -1.3360   -0.2980    0.1960 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -3.2600    1.0700   -0.6250 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    5.1450   -0.5810   -0.1270 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -2.6810    0.0840    0.3000 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    3.3150    1.0580    0.0750 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    3.7210   -0.2520   -0.0410 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -5.4360    0.0600   -0.1210 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    2.7060   -1.3170   -0.0860 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "    2.0030    1.3690    0.1540 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -3.5490   -0.4860    1.3420 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -4.8260   -1.0120    0.6780 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -6.7360   -0.3450   -0.6290 N   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -4.5480    0.4860   -1.2160 C   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 "   -2.0390   -2.4900    0.0320 F   0  0  0  0  0  0  0  0  0  0  0  0\n" +
 " 12 26  1  0  0  0  0\n" +
 " 12 13  2  0  0  0  0\n" +
 " 12  4  1  0  0  0  0\n" +
 " 13 16  1  0  0  0  0\n" +
 " 22 16  1  0  0  0  0\n" +
 " 14 16  1  0  0  0  0\n" +
 "  5 11  1  0  0  0  0\n" +
 "  7 11  1  0  0  0  0\n" +
 " 13  5  1  0  0  0  0\n" +
 "  6 21  1  0  0  0  0\n" +
 "  3 21  1  0  0  0  0\n" +
 " 17 21  1  0  0  0  0\n" +
 " 20 10  1  0  0  0  0\n" +
 "  5  3  2  0  0  0  0\n" +
 " 19 24  1  0  0  0  0\n" +
 " 23 19  1  0  0  0  0\n" +
 " 25 19  1  0  0  0  0\n" +
 " 15  9  2  0  0  0  0\n" +
 "  4  1  1  0  0  0  0\n" +
 " 15  8  1  0  0  0  0\n" +
 "  7  6  1  0  0  0  0\n" +
 "  6  2  1  0  0  0  0\n" +
 "  1  3  1  0  0  0  0\n" +
 "  1 20  2  0  0  0  0\n" +
 " 17 18  2  0  0  0  0\n" +
 " 18 20  1  0  0  0  0\n" +
 " 18 15  1  0  0  0  0\n" +
 " 22 23  1  0  0  0  0\n" +
 " 25 14  1  0  0  0  0\n" +
 "M  CHG  1   8  -1\n" +
 "\n" +
 "M  END\n";

@Test public void cdktestBF5() throws IOException, CDKException {
 // Read the two MDL records, which list the same bonds in a different order.
 StringReader mdl1 = new StringReader(BF5_MDL3);
 org.openscience.cdk.io.MDLReader mdl1Reader = new org.openscience.cdk.io.MDLReader(mdl1);
 StringReader mdl2 = new StringReader(BF5_MDL2);
 org.openscience.cdk.io.MDLReader mdl2Reader = new org.openscience.cdk.io.MDLReader(mdl2);
 org.openscience.cdk.Molecule molMdl1 = new org.openscience.cdk.Molecule();
 mdl1Reader.read(molMdl1);
 org.openscience.cdk.Molecule molMdl2 = new org.openscience.cdk.Molecule();
 mdl2Reader.read(molMdl2);

 // The two structures are reported as isomorphic.
 assertTrue(UniversalIsomorphismTester.isIsomorph(molMdl1, molMdl2));

 Fingerprinter finger = new Fingerprinter();
 BitSet fingerprint1 = finger.getFingerprint(molMdl1);
 finger = new Fingerprinter();
 BitSet fingerprint2 = finger.getFingerprint(molMdl2);

 // This is the assertion reported to fail: the fingerprints differ.
 assertEquals(fingerprint1, fingerprint2);
}
}

Regards,
Adel.

Egon Willighagen wrote:


On Thu, Sep 30, 2010 at 5:08 PM, Adel Golovin <[email protected]> wrote:



cdk-1.3.4



Good. On the bus home I wrote four new unit tests for the
fingerprinter, using the AtomContainerAtomPermutor and
AtomContainerBondPermutor on two molecules, and they showed no
problems. If you send me your file I can use that for a test too
(which I would upload to the repository, if you are OK with that...).
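
The idea behind such a bond-permutation test can be sketched in plain
Java. This is only an illustration: the toy path fingerprint below is not
the CDK Fingerprinter algorithm, the class and method names are
hypothetical, and random shuffles stand in for the systematic permutations
a permutor would produce. The small molecule (C=C-C(=O)N) is made up for
the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class PermutationInvarianceSketch {

    /** Toy hashed path fingerprint; bonds are {atomIndex1, atomIndex2, order}. */
    public static BitSet fingerprint(String[] atoms, List<int[]> bonds, int maxDepth, int size) {
        BitSet bits = new BitSet(size);
        for (int start = 0; start < atoms.length; start++) {
            boolean[] visited = new boolean[atoms.length];
            visited[start] = true;
            walk(atoms, bonds, start, atoms[start], visited, maxDepth, size, bits);
        }
        return bits;
    }

    static void walk(String[] atoms, List<int[]> bonds, int current, String path,
                     boolean[] visited, int depthLeft, int size, BitSet bits) {
        // Atom symbols here are single characters, so reversing the string
        // gives the same path walked backwards; keep the smaller spelling.
        String reversed = new StringBuilder(path).reverse().toString();
        String canonical = path.compareTo(reversed) <= 0 ? path : reversed;
        bits.set(Math.floorMod(canonical.hashCode(), size));
        if (depthLeft == 0) {
            return;
        }
        for (int[] bond : bonds) {
            int next = bond[0] == current ? bond[1] : bond[1] == current ? bond[0] : -1;
            if (next < 0 || visited[next]) {
                continue;
            }
            visited[next] = true;
            char symbol = bond[2] == 2 ? '=' : '-';
            walk(atoms, bonds, next, path + symbol + atoms[next], visited, depthLeft - 1, size, bits);
            visited[next] = false;
        }
    }

    public static void main(String[] args) {
        String[] atoms = {"C", "C", "C", "O", "N"};
        List<int[]> bonds = new ArrayList<>();
        bonds.add(new int[] {0, 1, 2});
        bonds.add(new int[] {1, 2, 1});
        bonds.add(new int[] {2, 3, 2});
        bonds.add(new int[] {2, 4, 1});

        BitSet reference = fingerprint(atoms, bonds, 3, 1024);
        Random random = new Random(42);
        for (int i = 0; i < 10; i++) {
            // Shuffle only the order in which the bonds are listed.
            List<int[]> shuffled = new ArrayList<>(bonds);
            Collections.shuffle(shuffled, random);
            if (!reference.equals(fingerprint(atoms, shuffled, 3, 1024))) {
                throw new AssertionError("fingerprint depends on bond order");
            }
        }
        System.out.println("fingerprint is invariant under bond permutation");
    }
}
```

Because the bits are set from the collection of enumerated paths, the
order in which bonds are visited should not change the result. A real
fingerprinter that fails this kind of check on a given molecule has an
order dependence somewhere in its path enumeration or hashing.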

Egon




_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user

Attachment: BF5.cif
Description: application/vnd.multiad.creator.cif
