On Wed, Nov 24, 2010 at 3:05 PM, Thomas Strunz <beginn...@hotmail.de> wrote:
> With my arbitrary query structure I get 1586 hits using Fingerprinter and
> 1582 using ExtendedFingerprinter, both with default settings. Not to bad but
> now I actually never compared if these 1582 are all contained in the 1586
> just assumed it which is naive of course.
I took a quick look at CDK fingerprint accuracy. Some testing code is up at https://gist.github.com/718099 and the test checks that the fingerprint of a fragment is a subset of the fingerprint of the parent molecule that the fragment is derived from.

I generated a set of fragments from a subset of DrugBank small molecules. The data file is at https://gist.github.com/718123 and the format is

    fragment_smiles parent_smiles molecule_id

For the extended fingerprinter, the accuracy is 99%: out of 1144 tests, 11 fail, but the 'failures' are, as far as I can see, true negatives (primarily due to fragments having explicit hydrogens added as part of the fragmentation procedure). The standard fingerprinter gives the same results. The graph fingerprinter, on the other hand, has 100% accuracy.

-- 
Rajarshi Guha
NIH Chemical Genomics Center
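For reference, the subset check that the test relies on can be sketched with plain java.util.BitSet as below. This is only an illustration, not the code from the gist: the class name FingerprintSubset and the helper bits() are made up for this example, the bit positions are arbitrary, and it assumes the fragment and parent fingerprints have already been obtained as BitSet objects from whichever fingerprinter is being tested.

```java
import java.util.BitSet;

// Hypothetical helper class for illustration; not part of the CDK.
final class FingerprintSubset {

    // True when every bit set in the fragment fingerprint is also set
    // in the parent fingerprint, i.e. (fragment AND parent) == fragment.
    static boolean isSubset(BitSet fragment, BitSet parent) {
        BitSet common = (BitSet) fragment.clone(); // leave the inputs untouched
        common.and(parent);
        return common.equals(fragment);
    }

    // Builds a BitSet with the given positions set (stand-in for a real fingerprint).
    static BitSet bits(int... positions) {
        BitSet b = new BitSet();
        for (int p : positions) {
            b.set(p);
        }
        return b;
    }

    public static void main(String[] args) {
        BitSet parent = bits(1, 5, 9, 42);   // made-up parent fingerprint
        BitSet fragment = bits(5, 42);       // all bits present in parent
        BitSet other = bits(5, 7);           // bit 7 is missing from parent

        System.out.println(isSubset(fragment, parent));
        System.out.println(isSubset(other, parent));
    }
}
```

A fragment/parent pair counts as a pass when isSubset returns true. A false result means the fingerprint screen would have rejected the parent for that fragment, which is what the 11 failing cases above correspond to.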