On Wed, Nov 24, 2010 at 3:05 PM, Thomas Strunz <beginn...@hotmail.de> wrote:
> With my arbitrary query structure I get 1586 hits using Fingerprinter and
> 1582 using ExtendedFingerprinter, both with default settings. Not too bad, but
> I never actually checked whether these 1582 are all contained in the 1586;
> I just assumed so, which is naive of course.

I took a quick look at CDK fingerprint accuracy. Some testing code is
up at https://gist.github.com/718099 and the test is based around
checking that the fingerprint of a fragment is a subset of the
fingerprint of the parent that the fragment is derived from. I
generated a set of fragments from a subset of DrugBank small molecules
- the data file is at https://gist.github.com/718123 and the format is

fragment_smiles parent_smiles molecule_id
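
The subset check described above can be sketched roughly as follows. This
is not the code from the gist: it uses only java.util.BitSet so that it
runs without CDK, the class and method names are made up for illustration,
and the bit patterns in main() are hand-written stand-ins for real
fingerprints. It also assumes the three fields of the data file are
separated by whitespace.

```java
import java.util.BitSet;

// Hypothetical helper class, not part of CDK or of the gist.
public class FingerprintSubsetCheck {

    /** Returns true if every bit set in fragment is also set in parent. */
    public static boolean isSubset(BitSet fragment, BitSet parent) {
        BitSet leftover = (BitSet) fragment.clone(); // leave the caller's bits untouched
        leftover.andNot(parent);                     // clear every bit the parent also has
        return leftover.isEmpty();                   // anything left is missing from the parent
    }

    /**
     * Splits one data line into {fragment_smiles, parent_smiles, molecule_id}.
     * Assumes the fields are whitespace-separated.
     */
    public static String[] parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields, got " + fields.length);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Stand-in fingerprints; real ones would come from a CDK fingerprinter.
        BitSet parent = new BitSet(1024);
        parent.set(3);
        parent.set(17);
        parent.set(42);

        BitSet fragment = new BitSet(1024);
        fragment.set(3);
        fragment.set(42);

        System.out.println(isSubset(fragment, parent)); // prints "true"

        fragment.set(99); // a bit the parent does not have
        System.out.println(isSubset(fragment, parent)); // prints "false"
    }
}
```

A test counts as a pass when isSubset returns true for the fragment and
parent fingerprints of one line, and as a failure otherwise.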

For the extended fingerprinter, the accuracy is 99%: out of 1144
tests, 11 fail, but the 'failures' are, as far as I can see, true
negatives (primarily due to fragments having explicit hydrogens added
as part of the fragmentation procedure). The standard fingerprinter
gives the same results. The graph fingerprinter, on the other hand,
has 100% accuracy.

-- 
Rajarshi Guha
NIH Chemical Genomics Center
