On May 27, 2011, at 6:01 AM, Greg Landrum wrote:
> And now a more philosophical point about this.
   ..
> It seems like it would make a lot more sense for all of us if we had a
> truly open definition.


In my followup I said I was working on a fingerprinting
tool. It's now available! You can download it from

  http://code.google.com/p/chem-fingerprints/
or directly at

  http://code.google.com/p/chem-fingerprints/downloads/detail?name=chemfp-1.0a1.tgz

It supports RDKit, OpenBabel, and OEChem through the command-line
programs rdkit2fps, ob2fps, and oe2fps, respectively. Each program
also supports its toolkit's platform-dependent fingerprinters.

I've also developed platform-independent definitions for RDKit's
MACCS patterns and for substructure patterns heavily based on
the published PubChem/CACTVS fingerprints. These patterns are
implemented for each of the toolkits, and there is a validation
suite of about 400 structures and their expected bits.
(And let me say, that alone took about 30 hours of very tedious
work!)

The distribution includes the command-line tool "sdf2fps" to
extract fingerprints from an SD file. It can handle many
fingerprint formats, including the --pubchem shortcut to extract
the PubChem fingerprints. (Note: this tool does not require a
third-party chemistry toolkit.)
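
To show why no chemistry toolkit is needed for this step, here is
a minimal sketch of pulling a data tag out of SD records with
plain string handling. This is not the sdf2fps code: the function
name iter_sd_tag and the tag name used in the usage note are
made up for illustration, and the sketch ignores details such as
different line endings.

```python
def iter_sd_tag(sd_text, tag):
    """Yield (title, value) for each record in an SD file's text.

    Hypothetical helper for illustration only. 'value' is None when
    the record does not contain the requested tag.
    """
    # Records are terminated by a line containing "$$$$".
    for record in sd_text.split("$$$$\n"):
        lines = record.splitlines()
        if not any(line.strip() for line in lines):
            continue  # skip the empty chunk after the last "$$$$"
        title = lines[0].strip()  # first line of the molfile block
        value = None
        for i, line in enumerate(lines):
            # A data header looks like "> <TAGNAME>".
            if line.startswith(">") and "<%s>" % tag in line:
                data = []
                for data_line in lines[i + 1:]:
                    if not data_line.strip():
                        break  # a blank line ends the data item
                    data.append(data_line)
                value = "\n".join(data)
                break
        yield title, value
```

For example, list(iter_sd_tag(text, "FP")) gives one (title, value)
pair per record, where "FP" stands in for whatever tag holds the
encoded fingerprint in your file. Decoding the value depends on
the fingerprint format and is not shown.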


Finally, there's the command-line tool "simsearch" which
does Tanimoto similarity searching. If the query file is a
structure file then it knows how to examine the metadata
in the target fingerprint file to get the correct
fingerprinter.
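
For readers new to the idea, a Tanimoto search over binary
fingerprints can be sketched in a few lines of pure Python. This
is only an illustration of the calculation, not how simsearch is
implemented (the real code uses a C extension). The function
names and the 0.7 default threshold are my assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two equal-length fingerprints (bytes)."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = int.from_bytes(fp_a, "big")
    b = int.from_bytes(fp_b, "big")
    union = bin(a | b).count("1")  # bits set in either fingerprint
    if union == 0:
        return 0.0  # convention chosen here when both are all zeros
    common = bin(a & b).count("1")  # bits set in both fingerprints
    return common / union


def threshold_search(query_fp, targets, threshold=0.7):
    """Return (id, score) pairs with score >= threshold, best first.

    'targets' is an iterable of (id, fingerprint_bytes) pairs.
    """
    hits = []
    for target_id, target_fp in targets:
        score = tanimoto(query_fp, target_fp)
        if score >= threshold:
            hits.append((target_id, score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits
```

The score is the number of bits the two fingerprints share divided
by the number of bits set in either one, so identical fingerprints
score 1.0 and fingerprints with no bits in common score 0.0.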

The code is written in Python, with a C extension for
performance. There are no pre-built distributions.

I'm looking for feedback about the code. Let me know
if it does what you want, and if it doesn't, then let
me know what you want.


                                Andrew
                                da...@dalkescientific.com



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
