Dear Markus,

On Wed, Jul 9, 2008 at 11:49 AM, markus <[email protected]> wrote:
> Hi Greg,
> I just played around with the Fingerprints and wondered if a Tversky
> Index could be implemented
> just in the way it is described by
> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
> Having alpha and beta values for optimizing the similarity score would
> be nice to play with.

Good idea, and conveniently one that's pretty easy to implement. I
just checked changes into the trunk that add the Tversky similarity
metric:
http://rdkit.svn.sourceforge.net/viewvc/rdkit?view=rev&revision=757

Here's a demonstration:
[6]>>> fp1 = Chem.RDKFingerprint(m1)
[7]>>> fp2 = Chem.RDKFingerprint(m2)
[8]>>> import DataStructs
[9]>>> DataStructs.TanimotoSimilarity(fp1,fp2)
Out[9] 0.96130753835890592
[10]>>> DataStructs.TverskySimilarity(fp1,fp2,1,1)  # this is just another way of doing the Tanimoto metric
Out[10] 0.96130753835890592
[11]>>> DataStructs.TverskySimilarity(fp1,fp2,.5,.5) # this is the Dice metric
Out[11] 0.98027210884353744
[13]>>> DataStructs.DiceSimilarity(fp1,fp2)
Out[13] 0.98027210884353744
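
To see why those values coincide, here is a minimal plain-Python sketch of
the Tversky index on sets of "on" bit indices. It is not the RDKit
implementation; the function names, the toy bit sets, and the choice to
return 0.0 when the denominator is zero are assumptions made for
illustration only.

```python
import math


def tversky(a, b, alpha, beta):
    """Tversky index for two collections of on-bit indices.

    common / (common + alpha * only_a + beta * only_b)
    Returning 0.0 for an empty denominator is an assumption of this sketch.
    """
    a, b = set(a), set(b)
    common = len(a & b)   # bits set in both fingerprints
    only_a = len(a - b)   # bits set only in the first
    only_b = len(b - a)   # bits set only in the second
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 0.0


def tanimoto(a, b):
    """Tanimoto: intersection over union."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a, b):
    """Dice: twice the intersection over the sum of the set sizes."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical toy "fingerprints": indices of the bits that are set.
bits1 = {1, 2, 3, 4}
bits2 = {3, 4, 5}

# alpha = beta = 1 reproduces Tanimoto; alpha = beta = 0.5 reproduces Dice.
assert math.isclose(tversky(bits1, bits2, 1, 1), tanimoto(bits1, bits2))
assert math.isclose(tversky(bits1, bits2, 0.5, 0.5), dice(bits1, bits2))

# With alpha != beta the measure is asymmetric: argument order matters.
print(tversky(bits1, bits2, 1, 0), tversky(bits2, bits1, 1, 0))
```

With alpha = beta the measure is symmetric; unequal weights make it depend
on which fingerprint is treated as the reference, which is the knob Markus
asked about.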

Note that the trunk in subversion now contains a lot of changes that
remove the use of the old Numeric python, so an svn update is going to
get you a lot of files. This was announced in this post:
http://sourceforge.net/mailarchive/message.php?msg_name=60825b0f0807050745t35af3acdj7142dc9a58119fc1%40mail.gmail.com

Best Regards,
-greg
