Dear Markus,

On Wed, Jul 9, 2008 at 11:49 AM, markus <[email protected]> wrote:
> Hi Greg,
> I just played around with the Fingerprints and wondered if a Tversky
> Index could be implemented
> just in the way it is described by
> http://www.daylight.com/dayhtml/doc/theory/theory.finger.html
> Having alpha and beta values for optimizing the similarity score would
> be nice to play with.

Good idea. And, very nicely, a good idea that's pretty easy to implement. I just checked some changes in to the trunk to implement the Tversky similarity metric:
http://rdkit.svn.sourceforge.net/viewvc/rdkit?view=rev&revision=757

Here's a demonstration:

  [6]>>> fp1 = Chem.RDKFingerprint(m1)
  [7]>>> fp2 = Chem.RDKFingerprint(m2)
  [8]>>> import DataStructs
  [9]>>> DataStructs.TanimotoSimilarity(fp1,fp2)
  Out[9] 0.96130753835890592
  [10]>>> DataStructs.TverskySimilarity(fp1,fp2,1,1) # this is just another way of doing the tanimoto metric
  Out[10] 0.96130753835890592
  [11]>>> DataStructs.TverskySimilarity(fp1,fp2,.5,.5) # this is the Dice metric
  Out[11] 0.98027210884353744
  [13]>>> DataStructs.DiceSimilarity(fp1,fp2)
  Out[13] 0.98027210884353744

Note that the trunk in subversion now contains a lot of changes that remove the use of the old Numeric python, so an svn update is going to get you a lot of files. This was announced in this post:
http://sourceforge.net/mailarchive/message.php?msg_name=60825b0f0807050745t35af3acdj7142dc9a58119fc1%40mail.gmail.com

Best Regards,
-greg
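
P.S. For anyone who wants to see why alpha = beta = 1 gives Tanimoto and alpha = beta = 0.5 gives Dice, here is a minimal pure-Python sketch of the Tversky index on sets of on-bit indices. It is not the RDKit implementation: the function name tversky, the use of plain Python sets in place of fingerprints, and the choice to return 0.0 when the denominator is zero are all assumptions made for illustration.

```python
def tversky(bits_a, bits_b, alpha, beta):
    """Tversky index for two collections of on-bit indices.

    Hypothetical helper for illustration only, not the RDKit code.
    """
    a, b = set(bits_a), set(bits_b)
    common = len(a & b)   # bits set in both fingerprints
    only_a = len(a - b)   # bits set only in the first
    only_b = len(b - a)   # bits set only in the second
    denom = alpha * only_a + beta * only_b + common
    if denom == 0:
        # Assumption: report 0.0 when there is nothing to compare.
        return 0.0
    return common / denom


def tanimoto(bits_a, bits_b):
    """Tanimoto: common bits over the union of bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(bits_a, bits_b):
    """Dice: twice the common bits over the sum of both bit counts."""
    a, b = set(bits_a), set(bits_b)
    total = len(a) + len(b)
    return 2.0 * len(a & b) / total if total else 0.0


if __name__ == "__main__":
    fp_a = {1, 2, 3, 4}
    fp_b = {3, 4, 5}
    print(tversky(fp_a, fp_b, 1, 1), tanimoto(fp_a, fp_b))
    print(tversky(fp_a, fp_b, 0.5, 0.5), dice(fp_a, fp_b))
    # Asymmetric weighting: with alpha=1, beta=0 the score is the
    # fraction of fp_a's bits that are also set in fp_b.
    print(tversky(fp_a, fp_b, 1, 0))
```

Unequal alpha and beta make the score asymmetric, which is where the metric becomes interesting for substructure-like comparisons: with alpha = 1 and beta = 0, bits that appear only in the second fingerprint are not penalized at all.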

