Hi all,

I am trying to calculate a similarity matrix from Morgan fingerprints using Tanimoto similarity.

If I use DataStructs.TanimotoSimilarity(fp, fp) I get a similarity of 1, which is what I would expect. If I do the same with DataStructs.BulkTanimotoSimilarity(fps[i], fps[i]), I get a value of 0, which is not a similarity but a distance. I took this from the cookbook (http://www.rdkit.org/docs/Cookbook.html), which has the following in its clustering example:

    # first generate the distance matrix:
    dists = []
    nfps = len(fps)
    for i in range(1, nfps):
        sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
        dists.extend([1 - x for x in sims])


Did I misunderstand something, or is the dists list in the example actually a list of similarities, and does BulkTanimotoSimilarity actually calculate the Tanimoto distance?
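
To make the question concrete, here is a minimal sketch in plain Python of the behaviour I would expect. It does not call RDKit at all: tanimoto and bulk_tanimoto are hypothetical stand-ins I wrote for illustration, and each fingerprint is modelled as a set of on-bit indices, which is an assumption and not how RDKit stores bit vectors.

```python
# Plain-Python stand-ins for illustration only, NOT the RDKit implementation.
# A fingerprint is modelled here as a set of on-bit indices (an assumption).

def tanimoto(a, b):
    """Tanimoto similarity |a & b| / |a | b|, a value between 0 and 1."""
    union = len(a | b)
    # Two empty fingerprints: returning 1.0 is a convention for this sketch only.
    return len(a & b) / union if union else 1.0

def bulk_tanimoto(query, others):
    """Similarity of one fingerprint against each fingerprint in a list."""
    return [tanimoto(query, other) for other in others]

# Toy fingerprints, invented for this example.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8}]

# A fingerprint compared with itself: similarity 1, so distance 1 - 1 = 0.
self_sim = bulk_tanimoto(fps[0], [fps[0]])[0]
self_dist = 1 - self_sim

# Same loop shape as the cookbook snippet: similarities first, then 1 - x.
dists = []
for i in range(1, len(fps)):
    sims = bulk_tanimoto(fps[i], fps[:i])
    dists.extend(1 - x for x in sims)

print(self_sim, self_dist, dists)
```

If BulkTanimotoSimilarity behaves like bulk_tanimoto in this sketch, then sims holds similarities and dists holds distances (the lower triangle of the distance matrix, n*(n-1)/2 values), and a self-comparison should give 1 rather than the 0 I am seeing. Note that in this sketch the second argument is a list of fingerprints, not a single fingerprint.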

It would be great to get some clarification.

Thank you in advance,

Jennifer

