[Rdkit-discuss] speed of Tanimoto similarity calculations

Gonzalo Colmenarejo-Sanchez Fri, 20 Apr 2012 01:29:39 -0700

Hi,

I computed a 4176 x 4016 similarity matrix with a program that uses the 
RDKit, and it took 401 seconds. The same program, run on the same sets of 
molecules but using the Daylight toolkit, took 19 seconds.


Has anybody observed similar results? Most of the difference in time seems to 
come from the Tanimoto similarity calculation (although fingerprint generation 
is also slower). I'm concerned about the impact on, e.g., clustering 
algorithms applied to large datasets.

I used the default settings for fingerprint generation in the RDKit. I'm not 
sure whether anything can be done to improve this.
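
For reference, a 4176 x 4016 matrix is about 16.8 million pairwise 
comparisons. The sketch below shows the kind of calculation involved. It is 
not the actual program, and it uses neither the RDKit nor the Daylight API: 
as an assumption for illustration, each fingerprint is a plain Python integer 
used as a bit set, and the names popcount, tanimoto and similarity_matrix are 
hypothetical helpers.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: |A & B| / (|A| + |B| - |A & B|)."""
    common = popcount(a & b)
    union = popcount(a) + popcount(b) - common
    # Two empty fingerprints share no bits; define their similarity as 0.0.
    return common / union if union else 0.0


def similarity_matrix(rows, cols):
    """Full rows x cols Tanimoto matrix as a list of lists."""
    # Bit counts of the column fingerprints are computed once and reused.
    col_counts = [popcount(c) for c in cols]
    matrix = []
    for r in rows:
        r_count = popcount(r)
        row = []
        for c, c_count in zip(cols, col_counts):
            common = popcount(r & c)
            union = r_count + c_count - common
            row.append(common / union if union else 0.0)
        matrix.append(row)
    return matrix


# Toy 4-bit fingerprints (made-up values, not real molecules).
rows = [0b1011, 0b0110]
cols = [0b1011, 0b0011, 0b0000]
matrix = similarity_matrix(rows, cols)
```

The inner loop does one AND and one bit count per pair, so the per-pair cost 
of those two operations largely determines the total time for a matrix of 
this size.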

Thanks a lot,

Gonzalo
