Hi,
I computed a similarity matrix for 4176 x 4016 molecules with a program that
uses the RDKit, and it took 401 seconds. The same program, run on the same sets
of molecules but using the Daylight toolkit, took 19 seconds.
Has anybody observed similar results? Most of the time difference seems to come
from the Tanimoto similarity calculation (although fingerprint generation is
also slower). I'm concerned about the impact on, e.g., clustering algorithms
applied to large datasets.
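
To make the calculation concrete: a 4176 x 4016 matrix means about 16.8 million
pairwise Tanimoto comparisons. Below is a minimal sketch of that all-pairs
computation. It is not the program timed above and uses neither toolkit. The
fingerprints are stand-in Python integers (hypothetical toy data), and the
names popcount, tanimoto and similarity_matrix are only for illustration.

```python
def popcount(bits: int) -> int:
    """Number of set bits in a fingerprint stored as a Python int."""
    return bin(bits).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two bit fingerprints.

    Returns 0.0 when both fingerprints are empty (a convention assumed here).
    """
    union = popcount(a | b)
    if union == 0:
        return 0.0
    return popcount(a & b) / union


def similarity_matrix(rows, cols):
    """All-pairs similarity: one row per fingerprint in `rows`."""
    return [[tanimoto(r, c) for c in cols] for r in rows]


# Toy fingerprints, hypothetical data standing in for real molecules.
set_a = [0b1011, 0b0110]
set_b = [0b0011, 0b1111, 0b0000]

matrix = similarity_matrix(set_a, set_b)
for row in matrix:
    print(["%.3f" % value for value in row])
```

The inner loop runs once per pair, so the cost of a single comparison is
multiplied by the full size of the matrix.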
I used the default settings for fingerprint generation in the RDKit. I'm not
sure whether anything can be done to improve this.
Thanks a lot,
Gonzalo
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss