Dear RDKit users,

How can I cluster more than 1M molecules by their ECFP4 fingerprints? If I calculate the distance between every pair of molecules, the distance matrix will be too large. Does RDKit support any heuristic clustering algorithm that avoids calculating the distance matrix for the whole library?
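
To put a rough number on the matrix size: a condensed (upper-triangle) distance matrix for n molecules has n(n-1)/2 entries. The sketch below is only a back-of-the-envelope estimate. It assumes exactly 1,000,000 molecules and 4-byte floats per distance, both chosen for illustration, and the helper name `condensed_matrix_size` is hypothetical.

```python
def condensed_matrix_size(n_molecules, bytes_per_entry=4):
    """Return (number of unique pairs, bytes needed) for a condensed distance matrix.

    bytes_per_entry=4 assumes 32-bit floats; this is an illustrative assumption.
    """
    n_pairs = n_molecules * (n_molecules - 1) // 2
    return n_pairs, n_pairs * bytes_per_entry


# Assumed library size of exactly one million molecules.
n_pairs, n_bytes = condensed_matrix_size(1_000_000)
print(f"{n_pairs:,} pairs, about {n_bytes / 1e12:.1f} TB")
```

Under these assumptions that is roughly 5 x 10^11 pairwise distances, or about 2 TB, which is why I am looking for a method that never materializes the full matrix.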
Thanks, Jing
------------------------------------------------------------------------------
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss