Dear Jing,
How about trying bayon?
https://code.google.com/p/bayon/
It's not part of RDKit, but I think the library can cluster molecules
using ECFP4.
Unfortunately, bayon does not take a distance matrix as input, but its
input format is easy to prepare.
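A minimal sketch of preparing that input, assuming bayon reads one
tab-separated line per item of the form "id, feature, value, feature,
value, ..." (please check this against the bayon documentation). The
helper names bayon_line and write_bayon_input are hypothetical, and the
on-bit sets are assumed to come from an ECFP4-like fingerprint, e.g.
RDKit's Morgan fingerprint with radius 2 via GetOnBits().

```python
def bayon_line(mol_id, on_bits):
    """Format one molecule as a single tab-separated line.

    Assumed layout: id <TAB> feature <TAB> value <TAB> feature <TAB> value ...
    Each set fingerprint bit becomes a feature with weight 1.
    """
    fields = [str(mol_id)]
    for bit in sorted(on_bits):
        fields.extend([str(bit), "1"])
    return "\t".join(fields)


def write_bayon_input(fingerprints, path):
    """Write a dict of {mol_id: iterable of on-bit indices} to a TSV file."""
    with open(path, "w") as handle:
        for mol_id, on_bits in fingerprints.items():
            handle.write(bayon_line(mol_id, on_bits) + "\n")


if __name__ == "__main__":
    # Toy on-bit sets standing in for real fingerprints.
    toy = {"mol1": {5, 2, 1024}, "mol2": {2, 77}}
    write_bayon_input(toy, "bayon_input.tsv")
```

Only the set bits are written, so the file grows with the number of
molecules rather than with the number of molecule pairs.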
Best regards.
Takayuki
Dear RDKit users,
I want to cluster more than 1M molecules by ECFP4. How could I do it? If
I calculate the distance between every pair of molecules, the
distance matrix will be too big. Does RDKit support any heuristic
clustering algorithm without calculating the distance matrix of the
Just an FYI on this one: I just merged a Python DP and DT implementation
onto master.
Here's the GitHub issue referencing the commits:
https://github.com/rdkit/rdkit/issues/574
I will try to get a C++ version done in time for the next release.
On Wed, Jul 15, 2015 at 11:02 AM, Greg Landrum