Currently, I prefer fingerprint-based clustering, because it's hard to choose a cutoff for scaffold-based clustering. Does RDKit support scaffold-based clustering?
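For what it's worth, RDKit does provide Murcko scaffold extraction in rdkit.Chem.Scaffolds.MurckoScaffold, so the "extract the scaffolds, then put the compounds under each scaffold" step from the quoted suggestion can be sketched roughly as below. This is only a minimal sketch: the helper names murcko_scaffold_smiles and group_by_scaffold are made up for illustration, and grouping by identical scaffold SMILES is an assumption on my part (it avoids a cutoff, but it is not the same as clustering the scaffolds themselves, which would still need one).

```python
from collections import defaultdict


def murcko_scaffold_smiles(smiles):
    """Return the Murcko scaffold of a molecule as a SMILES string.

    RDKit is imported lazily here so that the grouping helper below
    can also be used with any other scaffold function.
    """
    from rdkit.Chem.Scaffolds import MurckoScaffold
    return MurckoScaffold.MurckoScaffoldSmiles(smiles=smiles)


def group_by_scaffold(smiles_list, scaffold_of=murcko_scaffold_smiles):
    """Group molecules that share exactly the same scaffold key.

    scaffold_of is a hypothetical hook: any callable mapping a SMILES
    string to a hashable scaffold key. One pass over the library, no
    pairwise distances.
    """
    groups = defaultdict(list)
    for smi in smiles_list:
        groups[scaffold_of(smi)].append(smi)
    return dict(groups)
```

With RDKit installed, group_by_scaffold(list_of_smiles) would use the Murcko scaffold as the key; the scaffold keys could then be clustered by fingerprint if coarser groups are wanted.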
On Sat, Aug 22, 2015 at 10:56 PM, <abhik1...@gmail.com> wrote:
> Hi, how about scaffold based clustering. You extract the scaffolds and
> then cluster it and then put the respective scaffold compounds inside the
> cluster.
>
> > On Aug 22, 2015, at 8:43 PM, Jing Lu <ajin...@gmail.com> wrote:
> >
> > Dear RDKit users,
> >
> > If I want to cluster more than 1M molecules by ECFP4. How could I do it?
> > If I calculate the distance between every pair of molecules, the size of
> > distance matrix will be too big. Does RDKit support any heuristic
> > clustering algorithm without calculating the distance matrix of the whole
> > library?
> >
> > Thanks,
> > Jing
> >
> > _______________________________________________
> > Rdkit-discuss mailing list
> > Rdkit-discuss@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
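On the original question of clustering without the full distance matrix: one common heuristic is a single-pass "leader" (sphere-exclusion style) algorithm, which only compares each molecule against the current cluster representatives. The sketch below is not an RDKit routine. It is plain Python, it assumes each fingerprint is given as a set of on-bit indices (with RDKit these could come from a bit vector's GetOnBits()), and the names tanimoto and leader_cluster as well as the 0.5 threshold are illustrative choices only.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)


def leader_cluster(fps, threshold=0.5):
    """Single-pass leader clustering.

    Each fingerprint joins the first existing leader it is at least
    `threshold` similar to; otherwise it becomes a new leader.
    Memory is O(N); comparisons are O(N * number_of_leaders), so the
    N x N distance matrix is never built.
    Returns a list of clusters, each a list of indices into fps,
    with the leader as the first index.
    """
    leaders = []   # indices of cluster representatives
    clusters = []  # clusters[i] holds members assigned to leaders[i]
    for idx, fp in enumerate(fps):
        for c, lead in enumerate(leaders):
            if tanimoto(fp, fps[lead]) >= threshold:
                clusters[c].append(idx)
                break
        else:
            # No leader was similar enough: start a new cluster.
            leaders.append(idx)
            clusters.append([idx])
    return clusters
```

The result depends on input order and on the threshold, so it is a rough partition rather than an optimal clustering. For a library of this size the inner loop over leaders would probably need to be replaced by a bulk comparison such as RDKit's DataStructs.BulkTanimotoSimilarity.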