Hi,

I managed to perform Taylor-Butina clustering on a dataset of 193 571 fragments retrieved from ZINC20, following the instructions at https://www.macinchem.org/reviews/clustering/clustering.php. I have never used RDKit or done a cluster analysis before, so I'm really new to this type of work.

I've read the paper on Taylor-Butina clustering (https://pubs.acs.org/doi/10.1021/ci9803381), but I don't understand whether it can be considered a hierarchical method or not. Could someone help me understand this?

I'm also having problems generating images after clustering. First, I don't know which images I need: if the method is hierarchical I should produce a dendrogram, but if it isn't hierarchical there's no need (I think). So far I have only managed to plot a sparse similarity matrix; there isn't enough RAM to build a dense matrix. I wasn't able to plot the clusters or to obtain images of the molecules that are centroids or false singletons (I tried using RDKit to generate images from the fingerprints, but the resulting molecule images look strange).

My results contain thousands of clusters and false singletons. Has anyone done something like this before? Any suggestions? I have come up with my own explanation of what false and true singletons are (I obtain only false singletons, is that normal?), but I would appreciate it if someone more expert could explain the distinction and confirm my guess.

I'm sorry for all these questions, but I'm really new to this topic. I hope someone can help me.

Kind regards
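P.S. To make the question concrete, here is a minimal pure-Python sketch of how I currently understand the algorithm. It is only my reading of the paper, not RDKit's implementation. The function names `butina_sketch` and `classify_singletons` and the toy neighbour lists are hypothetical, made up for illustration. As far as I can tell, the method makes a single pass over the molecules ranked by neighbour count and produces a flat list of clusters with no merge tree, which is what makes me doubt that a dendrogram applies. My guess for the singletons is that a "true" singleton has no neighbours within the cutoff, while a "false" singleton has neighbours that were all claimed by earlier clusters.

```python
def butina_sketch(neighbors):
    """Flat, single-pass clustering in the style of Taylor-Butina.

    neighbors: dict mapping item id -> set of ids within the similarity
    cutoff (excluding the item itself). Hypothetical input format.
    """
    # Rank candidate centroids by neighbour count, largest first
    # (ties broken by id so the result is deterministic).
    order = sorted(neighbors, key=lambda i: (-len(neighbors[i]), i))
    assigned = set()
    clusters = []
    for centroid in order:
        if centroid in assigned:
            continue  # already claimed by an earlier, larger cluster
        # The centroid takes every neighbour that is still unassigned.
        members = [centroid] + sorted(
            n for n in neighbors[centroid] if n not in assigned
        )
        assigned.update(members)
        clusters.append(members)
    return clusters


def classify_singletons(clusters, neighbors):
    """Split one-member clusters into true and false singletons (my guess)."""
    true_singletons, false_singletons = [], []
    for members in clusters:
        if len(members) == 1:
            item = members[0]
            if neighbors[item]:
                false_singletons.append(item)  # had neighbours, all taken
            else:
                true_singletons.append(item)   # no neighbours at all
    return true_singletons, false_singletons


# Toy neighbour lists (made-up data, not from my ZINC20 set).
toy_neighbors = {
    0: {1, 2, 3},
    1: {0},
    2: {0},
    3: {0, 4},
    4: {3},
    5: set(),
}
clusters = butina_sketch(toy_neighbors)
true_singletons, false_singletons = classify_singletons(clusters, toy_neighbors)
print(clusters)
print(true_singletons, false_singletons)
```

In this toy case item 4 ends up alone because its only neighbour (3) was already taken by the cluster centred on 0, while item 5 has no neighbours at all. If that matches what the paper means, I would expect a real dataset to contain both kinds of singletons, so I'm unsure why I only see false ones.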