Hi,
I managed to perform Taylor-Butina clustering on a dataset of 193 571
fragments retrieved from ZINC20.
I followed the instructions at this link:
https://www.macinchem.org/reviews/clustering/clustering.php
I've never used RDKit before and have never done a cluster analysis, so I'm
really new to this type of work. I've read the paper on Taylor-Butina
clustering (https://pubs.acs.org/doi/10.1021/ci9803381), but I don't understand
whether it can be considered a hierarchical method or not.
Could someone help me understand this?
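To make the question concrete, here is a minimal pure-Python sketch of the
procedure as I read it from the paper. It is not RDKit's implementation; the
name `butina_sketch`, the toy similarity matrix and the 0.6 cutoff are made up
for illustration, and ties are broken by index, which is an assumption.

```python
def butina_sketch(sim, threshold):
    """Flat clustering in the spirit of Taylor-Butina (illustrative only).

    sim: symmetric similarity matrix as a list of lists.
    threshold: minimum similarity for two items to count as neighbours.
    Returns a list of clusters; the first index of each one is its centroid.
    """
    n = len(sim)
    # Neighbour set of every item (the item itself is excluded).
    neighbours = [
        {j for j in range(n) if j != i and sim[i][j] >= threshold}
        for i in range(n)
    ]
    # Candidate centroids are visited from most to fewest neighbours.
    order = sorted(range(n), key=lambda i: (-len(neighbours[i]), i))
    assigned = set()
    clusters = []
    for i in order:
        if i in assigned:
            continue  # already absorbed by an earlier, larger cluster
        members = [i] + sorted(neighbours[i] - assigned)
        assigned.update(members)
        clusters.append(members)
    return clusters


# Toy data: 5 items, hypothetical similarities.
sim = [
    [1.0, 0.8, 0.7, 0.1, 0.1],
    [0.8, 1.0, 0.2, 0.1, 0.1],
    [0.7, 0.2, 1.0, 0.65, 0.1],
    [0.1, 0.1, 0.65, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 1.0],
]
clusters = butina_sketch(sim, 0.6)
print(clusters)
```

In this reading the result is a single flat list of clusters: nothing is ever
merged or split afterwards, so there is no tree to draw. That is why the method
is usually described as non-hierarchical.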
I also have some problems generating images after clustering.
First, I don't know which images I need: if the method is hierarchical I should
draw a dendrogram, but if it isn't hierarchical there's no need (I think).
I only managed to plot a sparse similarity matrix; I don't have enough RAM to
build a dense one.
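A rough back-of-the-envelope estimate of the dense matrix size for this
dataset, assuming one 8-byte float per similarity value (the storage type is an
assumption; a 4-byte float would halve the numbers):

```python
n = 193_571  # number of fragments in the dataset

# Unique pairs (upper triangle without the diagonal).
n_pairs = n * (n - 1) // 2

# Memory for the condensed (triangle only) and the full square matrix,
# assuming 8 bytes per value.
bytes_condensed = n_pairs * 8
bytes_full = n * n * 8

print(f"{n_pairs:,} unique pairs")
print(f"condensed float64 matrix: {bytes_condensed / 1e9:.1f} GB")
print(f"full float64 matrix:      {bytes_full / 1e9:.1f} GB")
```

That is about 150 GB for the triangle alone and about 300 GB for the full
square matrix, which explains why only the sparse form fits in memory.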
I wasn't able to plot the clusters or to obtain images of the molecules that
are centroids or false singletons (I tried using RDKit to generate images from
the fingerprints, but the resulting images of the molecules look strange). I
have thousands of clusters and false singletons as results.
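As far as I understand, a fingerprint is a hashed bit vector and cannot be
turned back into a structure, so the images probably have to come from the
original molecules, looked up by cluster index. A minimal sketch of that
bookkeeping follows. It assumes the clusters are tuples of indices with the
centroid first (which is how RDKit's `Butina.ClusterData` documents its output)
and that the SMILES list is in the same order as the fingerprints. The toy
SMILES, the helper names and the output file name are hypothetical.

```python
def centroid_indices(clusters):
    """Return the first index of each cluster, assumed to be its centroid."""
    return [c[0] for c in clusters]


def draw_molecules(smiles_list, legends, path):
    """Draw the given SMILES on a grid and save it as an image file."""
    # Imported lazily so the index bookkeeping below runs without RDKit.
    from rdkit import Chem
    from rdkit.Chem import Draw

    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    # In a plain script this returns a PIL image; in a notebook the return
    # type may differ.
    img = Draw.MolsToGridImage(
        mols, molsPerRow=4, subImgSize=(200, 200), legends=legends
    )
    img.save(path)


# Toy data: SMILES in the same order as the fingerprints used for clustering.
smiles = ["CCO", "CCN", "c1ccccc1", "CC(=O)O"]
clusters = ((0, 1), (2,), (3,))

centroid_smiles = [smiles[i] for i in centroid_indices(clusters)]
singleton_smiles = [smiles[c[0]] for c in clusters if len(c) == 1]
print(centroid_smiles)
print(singleton_smiles)

# With RDKit installed, the centroids could then be drawn like this:
# draw_molecules(centroid_smiles,
#                [f"cluster {k}" for k in range(len(clusters))],
#                "centroids.png")
```

With thousands of clusters it is probably more practical to draw only the
largest ones, or to write several grids of a few dozen molecules each.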
Has anyone done something like this before? Any suggestions?
I came up with my own explanation of what false and true singletons are (I
obtain only false singletons, is that normal?), but I would appreciate it if
someone more experienced could explain the difference and confirm my guess.
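My guess, written as a minimal sketch so it can be checked: a true singleton
has no neighbour at all above the cutoff, while a false singleton does have
neighbours, but they were all claimed by other clusters first. The function
name, the toy matrix, the cluster list and the 0.6 cutoff are made up for
illustration.

```python
def classify_singletons(clusters, sim, threshold):
    """Split one-member clusters into (true_singletons, false_singletons)."""
    true_singletons, false_singletons = [], []
    for members in clusters:
        if len(members) != 1:
            continue
        i = members[0]
        # Does item i have any neighbour at or above the cutoff?
        has_neighbour = any(
            j != i and sim[i][j] >= threshold for j in range(len(sim))
        )
        if has_neighbour:
            false_singletons.append(i)  # neighbours exist but were taken
        else:
            true_singletons.append(i)   # no neighbours at this cutoff
    return true_singletons, false_singletons


# Toy data: item 3 is similar to item 2, which belongs to the first cluster;
# item 4 is similar to nothing.
sim = [
    [1.0, 0.8, 0.7, 0.1, 0.1],
    [0.8, 1.0, 0.2, 0.1, 0.1],
    [0.7, 0.2, 1.0, 0.65, 0.1],
    [0.1, 0.1, 0.65, 1.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 1.0],
]
clusters = [[0, 1, 2], [3], [4]]
true_s, false_s = classify_singletons(clusters, sim, 0.6)
print("true singletons:", true_s)
print("false singletons:", false_s)
```

If this reading is right, getting only false singletons would mean that every
molecule in the dataset has at least one neighbour at the chosen cutoff. For a
dataset of this size the check would have to run on the neighbour lists or the
sparse matrix rather than on a full matrix as in the toy example.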
I'm sorry for all these questions, but I'm really new to this topic.
Hope someone can help me,
kind regards.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
  • [Rdkit-discuss]... Francesca Magarotto - francesca.magarot...@studio.unibo.it
