Hi,

Please do not duplicate questions/posts between the mailing list and GitHub Discussions. That's spamming the community.
-greg

On Tue, Apr 23, 2024 at 4:10 PM Ariadna Llop Peiró <ariadnall...@gmail.com> wrote:

> Hello everyone,
>
> I'm currently working with a dataset of chemical compounds, aiming to
> cluster them into different series to create a 3D-QSAR model. Up to this
> point, I've been using Morgan Fingerprints to generate the descriptors and
> cluster the compounds based on their Tanimoto Similarity:
>
> ```
> # Generate fingerprint descriptor database
> fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2) for m in mols]
>
> # Calculate pairwise Tanimoto similarity between fingerprints
> similarity_matrix = []
> for i in range(len(fps)):
>     similarities = []
>     for j in range(len(fps)):
>         similarities.append(DataStructs.TanimotoSimilarity(fps[i], fps[j]))
>     similarity_matrix.append(similarities)
> ```
>
> With the similarity matrix, I applied hierarchical clustering based on a
> Tanimoto Similarity threshold to group similar compounds:
>
> ```
> # Cluster based on Tanimoto similarity
> dists = 1 - np.array(similarity_matrix)
> hc = hierarchy.linkage(squareform(dists), method='single')
>
> # Specify a distance threshold or number of clusters
> threshold = 0.6  # Adjust this value based on your dendrogram and similarity values
> clusters = hierarchy.fcluster(hc, threshold, criterion='distance')
> ```
>
> However, I'm not satisfied with the results and would like to experiment
> with MACCS Keys to see if they yield better clustering outcomes. Does
> anyone know how to cluster compounds using MACCS fingerprints? Any insights
> on the best approach to calculate similarities and cluster using these
> fingerprints would be highly appreciated.
>
> Thank you in advance for your suggestions!
>
> Ariadna Llop
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
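
On the MACCS question in the quoted message: in RDKit, `MACCSkeys.GenMACCSKeys(m)` (from `rdkit.Chem`) returns a bit vector that `DataStructs.TanimotoSimilarity` accepts, so only the fingerprint-generation line of the quoted code should need to change. The rest of the quoted pipeline (pairwise Tanimoto similarity, then single-linkage clustering cut at a distance threshold) does not depend on the fingerprint type. The block below is a minimal pure-Python sketch of what that pipeline computes. It is not RDKit or SciPy code: the function names, the representation of fingerprints as sets of on-bit indices, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    # Convention chosen for this sketch: two empty fingerprints count as identical.
    return len(a & b) / union if union else 1.0


def single_linkage_clusters(fps, threshold):
    """Single-linkage clustering cut at a Tanimoto distance threshold.

    With single linkage, cutting at `threshold` is equivalent to taking the
    connected components of the graph that links every pair of fingerprints
    whose distance (1 - similarity) is <= threshold.
    """
    parent = list(range(len(fps)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(fps)), 2):
        if 1.0 - tanimoto(fps[i], fps[j]) <= threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 1..k in order of first appearance (labels are arbitrary).
    labels, seen = [], {}
    for i in range(len(fps)):
        labels.append(seen.setdefault(find(i), len(seen) + 1))
    return labels


# Hypothetical toy fingerprints, not real MACCS keys.
toy_fps = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {2, 3, 4, 5},
    {10, 11, 12},
    {10, 11, 13},
]
labels = single_linkage_clusters(toy_fps, threshold=0.6)
print(labels)
```

Because single linkage merges clusters through chains of pairwise neighbours, a loose threshold such as 0.6 can pull fairly dissimilar compounds into one cluster, whichever fingerprint is used. Trying a tighter threshold or a different linkage method alongside the fingerprint change may be worth it.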