Hi,

Please do not duplicate questions/posts between the mailing list and GitHub
Discussions. That's spamming the community.

-greg


On Tue, Apr 23, 2024 at 4:10 PM Ariadna Llop Peiró <ariadnall...@gmail.com>
wrote:

> Hello everyone,
>
> I'm currently working with a dataset of chemical compounds, aiming to
> cluster them into different series to create a 3D-QSAR model. Up to this
> point, I've been using Morgan Fingerprints to generate the descriptors and
> cluster the compounds based on their Tanimoto Similarity:
>
> ```
> from rdkit import DataStructs
> from rdkit.Chem import AllChem
>
> # Generate fingerprint descriptor database (mols: list of RDKit Mol objects)
> fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2) for m in mols]
>
> # Calculate pairwise Tanimoto similarity between fingerprints
> similarity_matrix = []
> for i in range(len(fps)):
>     similarities = []
>     for j in range(len(fps)):
>         similarities.append(DataStructs.TanimotoSimilarity(fps[i], fps[j]))
>     similarity_matrix.append(similarities)
> ```
>
>
> With the similarity matrix, I applied hierarchical clustering based on a
> Tanimoto Similarity threshold to group similar compounds:
>
> ```
> import numpy as np
> from scipy.cluster import hierarchy
> from scipy.spatial.distance import squareform
>
> # Cluster based on Tanimoto similarity
> dists = 1 - np.array(similarity_matrix)
> hc = hierarchy.linkage(squareform(dists), method='single')
>
> # Specify a distance threshold or number of clusters.
> # Adjust this value based on your dendrogram and similarity values.
> threshold = 0.6
> clusters = hierarchy.fcluster(hc, threshold, criterion='distance')
> ```
>
> However, I'm not satisfied with the results and would like to experiment
> with MACCS Keys to see if they yield better clustering outcomes. Does
> anyone know how to cluster compounds using MACCS fingerprints? Any insights
> on the best approach to calculate similarities and cluster using these
> fingerprints would be highly appreciated.
>
> Thank you in advance for your suggestions!
>
> Ariadna Llop
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
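
For reference, the clustering step quoted above does not depend on which
fingerprint is used. A minimal sketch of that idea, in plain Python and not
taken from the original post: fingerprints are represented as sets of on-bit
indices, and cutting a single-linkage dendrogram at a distance threshold is
computed as connected components over pairs whose Tanimoto distance is at or
below the threshold. The helper names and the toy data are hypothetical. With
RDKit, the on-bit sets could presumably come from
MACCSkeys.GenMACCSKeys(mol).GetOnBits(), but that substitution is an
assumption and is not exercised here.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def single_linkage_clusters(fps, threshold):
    """Flat clusters from single linkage cut at a Tanimoto distance threshold.

    Equivalent to the connected components of the graph that links every
    pair of fingerprints whose distance (1 - similarity) is <= threshold.
    """
    parent = list(range(len(fps)))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(fps)), 2):
        if 1.0 - tanimoto(fps[i], fps[j]) <= threshold:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive cluster ids starting at 1
    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1) for i in range(len(fps))]


# Toy on-bit sets standing in for MACCS fingerprints (hypothetical data)
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {20}]
labels = single_linkage_clusters(toy_fps, threshold=0.6)
print(labels)
```

Swapping the fingerprint type changes only how the on-bit sets are produced;
the similarity and clustering code stays the same, so the threshold is the
main thing that would need re-tuning.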