Hi Leon,

I'm not sure I understand the question. The clustering code returns a tuple
of clusters, each of which is a tuple of indices. Those indices are relative
to the indexing of the distance matrix. The `ClusterData` function doesn't
know what you're clustering, so there's no way it could know anything about
conformer IDs.
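
To make "relative to the indexing of the distance matrix" concrete, here is a
minimal sketch in plain Python. It assumes the flat lower-triangle layout that
`GetConformerRMSMatrix` documents (pairs stored in the order (1,0), (2,0),
(2,1), ...). The helper name `condensed_index` is made up for illustration and
is not part of the RDKit.

```python
def condensed_index(i, j):
    """Position of the (i, j) distance in a flat lower-triangle list.

    Hypothetical helper for illustration; i and j are row positions
    in the distance matrix, not conformer IDs.
    """
    if i == j:
        raise ValueError("the diagonal is not stored")
    if i < j:
        i, j = j, i  # the matrix is symmetric, so order does not matter
    return i * (i - 1) // 2 + j


# Conformer IDs left after removing conformers 0 and 1 (from the example).
conf_ids = [2, 3, 4]

# Three conformers give three pairwise distances, addressed by position.
pairs = [(i, j) for i in range(len(conf_ids)) for j in range(i)]
flat_positions = [condensed_index(i, j) for i, j in pairs]
print(pairs)           # positions in the matrix: 0, 1, 2
print(flat_positions)  # where each pair lives in the flat list
```

Nothing in the flat list records the values 2, 3, 4, so the only thing
clustering can hand back is the positions 0, 1, 2.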

In your case, the way to get the conformer IDs of the conformers in the
first cluster would be something like (not tested):

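# materialize the conformers so they can be looked up by position
confs = list(mh.GetConformers())
# clusters_a[0] holds positions in the distance matrix, not conformer IDs
print([confs[x].GetId() for x in clusters_a[0]])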
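
The same lookup can be applied to every cluster at once. What follows is a
minimal sketch in plain Python, using the IDs and clusters from the output
quoted below. The helper name `cluster_indices_to_conf_ids` is hypothetical,
and it assumes the conformers are in the same order they had when the distance
matrix was built.

```python
def cluster_indices_to_conf_ids(clusters, conf_ids):
    """Translate matrix positions in each cluster into conformer IDs.

    Hypothetical helper for illustration; `conf_ids[k]` must be the ID of
    the conformer that occupied row k of the distance matrix.
    """
    return tuple(tuple(conf_ids[i] for i in cluster) for cluster in clusters)


# With the RDKit this list would come from something like
#   conf_ids = [conf.GetId() for conf in m.GetConformers()]
# Here the values from the example output are hard-coded instead.
conf_ids = [2, 3, 4]
clusters_a = ((2, 0, 1),)

clusters_by_id = cluster_indices_to_conf_ids(clusters_a, conf_ids)
print(clusters_by_id)  # ((4, 2, 3),)
```

The first entry of each cluster stays first after the translation, so if you
rely on it being the cluster centroid that still holds.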

-greg



On Mon, Nov 25, 2019 at 6:12 PM topgunhaides . <sunzhi....@gmail.com> wrote:

> Hi guys,
>
> Does clustering change conformer IDs? See the code below:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem, TorsionFingerprints
> from rdkit.ML.Cluster import Butina
>
> mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
> AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
>                            pruneRmsThresh=1.0, numThreads=0, randomSeed=-1)
>
> print([conf.GetId() for conf in mh.GetConformers()])
>
> mh.RemoveConformer(0)
> mh.RemoveConformer(1)
>
> print([conf.GetId() for conf in mh.GetConformers()])
>
> m = Chem.RemoveHs(mh)
> mat_a = AllChem.GetConformerRMSMatrix(m, prealigned=False)
> mat_b = TorsionFingerprints.GetTFDMatrix(m)
> num = m.GetNumConformers()
> clusters_a = Butina.ClusterData(mat_a, num, distThresh=2.0,
>                                 isDistData=True, reordering=False)
> clusters_b = Butina.ClusterData(mat_b, num, distThresh=2.0,
>                                 isDistData=True, reordering=False)
>
> print(clusters_a)
> print(clusters_b)
>
> print([conf.GetId() for conf in mh.GetConformers()])
>
>
> Here is the result:
> [0, 1, 2, 3, 4]
> [2, 3, 4]
> ((2, 0, 1),)
> ((2, 0, 1),)
> [2, 3, 4]
>
> As you can see, clustering does not actually change the IDs in mh, but the
> tuples it returns contain indices rather than conformer IDs. Is this a bug?
> It could be misleading when you try to grab conformer IDs from the
> clustering result. Thank you!
>
> Best,
> Leon
>
>
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>