Hi Greg, Thanks for the help!
Sorry for the confusion. I was trying to build a symmetry-aware RMS matrix with GetBestRMS, because GetConformerRMSMatrix uses the standard RMS calculation without considering symmetry.

A further question: would it be possible to add a "GetBestRMS" option to "EmbedMultipleConfs" in the near future? I found that I lose a significant number of the conformers retained by "EmbedMultipleConfs" after "post-pruning" them with "GetBestRMS" (using the same RMS threshold). My "post-pruning" code does the same type of pruning: the first conformer is retained, and from then on only those that are at least rmsd_threshold away from all retained conformations are kept.

Thank you!
Leon

On Mon, Dec 2, 2019 at 2:37 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Leon,
>
> I'm not sure I understand the question. The clustering code returns a
> tuple of indices of the clusters. Those indices are relative to the
> indexing of the distance matrix. The `ClusterData` function doesn't know
> what you're clustering, so there's no way it could know anything about
> cluster IDs.
>
> In your case, the way to get the conformer IDs of the conformers in the
> first cluster would be something like (not tested):
>
> confs = mh.GetConformers()
> print([confs[x].GetId() for x in clusters_a[0]])
>
> -greg
>
>
>
> On Mon, Nov 25, 2019 at 6:12 PM topgunhaides . <sunzhi....@gmail.com>
> wrote:
>
>> Hi guys,
>>
>> Does clustering change conformer ID?
>> See code below:
>>
>> from rdkit import Chem
>> from rdkit.Chem import AllChem, TorsionFingerprints
>> from rdkit.ML.Cluster import Butina
>>
>> mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
>> AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
>>                            pruneRmsThresh=1.0, numThreads=0,
>>                            randomSeed=-1)
>>
>> print([conf.GetId() for conf in mh.GetConformers()])
>>
>> mh.RemoveConformer(0)
>> mh.RemoveConformer(1)
>>
>> print([conf.GetId() for conf in mh.GetConformers()])
>>
>> m = Chem.RemoveHs(mh)
>> mat_a = AllChem.GetConformerRMSMatrix(m, prealigned=False)
>> mat_b = TorsionFingerprints.GetTFDMatrix(m)
>> num = m.GetNumConformers()
>> clusters_a = Butina.ClusterData(mat_a, num, distThresh=2.0,
>>                                 isDistData=True, reordering=False)
>> clusters_b = Butina.ClusterData(mat_b, num, distThresh=2.0,
>>                                 isDistData=True, reordering=False)
>>
>> print(clusters_a)
>> print(clusters_b)
>>
>> print([conf.GetId() for conf in mh.GetConformers()])
>>
>>
>> Here is the result:
>> [0, 1, 2, 3, 4]
>> [2, 3, 4]
>> ((2, 0, 1),)
>> ((2, 0, 1),)
>> [2, 3, 4]
>>
>> You see it does not actually change the id in mh, but the result in the
>> tuple from clustering is actually index. Is this a bug? This could be
>> misleading when you try to grab conformer ids from the clustering result.
>> Thank you!
>>
>> Best,
>> Leon
>>
>>
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
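For reference, here is a rough sketch of the "post-pruning" and clustering workflow discussed in this thread. It is only an illustration, not the code actually used: the helper names (greedy_prune, lower_triangle, clusters_to_ids, best_rms_distance, demo) are made up for this example, and demo() assumes RDKit is installed. One likely reason conformers are lost in post-pruning is that a symmetry-aware RMSD can never be larger than the plain aligned RMSD, so it can only push more pairs under the same threshold.

```python
def greedy_prune(ids, dist, threshold):
    """Keep the first ID, then every ID that is at least `threshold`
    away from all IDs kept so far."""
    kept = []
    for cid in ids:
        if all(dist(cid, k) >= threshold for k in kept):
            kept.append(cid)
    return kept


def lower_triangle(ids, dist):
    """Flat lower-triangle distance list, ordered (1,0), (2,0), (2,1), ...
    This is the layout Butina.ClusterData expects when isDistData=True."""
    return [dist(ids[i], ids[j]) for i in range(1, len(ids)) for j in range(i)]


def clusters_to_ids(clusters, ids):
    """Translate cluster members (positions in the distance matrix)
    back to the conformer IDs they stand for."""
    return [[ids[idx] for idx in cluster] for cluster in clusters]


def best_rms_distance(mol):
    """Return a function giving the symmetry-aware RMSD between two
    conformers of `mol`, identified by conformer ID."""
    from rdkit import Chem
    from rdkit.Chem import AllChem

    def dist(i, j):
        # GetBestRMS leaves the probe molecule aligned to the reference,
        # so a copy is used as the probe to keep `mol` unchanged.
        return AllChem.GetBestRMS(Chem.Mol(mol), mol, prbId=i, refId=j)

    return dist


def demo():
    """Requires RDKit; not run automatically."""
    from rdkit import Chem
    from rdkit.Chem import AllChem
    from rdkit.ML.Cluster import Butina

    mh = Chem.AddHs(Chem.MolFromSmiles("OCCCN"))
    AllChem.EmbedMultipleConfs(mh, numConfs=5, pruneRmsThresh=1.0, randomSeed=42)
    m = Chem.RemoveHs(mh)  # heavy atoms only keeps the symmetry search cheap

    ids = [conf.GetId() for conf in m.GetConformers()]
    dist = best_rms_distance(m)

    kept = greedy_prune(ids, dist, threshold=1.0)
    mat = lower_triangle(kept, dist)
    clusters = Butina.ClusterData(mat, len(kept), 2.0,
                                  isDistData=True, reordering=False)
    # Cluster members are matrix indices; map them back to conformer IDs.
    print(clusters_to_ids(clusters, kept))
```

With the output quoted above, clusters_to_ids(((2, 0, 1),), [2, 3, 4]) gives [[4, 2, 3]], i.e. the conformer IDs rather than the matrix positions.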