Hi Greg,

Thanks for the help!

Sorry for the confusion. I was trying to build a symmetry-aware RMS matrix
using GetBestRMS, because GetConformerRMSMatrix uses the standard RMS method,
which does not take symmetry into account.
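
For reference, here is a minimal, untested-against-RDKit sketch of what I mean. The helper names (lower_triangle, clusters_to_ids, cluster_conformers_symmetric) are my own and not part of RDKit. It assumes that Butina.ClusterData accepts the same flattened lower-triangle layout that GetConformerRMSMatrix produces, and that working on a hydrogen-stripped copy is acceptable. It also translates the positional indices returned by the clustering back into conformer IDs.

```python
def lower_triangle(ids, dist):
    """Flattened lower-triangle distance list: d(1,0), d(2,0), d(2,1), ..."""
    return [dist(ids[i], ids[j]) for i in range(1, len(ids)) for j in range(i)]


def clusters_to_ids(clusters, ids):
    """Translate positional cluster indices back into the original IDs."""
    return [tuple(ids[k] for k in cluster) for cluster in clusters]


def cluster_conformers_symmetric(mol, dist_thresh):
    """Butina clustering on a symmetry-aware RMS matrix (requires RDKit)."""
    from rdkit import Chem
    from rdkit.Chem import rdMolAlign
    from rdkit.ML.Cluster import Butina

    # Assumption: heavy atoms only, which keeps the symmetry search cheap.
    heavy = Chem.RemoveHs(mol)
    ids = [conf.GetId() for conf in heavy.GetConformers()]

    def best_rms(a, b):
        # GetBestRMS aligns conformer a onto conformer b in place and
        # returns the lowest RMSD over symmetry-equivalent atom mappings.
        return rdMolAlign.GetBestRMS(heavy, heavy, prbId=a, refId=b)

    dmat = lower_triangle(ids, best_rms)
    clusters = Butina.ClusterData(dmat, len(ids), dist_thresh,
                                  isDistData=True, reordering=False)
    return clusters_to_ids(clusters, ids)
```

The two small helpers do not depend on RDKit, so the ordering of the matrix and the index-to-ID mapping can be checked with a toy distance function before running the real thing.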

A further question: would it be possible to add a "GetBestRMS" option to
"EmbedMultipleConfs" in the near future?
I found that I lose a significant number of the conformers retained by
"EmbedMultipleConfs" after "post-pruning" them with "GetBestRMS" (using the
same RMS threshold). My "post-pruning" code does the same type of pruning:
the first conformer is retained, and from then on only those that are at least
rmsd_threshold away from all retained conformations are kept.
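
The pruning can be sketched roughly as follows. This is an illustration rather than my exact code: greedy_prune and prune_conformers_symmetric are hypothetical helper names, and the RDKit part assumes that a hydrogen-stripped copy keeps the conformer IDs of the original molecule. Since the symmetry-aware RMSD is a minimum over equivalent atom mappings, it can never be larger than the standard RMSD, so more pairs fall below the same threshold.

```python
def greedy_prune(ids, dist, threshold):
    """Keep the first item, then every item that is at least `threshold`
    away from all items kept so far."""
    kept = []
    for cid in ids:
        if all(dist(cid, k) >= threshold for k in kept):
            kept.append(cid)
    return kept


def prune_conformers_symmetric(mol, rmsd_threshold):
    """Return the conformer IDs that survive symmetry-aware pruning
    (requires RDKit)."""
    from rdkit import Chem
    from rdkit.Chem import rdMolAlign

    # Assumption: compare heavy atoms only; the copy is expected to keep
    # the same conformer IDs as `mol`.
    heavy = Chem.RemoveHs(mol)
    ids = [conf.GetId() for conf in heavy.GetConformers()]

    def best_rms(a, b):
        # Aligns conformer a onto conformer b in place and returns the
        # symmetry-aware RMSD.
        return rdMolAlign.GetBestRMS(heavy, heavy, prbId=a, refId=b)

    return greedy_prune(ids, best_rms, rmsd_threshold)
```

Conformers whose IDs are not in the returned list could then be dropped from the original molecule with RemoveConformer.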

Thank you!
Leon




On Mon, Dec 2, 2019 at 2:37 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Leon,
>
> I'm not sure I understand the question. The clustering code returns a
> tuple of clusters, each of which is a tuple of indices. Those indices are
> relative to the indexing of the distance matrix. The `ClusterData` function
> doesn't know what you're clustering, so there's no way it could know
> anything about conformer IDs.
>
> In your case, the way to get the conformer IDs of the conformers in the
> first cluster would be something like (not tested):
>
> confs = mh.GetConformers()
> print([confs[x].GetId() for x in clusters_a[0]])
>
> -greg
>
>
>
> On Mon, Nov 25, 2019 at 6:12 PM topgunhaides . <sunzhi....@gmail.com>
> wrote:
>
>> Hi guys,
>>
>> Does clustering change conformer ID? See code below:
>>
>> from rdkit import Chem
>> from rdkit.Chem import AllChem, TorsionFingerprints
>> from rdkit.ML.Cluster import Butina
>>
>> mh = Chem.AddHs(Chem.MolFromSmiles('OCCCN'))
>> AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
>>                            pruneRmsThresh=1.0, numThreads=0,
>>                            randomSeed=-1)
>>
>> print([conf.GetId() for conf in mh.GetConformers()])
>>
>> mh.RemoveConformer(0)
>> mh.RemoveConformer(1)
>>
>> print([conf.GetId() for conf in mh.GetConformers()])
>>
>> m = Chem.RemoveHs(mh)
>> mat_a = AllChem.GetConformerRMSMatrix(m, prealigned=False)
>> mat_b = TorsionFingerprints.GetTFDMatrix(m)
>> num = m.GetNumConformers()
>> clusters_a = Butina.ClusterData(mat_a, num, distThresh=2.0,
>>                                 isDistData=True, reordering=False)
>> clusters_b = Butina.ClusterData(mat_b, num, distThresh=2.0,
>>                                 isDistData=True, reordering=False)
>>
>> print(clusters_a)
>> print(clusters_b)
>>
>> print([conf.GetId() for conf in mh.GetConformers()])
>>
>>
>> Here is the result:
>> [0, 1, 2, 3, 4]
>> [2, 3, 4]
>> ((2, 0, 1),)
>> ((2, 0, 1),)
>> [2, 3, 4]
>>
>> As you can see, clustering does not actually change the IDs in mh, but
>> the values in the tuple returned by the clustering are indices, not
>> conformer IDs. Is this a bug? It could be misleading when you try to
>> grab conformer IDs from the clustering result.
>> Thank you!
>>
>> Best,
>> Leon
>>
>>
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>