Hi Greg,

Thanks a lot! This is very helpful. Further questions:

1. If I need an RMSD matrix for clustering and choose to use GetBestRMS(),
I guess I will have to loop over all pairs of conformers myself to build
the matrix first?

2. Does AlignMolConformers() handle symmetry, i.e. does it try all
symmetry-equivalent atom permutations to get the best "RMSlist"?

3. So I guess EmbedMultipleConfs() also uses the "standard" method (no
symmetry consideration) to compute the RMS for pruning?
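
For question 1, here is a minimal sketch of the loop I have in mind. It
is only my guess at how to do it, not a tested recipe. Assumptions: the
helper name best_rms_list and the butylbenzene test molecule are made up
for illustration, I pass the conformer ids to GetBestRMS() positionally,
and I return the lower triangle as a flat list in the same order that
GetConformerRMSMatrix() uses.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def best_rms_list(mol):
    """Lower-triangle list of symmetry-aware RMSDs between all conformers.

    Order: (1,0), (2,0), (2,1), (3,0), ... i.e. row by row.
    """
    cids = [conf.GetId() for conf in mol.GetConformers()]
    rms_values = []
    for i in range(1, len(cids)):
        for j in range(i):
            # GetBestRMS() considers symmetry-equivalent atom mappings.
            # It also aligns one conformer onto the other in place, so
            # the coordinates stored in mol change as a side effect.
            rms_values.append(AllChem.GetBestRMS(mol, mol, cids[i], cids[j]))
    return rms_values


# Hypothetical test molecule; the phenyl ring gives symmetry-equivalent atoms.
mh = Chem.AddHs(Chem.MolFromSmiles('CCCCc1ccccc1'))
AllChem.EmbedMultipleConfs(mh, numConfs=5, randomSeed=1)
m = Chem.RemoveHs(mh)  # heavy atoms only; the conformers are kept
rmsmat = best_rms_list(m)
print(rmsmat)
```

The number of GetBestRMS() calls grows quadratically with the number of
conformers, so this may get slow for large conformer sets.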

I appreciate your help!

Best,
Leon


On Thu, Nov 14, 2019 at 2:45 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Leon,
>
> There's not really a "right" answer to your question - it depends on what
> you want to calculate.
> I personally think it makes more sense to use the heavy atom RMSD (which
> is what you get if you remove Hs before calculating the RMSD), particularly
> if you are comparing to experiment.
>
> Note that AllChem.GetConformerRMSMatrix() does not take symmetry into
> account, so for molecules with symmetry-equivalent atoms the RMSD values
> it returns may be too large.
> I just opened a ticket to fix this, but in the meantime if you have
> molecules with symmetry-equivalent atoms you are probably better off
> generating the conformer RMS matrix manually using GetBestRMS().
>
> Best,
> -greg
>
> On Wed, Nov 13, 2019 at 5:17 PM topgunhaides . <sunzhi....@gmail.com>
> wrote:
>
>> Hi guys,
>>
>> I am new to RDKit, and have a question about adding and removing Hs.
>>
>> As recommended in the documentation, hydrogen atoms should be added for
>> generating conformers, optimization, etc.
>>
>> However, for clustering, should the Hs be removed first, before
>> generating the conformer RMS matrix? For instance:
>>
>>
>> from rdkit import Chem
>> from rdkit.Chem import AllChem, TorsionFingerprints
>>
>> suppl = Chem.SDMolSupplier('molecule.sdf')
>>
>> for mol in suppl:
>>     mh = Chem.AddHs(mol)
>>     cids = AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
>>                                       pruneRmsThresh=0.5, numThreads=0,
>>                                       randomSeed=1)
>>     m = Chem.RemoveHs(mh)
>>     # RMS matrix
>>     rmsmat = AllChem.GetConformerRMSMatrix(m, prealigned=False)
>>     # TFD matrix
>>     tfdmat = TorsionFingerprints.GetTFDMatrix(m)
>>     print(rmsmat)
>>     print(tfdmat)
>>
>>
>> Note that I remove the Hs before computing the RMS and TFD matrices. Both
>> matrices come out different if I do not remove the Hs. The RMS values
>> without Hs generally tend to be smaller than those with Hs, which in turn
>> affects the subsequent clustering result.
>>
>> Could you guys give me some suggestions?  Thank you!
>>
>> Best,
>> Leon
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>