Hi guys, I am new to RDKit, and have a question about adding and removing Hs.
As recommended in the documentation, hydrogen atoms should be added before generating conformers, optimizing geometries, etc. However, for clustering, should the Hs be removed before computing the conformer RMS matrix? For instance:

    from rdkit import Chem
    from rdkit.Chem import AllChem, TorsionFingerprints

    suppl = Chem.SDMolSupplier('molecule.sdf')
    for mol in suppl:
        if mol is None:
            # skip records that could not be parsed
            continue
        # add Hs before embedding
        mh = Chem.AddHs(mol)
        cids = AllChem.EmbedMultipleConfs(mh, numConfs=5, maxAttempts=1000,
                                          pruneRmsThresh=0.5, numThreads=0,
                                          randomSeed=1)
        # remove Hs again before comparing conformers
        m = Chem.RemoveHs(mh)
        # RMS matrix
        rmsmat = AllChem.GetConformerRMSMatrix(m, prealigned=False)
        # TFD matrix
        tfdmat = TorsionFingerprints.GetTFDMatrix(m)
        print(rmsmat)
        print(tfdmat)

Note that I remove the Hs before computing the RMS and TFD matrices. Both matrices come out different if I do not remove the Hs. The RMS values without Hs tend, in general, to be smaller than those with Hs, which in turn affects the subsequent clustering result.

Could you guys give me some suggestions?

Thank you!

Best,
Leon
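P.S. To make the comparison easier to reproduce, here is a minimal sketch that computes both RMS matrices from the same set of conformers. It assumes that the optional atomIds argument of GetConformerRMSMatrix restricts both the alignment and the RMS to the listed atoms, which as far as I understand should be equivalent to removing the Hs first. The helper name rms_with_and_without_hs and the SMILES string are only placeholders I made up, standing in for the molecule in my SD file.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def rms_with_and_without_hs(mol, num_confs=5, seed=1):
    """Return (all-atom, heavy-atom) conformer RMS matrices for one molecule.

    Hypothetical helper for illustration only.
    """
    # Hs are kept on the molecule for the embedding step
    mh = Chem.AddHs(mol)
    AllChem.EmbedMultipleConfs(mh, numConfs=num_confs, randomSeed=seed)

    # indices of all non-hydrogen atoms
    heavy = [a.GetIdx() for a in mh.GetAtoms() if a.GetAtomicNum() > 1]

    # each call re-aligns the conformers to the first one before measuring
    rms_all = AllChem.GetConformerRMSMatrix(mh, prealigned=False)
    rms_heavy = AllChem.GetConformerRMSMatrix(mh, atomIds=heavy,
                                              prealigned=False)
    return rms_all, rms_heavy


# placeholder molecule (assumption), not the one from molecule.sdf
mol = Chem.MolFromSmiles("CCCCCO")
rms_all, rms_heavy = rms_with_and_without_hs(mol)
print(rms_all)
print(rms_heavy)
```

Both results are flat lists holding the lower triangle of the matrix, so for n conformers each list has n*(n-1)/2 entries and the two can be compared entry by entry.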