[Rdkit-discuss] RDKit molecule standardization/normalization protocol

JP Ebejer Thu, 17 Jun 2021 11:38:07 -0700

Dear all,

I am trying to standardize(/normalize?) some molecules from different
sources, to generate a set of descriptors for them.  I have done this a
number of times, and each time I find the process slightly confusing.  I
have the following questions please, if you don't mind:


1.  What is the relation between molvs and rdkit?  (I remember there was an
integration project between the two a while back.)  When I call
rdMolStandardize, does rdkit code or molvs code get called?  The github repo
for molvs hasn't been updated in a while (2 yrs), but rdMolStandardize has.
2.  What is the difference between standardization and normalization of a
molecule?  Does one automatically imply the other or should these two
processes be both run on a molecule?
3.  Specifically, what is the difference between
rdMolStandardize.Cleanup(mol), Chem.SanitizeMol(mol), and
rdMolStandardize.Normalize(mol)?  Should I call any of these three manually
after I run "standardization/cleaning operations" such as uncharging,
reionizing, etc?
4.  I understand what uncharge does, but what does reionizer do?
5.  Is there a way to chain operations together, e.g.
standardize+ChooseLargestFragment+uncharge+normalize (I am not sure the order
makes sense here), other than creating a class instance for each, calling
the method, returning a new mol, and using this mol in the next operation?
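
To make question 5 concrete, here is a minimal sketch of one way such chaining
might look.  The helper names chain and build_pipeline are hypothetical (they
are not part of RDKit), and the step order shown is only an illustrative
assumption, not a recommended protocol.  The rdMolStandardize calls used
(Cleanup, LargestFragmentChooser.choose, Uncharger.uncharge) each take a mol
and return a new mol, which is what makes simple left-to-right composition
possible.

```python
from functools import reduce


def chain(*steps):
    """Hypothetical helper: compose callables left to right.

    chain(f, g)(x) is equivalent to g(f(x)); with no steps, the input
    is returned unchanged.
    """
    def run(mol):
        return reduce(lambda current, step: step(current), steps, mol)
    return run


def build_pipeline():
    """Hypothetical example pipeline; the step order is an assumption."""
    # Imported here so that chain() stays usable without RDKit installed.
    from rdkit.Chem.MolStandardize import rdMolStandardize

    chooser = rdMolStandardize.LargestFragmentChooser()
    uncharger = rdMolStandardize.Uncharger()
    return chain(
        rdMolStandardize.Cleanup,   # returns a cleaned-up copy of the mol
        chooser.choose,             # keeps the largest fragment
        uncharger.uncharge,         # neutralizes charges where possible
    )
```

With RDKit installed, this could be used as
pipeline = build_pipeline() followed by
clean = pipeline(Chem.MolFromSmiles(smiles)).  Each step still builds a new
mol, so this only hides the intermediate variables; it does not avoid the
copies.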

Apologies for the many questions.  Have I missed the documentation about
this?  I have found some excellent examples here:
https://github.com/susanhleung/rdkit/blob/dev/GSOC2018_MolVS_Integration/rdkit/Chem/MolStandardize/tutorial/MolStandardize.ipynb
(thanks!).  This is not exactly a cleaning pipeline, but still quite
helpful to understand these methods.

Many thanks,
JP

