Dear JP,

To confuse you even more, you can also have a look at the ChEMBL open-source molecular standardizer:

No need to thank me. :D
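
On question 5 below (chaining operations), here is a rough sketch rather than an official recipe. The helper names chain and rdkit_pipeline are made up for illustration, and the step order (cleanup, then largest fragment, then uncharge) is only an assumption; adjust it to your data. The rdMolStandardize calls used are Cleanup, LargestFragmentChooser().choose and Uncharger().uncharge.

```python
from functools import reduce


def chain(*steps):
    """Compose steps left to right; each step takes a mol and returns a new mol."""
    def run(mol):
        return reduce(lambda current, step: step(current), steps, mol)
    return run


def rdkit_pipeline():
    """Hypothetical helper: build one possible standardization chain."""
    # Imported lazily so that chain() itself has no RDKit dependency.
    from rdkit.Chem.MolStandardize import rdMolStandardize

    chooser = rdMolStandardize.LargestFragmentChooser()
    uncharger = rdMolStandardize.Uncharger()
    return chain(
        rdMolStandardize.Cleanup,  # general cleanup of the input mol
        chooser.choose,            # keep only the largest fragment
        uncharger.uncharge,        # neutralize charges where possible
    )


try:
    from rdkit import Chem

    # Sodium acetate as a small test input (salt plus charged fragment).
    mol = Chem.MolFromSmiles("[Na+].CC(=O)[O-]")
    print(Chem.MolToSmiles(rdkit_pipeline()(mol)))
except ImportError:
    print("RDKit is not installed; only chain() is usable here.")
```

The instances are created once and reused, so each molecule is simply passed through the same list of callables.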

On 18/06/2021 03:12, JP Ebejer wrote:
Dear all,

I am trying to standardize(/normalize?) some molecules from different
sources, to generate a set of descriptors for them.  I have done this
a number of times, and each time I find the process slightly
confusing.  I have the following questions please, if you don't mind:

1.  What is the relation between molvs and rdkit?  (I remember there was
an integration project between the two a while back.)  When I call
rdMolStandardize, does rdkit code or molvs code get called?  The github
repo for molvs hasn't been updated in a while (2 years), but
rdMolStandardize has.
2.  What is the difference between standardization and normalization
of a molecule?  Does one automatically imply the other or should these
two processes be both run on a molecule?
3.  Specifically, what is the difference between
rdMolStandardize.Cleanup(mol), Chem.SanitizeMol(mol), and
rdMolStandardize.Normalize(mol)?  Should I call any of these three
manually after I run "standardization/cleaning operations" such as
uncharging, reionizing, etc.?
4.  I understand what uncharge does, but what does reionizer do?
5.  Is there a way to chain operations together, e.g.
standardize+ChooseLargestFragment+uncharge+normalize (I am not sure the
order makes sense here), other than creating a class instance for each,
calling the method, returning a new mol and using this mol in the next

Apologies for the many questions.  Have I missed the documentation
about this?  I have found some excellent examples here:
(thanks!).  This is not exactly a cleaning pipeline, but still quite
helpful to understand these methods.

Many thanks,
Rdkit-discuss mailing list
