Hi JP,

Lots of good questions, and it is quite an involved topic.

I'll let others who are more knowledgeable about the background answer
the questions on the history and the relationship between the tools.

One resource that may be helpful is the
https://github.com/chembl/ChEMBL_Structure_Pipeline repo, which calls many
of the functions you mentioned. Reading the code explains the order of
steps quite well. The README also links an open access article that
explains how at least one group (ChEMBL) handles the process:
https://doi.org/10.1186/s13321-020-00456-1
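
On your question 5 (chaining), here is a minimal sketch of one way to do it
in plain Python. The helper names chain() and make_standardizer() are made
up for illustration, and the choice and order of steps is only an
assumption on my part, not the ChEMBL recipe. The rdMolStandardize calls
themselves (Cleanup, LargestFragmentChooser.choose, Uncharger.uncharge) are
the ones you mentioned.

```python
from functools import reduce


def chain(*steps):
    """Compose single-argument steps, applied left to right, into one callable."""
    def run(value):
        # Feed the output of each step into the next one.
        return reduce(lambda acc, step: step(acc), steps, value)
    return run


def make_standardizer():
    """Return a callable that applies a few rdMolStandardize steps in order.

    The steps and their order are an illustrative assumption, not a
    recommended or official pipeline.
    """
    # Imported here so that chain() stays usable without RDKit installed.
    from rdkit.Chem.MolStandardize import rdMolStandardize

    return chain(
        # Removes Hs, disconnects metals, normalizes and reionizes.
        rdMolStandardize.Cleanup,
        # Keeps only the largest fragment (drops counter-ions, solvents).
        rdMolStandardize.LargestFragmentChooser().choose,
        # Neutralizes charges where possible.
        rdMolStandardize.Uncharger().uncharge,
    )
```

You would then build it once with standardize = make_standardizer() and
call standardize(mol) on each molecule. Each step returns a new mol, so the
input is left untouched, and adding or reordering steps is just a matter of
editing the argument list.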

Best,
Matt

On Thu, Jun 17, 2021 at 2:37 PM JP Ebejer <jean.p.ebe...@um.edu.mt> wrote:

> Dear all,
>
> I am trying to standardize(/normalize?) some molecules from different
> sources, to generate a set of descriptors for them.  I have done this a
> number of times, and each time I find the process slightly confusing.  I
> have the following questions please, if you don't mind:
>
> 1.  What is the relation between molvs and rdkit (I remember there was an
> integration project between the two a while back).  When I call
> rdMolStandardize does rdkit code or molvs code get called?  The github repo
> for molvs hasn't been updated in a while (2 yrs), but rdMolStandardize has.
> 2.  What is the difference between standardization and normalization of a
> molecule?  Does one automatically imply the other or should these two
> processes be both run on a molecule?
> 3.  Specifically, what is the difference between
> rdMolStandardize.Cleanup(mol), Chem.SanitizeMol(mol), and
> rdMolStandardize.Normalize(mol)?  Should I call any of these three manually
> after I run "standardization/cleaning operations" such as uncharging,
> reionizing, etc.?
> 4.  I understand what uncharge does, but what does reionizer do?
> 5.  Is there a way to chain operations together
> (standardize+ChooseLargestFragment+uncharge+normalize; I am not sure the
> order makes sense here), other than creating a class instance for each,
> calling the method, returning a new mol and using this mol in the next
> operation?
>
> Apologies for the many questions.  Have I missed the documentation about
> this?  I have found some excellent examples here:
> https://github.com/susanhleung/rdkit/blob/dev/GSOC2018_MolVS_Integration/rdkit/Chem/MolStandardize/tutorial/MolStandardize.ipynb
> (thanks!).  This is not exactly a cleaning pipeline, but still quite
> helpful to understand these methods.
>
> Many thanks,
> JP
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
