Dear all,

I am trying to standardize (or normalize?) some molecules from different sources in order to generate a set of descriptors for them. I have done this a number of times, and each time I find the process slightly confusing. I have the following questions, if you don't mind:
1. What is the relationship between molvs and rdkit? (I remember there was an integration project between the two a while back.) When I call rdMolStandardize, does rdkit code or molvs code get called? The GitHub repo for molvs hasn't been updated in a while (2 years), but rdMolStandardize has.
2. What is the difference between standardization and normalization of a molecule? Does one automatically imply the other, or should both processes be run on a molecule?
3. Specifically, what is the difference between rdMolStandardize.Cleanup(mol), Chem.SanitizeMol(mol), and rdMolStandardize.Normalize(mol)? Should I call any of these three manually after I run "standardization/cleaning operations" such as uncharging, reionizing, etc.?
4. I understand what uncharge does, but what does the reionizer do?
5. Is there a way to chain operations together, e.g. standardize + ChooseLargestFragment + uncharge + normalize (I am not sure this order makes sense), other than creating a class instance for each, calling its method, getting back a new mol, and passing that mol to the next operation?

Apologies for the many questions. Have I missed the documentation about this? I have found some excellent examples here: https://github.com/susanhleung/rdkit/blob/dev/GSOC2018_MolVS_Integration/rdkit/Chem/MolStandardize/tutorial/MolStandardize.ipynb (thanks!). This is not exactly a cleaning pipeline, but it is still quite helpful for understanding these methods.

Many thanks,
JP
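
P.S. To make question 5 more concrete, below is a minimal sketch of the kind of chaining I have in mind. It is plain Python and does not use RDKit at all: `chain`, `strip_whitespace` and `largest_piece` are hypothetical names of my own, and the string-based steps are only stand-ins for functions that would take a mol and return a new mol.

```python
from functools import reduce
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def chain(steps: Iterable[Callable[[T], T]]) -> Callable[[T], T]:
    """Compose single-argument steps into one callable, applied left to right."""
    step_list = list(steps)

    def run(obj: T) -> T:
        # Feed the output of each step into the next one.
        return reduce(lambda acc, step: step(acc), step_list, obj)

    return run


# Stand-in steps on strings, so the sketch runs without RDKit installed.
# In a real pipeline each step would take a mol and return a new mol.
def strip_whitespace(s: str) -> str:
    return s.strip()


def largest_piece(s: str) -> str:
    # Keep the longest '.'-separated piece. This is only a crude,
    # string-length stand-in for choosing the largest fragment.
    return max(s.split("."), key=len)


pipeline = chain([strip_whitespace, largest_piece])
result = pipeline("  c1ccccc1C(=O)[O-].[Na+]  ")
print(result)
```

With RDKit, I imagine the list of steps would hold callables such as rdMolStandardize.Cleanup and the bound methods LargestFragmentChooser().choose, Uncharger().uncharge and Normalizer().normalize, but whether that is the intended usage, and in which order, is exactly what I am asking.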
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss