Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-28 Thread Paolo Tosco
Hi JP, you are welcome, thanks a lot for reporting the problem with a reproducible example! No need to bother filing a GitHub issue, I have already done that and also submitted a fix: https://github.com/rdkit/rdkit/pull/4282 Reionizing is good to make sure that charges are shuffled around if needed and l
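
For readers following the reionization point above, here is a minimal sketch of running that step on its own with RDKit's `rdMolStandardize` module. The helper name `reionize_smiles` and the example SMILES (a molecule carrying both a sulfinate and a sulfonic acid group) are illustrative assumptions, not taken from the thread. The intent of the step is that the negative charge ends up on the more acidic group; the exact output string is not asserted here.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def reionize_smiles(smiles: str) -> str:
    """Parse a SMILES, run RDKit's reionization step, return canonical SMILES.

    Hypothetical helper for illustration only; not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # Reionize moves protons between ionizable groups; it does not
    # change the net formal charge of the molecule.
    reionized = rdMolStandardize.Reionize(mol)
    return Chem.MolToSmiles(reionized)


if __name__ == "__main__":
    # Assumed example: charge drawn on the sulfinate, proton on the sulfonic acid.
    example = "C1=C(C=CC(=C1)[S]([O-])=O)[S](O)(=O)=O"
    print(reionize_smiles(example))
```

Whether reionizing is needed after other steps depends on the pipeline; this sketch only shows the call in isolation.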

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-28 Thread JP Ebejer
Hi Paolo! Nice to hear from you -- and thanks for the lightning-fix+working example. Very helpful as usual. (I don't imagine you need me to open a GitHub issue on this, but I'd be happy to if you think that is helpful/want to keep a record). Any thoughts on whether it is useful to reionize after

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-24 Thread Paolo Tosco
Hi JP, the problem is caused by the reaction SMARTS that standardizes pyridine *N*-oxides not being very specific and also hitting your molecule, which is not actually an *N*-oxide but rather an *N*-hydroxypyridinium ion. I will submit a PR to fix the reaction pattern; in the meantime you can fix t
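
To see how the normalization step treats a given structure, a sketch along the following lines can be used. It runs RDKit's default `Normalizer` (which applies a list of reaction-SMARTS transformations) and prints the input and output side by side. The helper name `normalize_smiles` and the SMILES `O[n+]1ccccc1` are assumptions chosen as a stand-in for an *N*-hydroxypyridinium ion; the actual molecule from the thread is not shown here, and this is not the workaround referred to above. The printed result may differ between RDKit versions depending on whether the fixed pattern is included.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def normalize_smiles(smiles: str) -> str:
    """Apply RDKit's default normalization transforms, return canonical SMILES.

    Hypothetical helper for illustration only; not part of RDKit.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # Normalizer() with no arguments uses RDKit's built-in transformation list.
    normalizer = rdMolStandardize.Normalizer()
    return Chem.MolToSmiles(normalizer.normalize(mol))


if __name__ == "__main__":
    # Assumed stand-in for an N-hydroxypyridinium ion (not the thread's molecule).
    for smi in ("CCO", "O[n+]1ccccc1"):
        print(smi, "->", normalize_smiles(smi))
```

Comparing input and output like this is a quick way to spot a transformation that fires on a structure it was not meant for.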

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-24 Thread JP Ebejer
Apologies that I took my sweet time to reply; I went down the standardization rabbit-hole and went through most of the material (thanks Matthew and Francois, but also links from other notebooks). The recording of the OpenScience session is excellent and crystal clear as usual, Greg. I enjoyed that. I

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-22 Thread Francois Berenger
Dear JP, To confuse you even more, you can also have a look at the ChEMBL open-source molecular standardizer: https://github.com/chembl/ChEMBL_Structure_Pipeline/blob/master/chembl_structure_pipeline/standardizer.py No need to thank me. :D On 18/06/2021 03:12, JP Ebejer wrote: Dear all, I

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-18 Thread Greg Landrum
Hi JP, On Thu, Jun 17, 2021 at 8:37 PM JP Ebejer wrote: > I am trying to standardize(/normalize?) some molecules from different sources, to generate a set of descriptors for them. I have done this a number of times, and each time I find the process slightly confusing. I have the follow

Re: [Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-17 Thread Matthew Robinson
Hi JP, Lots of good questions, and it is quite an involved topic. I'll let others who are more knowledgeable of the background answer questions on the history and relationship between the tools. One resource that may be helpful is the https://github.com/chembl/ChEMBL_Structure_Pipeline repo, whi

[Rdkit-discuss] RDKit molecule standardization/normalization protocol

2021-06-17 Thread JP Ebejer
Dear all, I am trying to standardize(/normalize?) some molecules from different sources, to generate a set of descriptors for them. I have done this a number of times, and each time I find the process slightly confusing. I have the following questions please, if you don't mind: 1. What is the