Hi Greg,

> The RDKit doesn't normally convert data field values into floats unless you explicitly ask it to

I did notice that mol.GetProp() always returns values as strings, and that you need to use mol.GetDoubleProp() if you explicitly want a numeric value, but it looks like mol.GetPropsAsDict() automatically converts to integers/floating point as appropriate. I guess I was wondering if there was a way to get GetPropsAsDict() to be more forgiving about the locale (and/or to make GetDoubleProp() more robust, so that it doesn't raise an exception).

But if I need to handle the locale re-parsing on my own, I can probably knock something together to do that. Luckily the CTAB sections in my files all use the same C locale, so I don't have to worry about that headache.

Thanks,
Rocco

On Fri, Sep 30, 2022 at 9:21 AM Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Rocco,
>
> Paolo already replied about the options available for python when
> interpreting the data fields from an SDF. The RDKit doesn't normally
> convert data field values into floats unless you explicitly ask it to, so
> this would be fine to do from Python
>
> The CTAB part of the SDF, which includes the coordinates, always parses
> the coordinates using the C locale (regardless of what the current locale
> on the machine is)... this is more or less part of the CTAB spec from MDL.
>
> -greg
>
>
> On Thu, Sep 29, 2022 at 8:16 PM Rocco Moretti <rmoretti...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I have a number of SDFs of molecules with associated data blocks. (That
>> is, the `>` section that comes after `M END` and before `$$$$`.)
>>
>> The problem I have is that these SDFs were generated in different
>> countries, and have different locales -- most notably, some of them use "."
>> as the decimal separator for real-valued properties and some use ",". To
>> make things even more fun, some use a mix of both, depending on who
>> calculated which properties where.
>>
>> Is there any facility in RDKit for reading in such locale-varying SDF
>> files and normalizing them?
>>
>> Thanks,
>> Rocco
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
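
For the do-it-yourself locale re-parsing mentioned in the thread, here is a minimal sketch in plain Python. It is not an RDKit facility: the helper names `parse_locale_number` and `normalize_props` are hypothetical, and it assumes the property values have been read as strings (for example via mol.GetProp()) and that none of them use thousands separators, so that "," and "." can both be treated as a decimal separator.

```python
import re

# Matches an optionally signed number whose decimal separator is "." or ",",
# with an optional exponent. Values with more than one separator
# (e.g. "1,234.5") deliberately do not match.
_NUMERIC = re.compile(r"^[+-]?(\d+([.,]\d*)?|[.,]\d+)([eE][+-]?\d+)?$")


def parse_locale_number(text):
    """Return an int or float for a numeric string, or None if it is not numeric.

    Assumption: no thousands separators, so "," always means a decimal point.
    """
    s = text.strip()
    if not _NUMERIC.match(s):
        return None
    s = s.replace(",", ".")
    try:
        return int(s)
    except ValueError:
        return float(s)


def normalize_props(props):
    """Convert numeric-looking string values in a property dict to int/float.

    Non-string values and strings that do not look numeric are kept unchanged.
    """
    out = {}
    for name, value in props.items():
        parsed = parse_locale_number(value) if isinstance(value, str) else None
        out[name] = value if parsed is None else parsed
    return out


if __name__ == "__main__":
    raw = {"logP": "2,5", "MolWt": "180.16", "HBD": "1", "Name": "aspirin"}
    print(normalize_props(raw))
```

With RDKit, the input dict could be built from the string-valued properties, e.g. {name: mol.GetProp(name) for name in mol.GetPropNames()}, so that no automatic numeric conversion happens before the separators are normalized. Ambiguous values such as "1,234" are read as 1.234 under the stated assumption, so files that do use thousands separators would need a different rule.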