Hi Rocco,
Python's locale module lets you do this sort of normalization on
strings, e.g.:
>>> import locale
>>> locale.getlocale()
('en_US', 'UTF-8')
>>> locale.setlocale(locale.LC_ALL, "it_IT")
'it_IT'
>>> locale.delocalize("1,222")
'1.222'
But this requires you to know the locale the values were originally encoded in.
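
One caveat: locale.setlocale() changes the setting for the whole process, and it raises locale.Error if the requested locale (here it_IT) is not installed on the machine. If that is a problem, a locale-free variant is possible. The following is only a minimal sketch, assuming you already know which decimal (and optional thousands) separator each file or property uses; normalize_decimal is a hypothetical helper name, not something provided by RDKit or the standard library.

```python
def normalize_decimal(value, decimal_sep, thousands_sep=""):
    """Return `value` rewritten with '.' as the decimal separator.

    Hypothetical helper: the caller must supply the separators that
    were used when the value was written; nothing is guessed here.
    """
    text = value.strip()
    # Drop grouping characters first, so they cannot be confused
    # with the decimal separator in the next step.
    if thousands_sep:
        text = text.replace(thousands_sep, "")
    text = text.replace(decimal_sep, ".")
    # Sanity check: raises ValueError if the result is not a number.
    float(text)
    return text


print(normalize_decimal("1,222", decimal_sep=","))                       # 1.222
print(normalize_decimal("1.222,5", decimal_sep=",", thousands_sep="."))  # 1222.5
```

This does not solve the mixed case either: a bare "1,222" is ambiguous unless you know who wrote it.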
HTH, cheers
p.
On Thu, Sep 29, 2022 at 8:16 PM Rocco Moretti <[email protected]> wrote:
> Hello,
>
> I have a number of SDFs of molecules with associated data blocks. (That
> is, the `>` section that comes after `M END` and before `$$$$`.)
>
> The problem I have is that these SDFs were generated in different
> countries, and have different locales -- most notably, some of them use "."
> as the decimal separator for real-valued properties and some use ",". To
> make things even more fun, some use a mix of both, depending on who
> calculated which properties where.
>
> Is there any facility in RDKit for reading in such locale-varying SDF
> files and normalizing them?
>
> Thanks,
> Rocco
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>