On Sat, Nov 25, 2023 at 11:18 AM Jeppe Dakin <jeppe_da...@hotmail.com>
wrote:

> Double-precision numbers need at most 17 significant decimal digits to be
> serialised losslessly. Yet, savetxt() uses 19 by default, meaning that most
> files produced with savetxt() take up about 9% more disk space than they
> need to, without any benefit. I have described the problem in more detail on
> Stack Overflow:
>
> https://stackoverflow.com/questions/77535380/minimum-number-of-digits-for-exact-double-precision-and-the-fmt-18e-of-numpy
>
> Is there any reason behind the default choice of savetxt(...,
> fmt='%.18e')? If not, why not reduce it to savetxt(..., fmt='%.16e')?
>

A long time ago when `savetxt()` was written, we did not use the reliable
Dragon4 string representation algorithm that guarantees that floating point
numbers are written out with the minimum number of decimal digits needed to
reproduce the number. We may even have relied on the platform's floating
point to string conversion routines, which were of variable quality. The
extra digits accounted for that unreliability.
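
To make the digit counts concrete, here is a small sketch in plain Python
(not NumPy internals). It uses CPython's `repr()` as a stand-in for a
shortest-round-trip representation, and `%`-formatting for the two fixed
precisions under discussion. The variable names are illustrative only.

```python
x = 0.1

shortest = repr(x)        # shortest string that parses back to x
seventeen = "%.16e" % x   # 17 significant digits
nineteen = "%.18e" % x    # 19 significant digits, the savetxt() default

for s in (shortest, seventeen, nineteen):
    # Report each string, its length, and whether it parses back to x exactly.
    print(f"{s:>26}  len={len(s):2d}  round-trips={float(s) == x}")
```

All three strings parse back to the same double; the two extra digits of
`%.18e` only add length.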

It probably could be changed now, but I'd want more aggressive testing of
the assertion of correctness (`random()`, as used in that StackOverflow
demonstration, does *not* exercise a lot of the important edge cases in the
floating point format). But if your true concern is that 9% of disk space,
you probably don't want to be using `savetxt()` in any case.
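
As a rough sketch of what more aggressive testing might look like (this is
not NumPy's test suite, and the helper names `bits`, `roundtrips` and
`random_double` are made up for illustration), one could combine hand-picked
edge cases with doubles drawn from random 64-bit patterns, which cover
subnormals and the full exponent range rather than just the interval [0, 1).
It assumes the platform's float/string conversions are correctly rounded, as
they are in standard CPython builds.

```python
import math
import random
import struct
import sys


def bits(x):
    """Raw IEEE 754 bytes of a double, so that -0.0 and 0.0 compare as different."""
    return struct.pack("<d", x)


def roundtrips(x, fmt="%.16e"):
    """True if formatting x with fmt and parsing it back yields the same bits."""
    return bits(float(fmt % x)) == bits(x)


def random_double(rng):
    """Draw a finite double from a uniformly random 64-bit pattern."""
    while True:
        (x,) = struct.unpack("<d", rng.getrandbits(64).to_bytes(8, "little"))
        if math.isfinite(x):
            return x


eps = sys.float_info.epsilon
edge_cases = [
    0.0, -0.0,
    5e-324,                        # smallest subnormal
    sys.float_info.min - 5e-324,   # largest subnormal
    sys.float_info.min,            # smallest normal
    sys.float_info.max,            # largest finite double
    1.0, 1.0 + eps, 1.0 - eps / 2, # neighbours of a power of two
    0.1, 1.0 / 3.0, 2.0 ** 53, 2.0 ** 53 - 1.0,
]

rng = random.Random(12345)  # fixed seed so that runs are repeatable
samples = edge_cases + [random_double(rng) for _ in range(100_000)]

# Collect every value that fails to survive the 17-digit round trip.
failures = [x for x in samples if not roundtrips(x)]
print(f"{len(failures)} failures out of {len(samples)} values")
```

A clean run here is evidence rather than proof; it only exercises Python's
own conversion routines, not whatever reads the file back on the other end.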

-- 
Robert Kern
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org