Charles R Harris wrote:
>
> On Sat, Dec 27, 2008 at 11:46 PM, Robert Kern <robert.k...@gmail.com> wrote:
>
>     On Sun, Dec 28, 2008 at 01:38, Charles R Harris
>     <charlesr.har...@gmail.com> wrote:
>     >
>     > On Sat, Dec 27, 2008 at 10:27 PM, David Cournapeau
>     > <da...@ar.media.kyoto-u.ac.jp> wrote:
>     >>
>     >> Hi,
>     >>
>     >> While looking at the last failures of numpy trunk on windows for
>     >> python 2.5 and 2.6, I got into floating point number formatting
>     >> issues; I got deeper and deeper, and now I am lost. We have several
>     >> problems:
>     >>   - we are not consistent between platforms, nor are we consistent
>     >>     with python
>     >>   - str(np.float32(a)) is locale dependent, but python str method is
>     >>     not (locale.str is)
>     >>   - formatting of long double does not work on windows because of
>     >>     the broken long double support in mingw.
>     >>
>     >> 1 consistency problem:
>     >> ----------------------
>     >>
>     >>     python -c "a = 1e20; print a" -> 1e+020
>     >>     python26 -c "a = 1e20; print a" -> 1e+20
>     >>
>     >> In numpy, we use PyOS_snprintf for formatting, but python itself uses
>     >> PyOS_ascii_formatd - which has different behavior on different
>     >> versions of python.
>     >> The above behavior can be simply reproduced in C:
>     >>
>     >>     #include <Python.h>
>     >>
>     >>     int main()
>     >>     {
>     >>         double x = 1e20;
>     >>         char c[200];
>     >>
>     >>         PyOS_ascii_formatd(c, sizeof(c), "%.12g", x);
>     >>         printf("%s\n", c);
>     >>         printf("%g\n", x);
>     >>
>     >>         return 0;
>     >>     }
>     >>
>     >> On 2.5, this will print:
>     >>
>     >>     1e+020
>     >>     1e+020
>     >>
>     >> But on 2.6, this will print:
>     >>
>     >>     1e+20
>     >>     1e+020
>     >>
>     >> 2 locale dependency:
>     >> --------------------
>     >>
>     >> Another issue is that our own formatting is locale dependent,
>     >> whereas python's isn't:
>     >>
>     >>     import numpy as np
>     >>     import locale
>     >>     locale.setlocale(locale.LC_NUMERIC, 'fr_FR')
>     >>     a = 1.2
>     >>
>     >>     print "str(a)", str(a)
>     >>     print "locale.str(a)", locale.str(a)
>     >>     print "str(np.float32(a))", str(np.float32(a))
>     >>     print "locale.str(np.float32(a))", locale.str(np.float32(a))
>     >>
>     >> Returns:
>     >>
>     >>     str(a) 1.2
>     >>     locale.str(a) 1,2
>     >>     str(np.float32(a)) 1,2
>     >>     locale.str(np.float32(a)) 1,20000004768
>     >>
>     >> I thought about copying the way python does the formatting in the
>     >> trunk (where discrepancies between platforms have been fixed), but
>     >> this is not so easy, because it uses a lot of code from different
>     >> places - and the code needs to be adapted to float and long double.
>     >> The other solution would be to do our own formatting, but this does
>     >> not sound easy: formatting in C is hard. I am not sure about what we
>     >> should do, if anyone else has any idea ?
>     >
>     > I think the first thing to do is make a decision on locale. If we
>     > choose to support locales I don't see much choice but to depend on
>     > Python because it's too much work otherwise, and work not directly
>     > related to Numpy at that. If we decide not to support locales then
>     > we can do our own formatting if we need to using a fixed choice of
>     > locale. There is a list of snprintf implementations here.
>     > Trio looks like a mature project and has an MIT license, which I
>     > think is a license compatible with Numpy.
>
>     We should not support locales. The string representations of these
>     elements should be Python-parseable.
>
>     > I'm inclined to just fix the locale and ignore the rest until
>     > Python gets things sorted out. But I'm lazy...
>
>     What do you think Python doesn't have sorted out?
>
> Consistency between versions and platforms. David's note with the
> ticket points to a Python 3.0 bug on this reported about, oh, two
> years ago.

As an example: in python 2.6, they solved some issues like inf/nan by
interpreting the strings in python before outputting them, but we do not
use their fix. So we have:

    python -c "import numpy as np; print np.log(0)"
    -> -inf (python 2.6) / -1.#INF (2.5, which is the format from the
       MS runtime)

But:

    python -c "import numpy as np; print np.log(0).astype(np.float32)"
    -> -1.#INF (both 2.6 and 2.5)

Etc... We can't be consistent with ourselves and with python at the same
time, I think. I don't know which one is best: numpy being consistent
across platforms and python versions, or being consistent with python.

> There is also the problem of long doubles on the windows platform,
> which isn't Python specific since Python doesn't use long doubles. As
> I understand long doubles on windows, mingw32 supports them, VS
> doesn't, so there is a compiler inconsistency to deal with also.

To be exact, both mingw and VS support long double sensu stricto: the
long double type is available. But sizeof(long double) ==
sizeof(double) with the VS toolchain, and sizeof(long double) is 12 with
mingw. The latter is a pain, because mingw uses both the MS runtime
(printf) and its own functions (some math funcs), so we can't easily be
consistent (either 8 or 12 bytes long double) with mingw. One solution
would be to use the mingwex printf (a printf reimplementation available
in recent mingwrt) instead of the MSVC runtime - I would hope that this
one is fixed wrt long double. This problem is even worse on 64 bits
(long doubles are 16 bytes by default there with mingw).

cheers,

David
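
For illustration only, here is a rough sketch of the "interpret the
string before outputting it" idea discussed above, written in Python
rather than C. The helper name normalize_float_repr and its
decimal_point parameter are hypothetical (they exist neither in numpy
nor in Python); the sketch assumes the input is the raw text produced by
the platform printf, and it only covers the three discrepancies
mentioned in this thread: MS-runtime spellings of inf/nan, a
locale-specific decimal separator, and three-digit exponents.

```python
import re

# Hypothetical helper, not part of numpy or Python: post-process the text
# produced by a C runtime printf so that it reads the same everywhere.

# Spellings used by the MS runtime for non-finite values.
_MS_SPECIALS = {
    "1.#INF": "inf",
    "-1.#INF": "-inf",
    "1.#QNAN": "nan",
    "-1.#QNAN": "nan",
    "1.#IND": "nan",
    "-1.#IND": "nan",
}

# Trailing exponent such as e+020 or E-007 (zero-padded to three digits).
_EXPONENT = re.compile(r"[eE]([+-])0*(\d+)$")


def normalize_float_repr(text, decimal_point="."):
    """Return a locale- and platform-independent spelling of `text`."""
    text = text.strip()
    if text in _MS_SPECIALS:
        return _MS_SPECIALS[text]
    # Undo a locale-specific decimal separator such as ',' in fr_FR.
    if decimal_point != ".":
        text = text.replace(decimal_point, ".")

    # Keep at least two exponent digits, dropping extra zero padding.
    def fix_exponent(match):
        sign, digits = match.group(1), match.group(2)
        return "e%s%s" % (sign, digits.zfill(2))

    return _EXPONENT.sub(fix_exponent, text)


print(normalize_float_repr("1e+020"))
print(normalize_float_repr("-1.#INF"))
print(normalize_float_repr("1,2", decimal_point=","))
```

Doing this in numpy itself would of course have to happen in C, for
float, double and long double alike, which is where the real work lies;
the sketch only shows what the normalization rules amount to.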