[issue1580] Use shorter float repr when possible

2014-03-07 Thread Eric Snow

Changes by Eric Snow ericsnowcurren...@gmail.com:


--
nosy: +eric.snow

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2010-09-30 Thread Ole Laursen

Ole Laursen o...@iola.dk added the comment:

Just came across this bug, I don't want to reopen this or anything, but 
regarding the SSE2 code I couldn't help thinking that why can't you just detect 
the presence of SSE2 when the interpreter starts up and then switch 
implementations based on that? I think that's what liboil does 
(http://liboil.freedesktop.org/wiki/).

--
nosy: +olau

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-19 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

[Raymond]
 Is there a way to use SSE when available and x86 when it's not.

I guess it's possible in theory, but I don't know of any way to do this in 
practice.  I suppose one could trap the SIGILL generated by the attempted 
use of an SSE2 instruction on a non-supported platform---is this how 
things used to work for 386s without the 387?  That would make a sequence 
of floating-point instructions on non-SSE2 x86 horribly slow, though.

Antoine: as Raymond said, the advantage of SSE2 for numeric work is 
accuracy, predictability, and consistency across platforms.  The SSE2 
instructions finally put an end to all the problems arising from the 
mismatch between the precision of the x87 floating-point registers (64-
bits) and the precision of a C double (53-bits).  Those difficulties 
include (1) unpredictable rounding of intermediate values from 64-bit 
precision to 53-bit precision, due to spilling of temporaries from FPU 
registers to memory, and (2) double-rounding.  The arithmetic of Python 
itself is largely immune to the former, but not the latter.  (And of 
course the register spilling still causes headaches for various bits of 
CPython).

Those difficulties can be *mostly* dealt with by setting the x87 rounding 
precision to double (instead of extended), though this doesn't fix the 
exponent range, so one still ends up with double-rounding on underflow.  
The catch is that one can't mess with the x87 state globally, as various 
library functions (especially libm functions) might depend on it being in 
whatever the OS considers to be the default state.

There's a very nice paper by David Monniaux that covers all this:  
definitely recommended reading after you've finished Goldberg's What 
Every Computer Scientist  It can currently be found at:

http://hal.archives-ouvertes.fr/hal-00128124/en/

An example: in Python (any version), try this:

 1e16 + 2.
10002.0

On OS X, Windows and FreeBSD you'll get the answer above.
(OS X gcc uses SSE2 by default; Windows and FreeBSD both
make the default x87 rounding-precision 53 bits).

On 32-bit Linux/x86 or Solaris/x86 you'll likely get the answer

10004.0

instead, because Linux doesn't (usually?) change the Intel default
rounding precision of 64-bits.  Using SSE2 instead of the x87 would have 
fixed this.

/standard x87 rant

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-18 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Hello folks,

IIUC, autoconf tries to enable SSE2 by default without asking. Isn't it
a problem for people distributing Python binaries (e.g. Linux vendors)
and expecting these binaries to work on legacy systems even though the
system on which the binaries were compiled supports SSE2?

Or am I misunderstanding what the changes are?

--
nosy: +pitrou

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-18 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Yes, I think you're right.

Perhaps the SSE2 support should be turned into an --enable-sse2 configure 
option, that's disabled by default?  One problem with this is that I don't 
know how to enable SSE2 instructions for compilers other than gcc, so the 
option would only apply to gcc-based systems.

Disabling SSE2 would just mean that all x86/gcc systems would end up using 
the inline assembler to get and set the control word;  so we'd still get 
short float repr everywhere we did before.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-18 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Perhaps better to drop the SSE2 bits completely.  Anybody who
actually wants SSE2 instructions in their binary can do a

CC=gcc -msse2 -mfpmath=sse configure  ...

Unless there are objections, I'll drop everything involving SSE2 from 
the configure script.

It's a bit of a shame, though:  it's definitely desirable to be using 
the SSE2 instructions for floating-point whenever possible.  Fewer 
surprises due to double rounding or random 80-bit register spilling, 
better performance on underflow and NaNs, ...

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-18 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

SSE2 detection and flags removed in r71723.  We'll see how the buildbots 
fare...

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-18 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 Is there a way to use SSE when available and x86 when it's not.

Probably, but I don't think there is any point doing so. The main
benefit of SSE2 is to get higher performance on floating point intensive
code, which no pure Python code could be regarded as (the CPU cycles
elapsed between two FP instructions would dwarf the cost of executing
the FP instructions themselves).

The situation is different for specialized packages like numpy, but
their decisions need not be the same as ours AFAIK.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-18 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

The advantage is accuracy.  No double rounding.  This will also help the
math.fsum() function that is also susceptible to double rounding.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-17 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Closing this.  There are a few known problems remaining, but they've all 
got their own issue numbers:  see issue 5780, issue 4482.

--
resolution:  - accepted
stage:  - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-16 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

The py3k-short-float-repr branch has been merged to py3k in two parts:

r71663 is mostly concerned with the inclusion of David Gay's code into the 
core, and the necessary floating-point fixups to allow Gay's code to be 
used (SSE2 detection, x87 control word manipulation, etc.)

r71665 contains Eric's *mammoth* rewrite and upgrade of the all the float 
formatting code to use the new _Py_dg_dtoa and _Py_dg_strtod functions.

Note: the new code doesn't give short float repr on *all* platforms, 
though it's close.  The sticking point is that Gay's code needs 53-bit 
rounding precision, and the x87 FPU uses 64-bit rounding precision by 
default (though some operating systems---e.g., FreeBSD, Windows---change 
that default).  For the record, here's the strategy that I used:  feedback 
(esp. from experts) would be appreciated.

- If the float format is not IEEE 754, don't use Gay's code.

- Otherwise, if we're not on x86, we're probably fine.
  (Historically, there are other FPUs that have similar
  problems---e.g., the Motorola 68881/2 FPUs---but I think they're
  all too old to be worth worrying about by now.)

- x86-64 in 64-bit mode is also fine: there, SSE2 is the default.
  x86-64 in 32-bit mode (e.g., 32-bit Linux on Core 2 Duo) has
  the same problems as plain x86.  (OS X is fine: its gcc has
  SSE2 enabled by default even for 32-bit mode.)

- Windows/x86 appears to set rounding precision to 53-bits by default,
  so we're okay there too.

So:

- On gcc/x86, detect the availability of SSE2 (by examining
  the result of the cpuid instruction) and add the appropriate
  flags (-msse2 -mfpmath=sse2) to BASECFLAGS if SSE2 is available.

- On gcc/x86, if SSE2 is *not* available, so that we're using the
  x87 FPU, use inline assembler to set the rounding precision
  (and rounding mode) before calling Gay's code, and restore
  the FPU state directly afterwards.   Use of inline assembler is pretty
  horrible, but it seems to be *more* portable than any of the
  alternatives.  The official C99 way is to use fegetenv/fesetenv to get
  and set the floating-point environment, but the fenv_t type used to
  store the environment can (and will) vary from platform to platform.

- There's an autoconf test for double-rounding.  If there's no
  evidence of double rounding then it's likely to be safe to use
  Gay's code: double rounding is an almost unavoidable symptom of
  64-bit precision on x87.

So on non-Windows x86 platforms that *aren't* using gcc and *do* exhibit
double rounding (implying that they're not using SSE2, and that the OS 
doesn't change the FPU rounding precision to 53 bits), we're out of luck.  
In this case the old long float repr is used.

The most prominent platform that I can think of that's affected by this 
would be something like Solaris/x86 with Sun's own compiler, or more 
generally any Unix/x86 combination where the compiler isn't gcc.  Those 
platforms need to be dealt with on a case-by-case basis, by figuring out 
for each such platform how to detect and use SSE2, and how to get and set 
the x87 control word if SSE2 instructions aren't available.

Note that if any of the above heuristics is wrong and we end up using 
Gay's code inappropriately, then there will be *loud* failure:  we'll know 
about it.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:

My changes on the py3k-short-float-repr branch include:

- Create a new function PyOS_double_to_string. This will replace
PyOS_ascii_formatd. All existing internal uses of PyOS_ascii_formatd
follow this pattern: printf into a buffer to build up the format string,
call PyOS_ascii_formatd, then parse the format string to determine what
to do. The changes to the API are:
  - discrete parameters instead of a printf-like format string. No
parsing is required.
  - returns PyMem_Malloc'd memory instead of writing into a supplied
buffer.
- Modify all callers of PyOS_ascii_formatd to call PyOS_double_to_string
instead. These are:
  - Modules/_pickle.c
  - Objects/complexobject.c
  - Objects/floatobject.c
  - Objects/stringlib/formatter.h
  - Objects/unicodeobject.c
  - Python/marshal.c

Remaining to be done:
- Re-enable 'n' (locale-aware) formatting in float.__format__().
- Decide what to do about the change in meaning of alt formatting with
format code 'g' when an exponent is present, also in float.__format__().
- Have marshal.c deal with potential memory failures returned from
PyOS_double_to_string.
- Deprecate PyOS_ascii_formatd.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Changing target Python versions.

I'll upload a patchset to Rietveld sometime soon (later today, I hope).

--
versions: +Python 3.1 -Python 2.6, Python 3.0

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

I've uploaded the current version to Rietveld:

http://codereview.appspot.com/33084/show

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

The Rietveld patch set doesn't show the three new files, which are:

  Python/dtoa.c
  Include/dtoa.h
  Lib/test/formatfloat_testcases.txt

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

So work on the py3k-short-float-repr branch is nearing completion, and
we (Eric and I) would like to get approval for merging these changes
into the py3k branch before this month's beta.

A proposal: I propose that the short float representation should be
considered an implementation detail for CPython, not a requirement for
Python the language.  This leaves Jython and IronPython and friends free
to do as they wish.  All that should be required for Python itself is
that float(repr(x)) == x for (non-nan, non-infinite) floats x.  Does
this seem reasonable to people following this issue?  If it's
controversial I'll take it to python-dev.

Eric's summarized his changes above.  Here are mine (mostly---some
of this has bits of Eric's work in too):

 - change repr(my_float) to produce output with minimal number of
digits, on IEEE 754 big- and little-endian systems (which includes all
systems we can actually get our hands on right now).

 - there's a new file Python/dtoa.c (and a corresponding file
Include/dtoa.h) with a cut-down version of Gay's dtoa and strtod in it.
There are comments at the top of that file indicating the changes that
have been made from Gay's original code.

 - one minor semantic change:  repr(x) changes to exponential format at
1e16 instead of 1e17.  This avoids a strange situation where Gay's code
produces a 16-digit 'shortest' integer string for a particular float and
Python's formatting code then pads with an incorrect '0':

 x = 2e16+8.  # 2e16+8. is exactly representable as a float
 x
20010.0

There's no way that this padding with bogus digits can happen for
numbers less than 1e16.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Those three missing files have now been added to Rietveld.

Just for reference, in case anyone else encounters this: the reason those 
files were missing from the initial upload was that after I svn merge'd 
from py3k-short-float-repr to py3k, those files were being treated (by 
svn) as *copies* from py3k-short-float-repr, rather than new files, so svn 
diff didn't show any output for them.  Doing:

svn revert Python/dtoa.c
svn add Python/dtoa.c

(and similarly for the other two files) fixed this.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

On Tue, Apr 7, 2009 at 3:10 AM, Mark Dickinson rep...@bugs.python.org wrote:
 A proposal: I propose that the short float representation should be
 considered an implementation detail for CPython, not a requirement for
 Python the language.  This leaves Jython and IronPython and friends free
 to do as they wish.

In principle that's fine with me.

 All that should be required for Python itself is
 that float(repr(x)) == x for (non-nan, non-infinite) floats x.  Does
 this seem reasonable to people following this issue?  If it's
 controversial I'll take it to python-dev.

Historically, we've had a stronger requirement: if you print repr(x)
and ship that string to a different machine, float() of that string
returns the same value, assuming both systems use the same internal FP
representation (e.g. IEEE). Earlier in this bug (or elsewhere) Tim
Peters has pointed out that on some platforms, in particular Windows,
input conversion doesn't always round correctly and only works
correctly when the input has at least 17 digits (in which case you
apparently don't need to round, and truncation works just as well --
that's all the C standard requires, I believe).

Now that pickle and marshal no longer use repr() for floats I think
this is less important, but still at least worth considering. I think
by having our own input function we solve this if both hosts run
CPython, but it might break for other implementations.

In order to make progress I recommend that we just not this and don't
use it to hold up the implementation, but I think it's worth noticing
in docs somewhere.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 Historically, we've had a stronger requirement: if you print repr(x)
 and ship that string to a different machine, float() of that string
 returns the same value, assuming both systems use the same internal FP
 representation (e.g. IEEE).

Hmm.  With the py3k-short-float-repr stuff, we should be okay moving 
things between different CPython boxes, since float() would use Gay's code 
and so would be correctly rounded for any length input, not just input 
with 17 significant digits.

But for a CPython-generated short repr to give the correct value on some 
other implementation, that implementation would also have to have a 
correctly rounded string-float.

For safety's sake, I'll make sure that marshal (version 1) and pickle 
(protocol 0) are still outputting the full 17 digits.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Jared Grubb

Jared Grubb pyt...@jaredgrubb.com added the comment:

I think ANY attempt to rely on eval(repr(x))==x is asking for trouble,
and it should probably be removed from the docs.

Example: The following C code can vary *even* on a IEEE 754 platform,
even in two places in the same source file (so same compile options),
even in the same 5 lines of code recompiled after changing code that
does even not touch/read 'x' or 'y':

double x, y;
x = 3.0/7.0;
y = x;
/* ... code that doesnt touch/read x or y ... */
printf( x==y: %s, (x==y) ? true : false);

So, how can we hope that eval(repr(x))==x is EVER stable? Equality and
floating point should raise the hairs on the back of everyone's neck...

(Code above based on
http://docs.sun.com/source/806-3568/ncg_goldberg.html in section
Current IEEE 754 Implementations, along with a great explanation on
why this is true. The code example is a little different, but the same
concept applies.)

--
nosy: +jaredgrubb

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 I think ANY attempt to rely on eval(repr(x))==x is asking for trouble,
 and it should probably be removed from the docs.

I disagree.  I've read the paper you refer to; nevertheless, it's still 
perfectly possible to guarantee eval(repr(x)) == x in spite of the 
various floating-point details one has to deal with.   In the worst 
case, assuming that C doubles are IEEE 754 binary64 format, for double -
 string conversion one can simply cast the double to an array of 8 
bytes, extract the sign, exponent and mantissa from those bytes, and 
then compute the appropriate digit string using nothing but integer 
arithmetic;  similarly for the reverse operation: take a digit string, 
do some integer arithmetic to figure out what the exponent, mantissa and 
sign should be, and pack those values back into the double.  Hey presto:  
correctly rounded string - double and double - string conversions!  
Add a whole bag of tricks to speed things up in common cases and you end 
up with something like David Gay's code.

It *is* true that the correctness of Gay's code depends on the FPU being 
set to use 53-bit precision, something that isn't true by default on x87 
FPUs.  But this isn't hard to deal with, either:  first, you make sure 
that you're not using the x87 FPU unless you really have to (all Intel 
machines that are newer than Pentium 3 have the SSE2 instruction set, 
which doesn't have the problems that x87 does);  second, if you *really* 
have to use x87 then you figure out how to set its control word to get 
53-bit precision and round-half-to-even rounding;  third, if you can't 
figure out how to set the control word then you use a fallback something 
like that described above.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-04-07 Thread Jared Grubb

Jared Grubb pyt...@jaredgrubb.com added the comment:

The process that you describe in msg85741 is a way of ensuring
memcmp(x, y, sizeof(x))==0, and it's portable and safe and is the
Right Thing that we all want and expect. But that's not x==y, as that
Sun paper explains. It's close, but not technically accurate, as the
implication arrow only goes one way (just as x=1/y implies xy=1 in
algebra, but not the other way around)

I'd be interested to see if you could say that the Python object
model/bytecode interpretation enforces a certain quauntum of operations
that actually does imply eval(repr(x))==x; but I suspect it's
unprovable, and it's fragile as Python grows to have more support in
CLI/LLVM/JVM backends. 

My pedantic mind would strip any and all references to floating-point
equality out of the docs, as it's dangerous and insidiously misleading,
even in obvious cases. But, I'll stop now :) (FYI: I've enjoyed the
~100 messages here.. Great stuff!)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-30 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
nosy: +eric.smith

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-30 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Eric and I have set up a branch of py3k for work on this issue.  URL for 
(read-only) checkout is:

http://svn.python.org/projects/python/branches/py3k-short-float-repr

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-05 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

Would it be acceptable to use shorter float repr only on big-endian and 
little-endian IEEE 754 platforms, and use the full 17-digit repr on other 
platforms?  This would greatly simplify the adaptation and testing of 
Gay's code.

Notable platforms that would still have long repr under this scheme, in
order of decreasing significance:

  - IBM zSeries mainframes when using their hexadecimal FPU (recent
models also have IEEE 754 floating-point, and Linux on zSeries
uses that in preference to the hex floating-point).
  - ARM, Old ABI (uses a mixed-endian IEEE 754 format;  a newer set
of ABIs appeared recently that use big or little-endian IEEE 754)
  - Cray SV1 (introduced 1998) and earlier Cray machines.  Gay's code
doesn't support the Cray floating-point format anyway.  I believe
that newer Crays (e.g., the X1, introduced in 2003) use IEEE
floating-point.
  - VAX machines, using VAX D and G floating-point formats.  More recent
Alpha machines support IEEE 754 floats as well.
  - many ancient machines with weird and wonderful floating-point
schemes

As far as I can determine, apart from the IBM machines described above all 
new platforms nowadays use an IEEE 754 double format.  Sometimes IEEE 
arithmetic is only emulated in software;  often there are shortcuts taken 
with respect to NaNs and underflow, but in all cases the underlying format 
is still IEEE 754, so Gay's code should work.  Gay's careful to avoid 
problems with machines that flush underflow to zero, for example.

Gay's code only actually supports 3 floating-point formats: IEEE 754 (big 
and little-endian), IBM hex format and VAX D floating-point format.  There 
are actually (at least?) two variants of D floating-point:  the version 
used on VAX, which has a 56-bit mantissa and an 8-bit exponent, and the 
Alpha version, which only has a 53-bit mantissa (but still only an 8-bit 
exponent);  Gay's code only supports the former.

I'm quite convinced that the VAX stuff in Gay's code is not worth much to 
us, so really the only question is whether it's worth keeping the code to 
support IBM hexadecimal floating-point.  Given the difficulty of testing 
this code and the (significant but not overwhelming) extra complexity it 
adds, I'd like to remove it.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-05 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

Sounds good to me.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-05 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

+1 on the fallback strategy for platforms we don't know how to handle.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-02 Thread Noam Raphael

Noam Raphael noamr...@gmail.com added the comment:

Do you mean msg58966?

I'm sorry, I still don't understand what's the problem with returning
f_15(x) if eval(f_15(x)) == x and otherwise returning f_17(x). You said
(msg69232) that you don't care if float(repr(x)) == x isn't
cross-platform. Obviously, the simple method will preserve eval(repr(x))
== x, no matter what rounding bugs are present on the platform.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-02 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

I changed my mind on the cross-platform requirement.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-01 Thread Noam Raphael

Noam Raphael noamr...@gmail.com added the comment:

I'm sorry, but it seems to me that the conclusion of the discussion in
2008 is that the algorithm should simply use the system's
binary-to-decimal routine, and if the result is like 123.456, round it
to 15 digits after the 0, check if the result evaluates to the original
value, and if so, return the rounded result. This would satisfy most
people, and has no need for complex rounding algorithms. Am I mistaken?

If I implement this, will anybody be interested?

Noam

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-03-01 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

I tried that, and it was more subtle than that in corner cases.

Another argument against it is that on Windows the system input routine
doesn't correctly round unless 17 digits of precision are given.  One of
Tim Peters's responses should have evidence of that (or perhaps it was
on a python-dev thread).

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-27 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

I'd be interested in working with Preston on adapting David Gay's code.

(I'm interested in looking at this anyway, but I'd much prefer to do it
in collaboration with someone else.)

It would be nice to get something working before the 3.1 betas, but that
seems a little optimistic.   2.7 and 3.2 are more likely targets.

Preston, are you going to be at PyCon?

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-27 Thread Skip Montanaro

Changes by Skip Montanaro s...@pobox.com:


--
nosy:  -skip.montanaro

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 The GNU library's float-string routines are based on David Gay's. 
 Therefore you can compare those to Gay's originals

Sounds reasonable.

 (which accounts for the extreme length and complexity of Gay's code).

Looking at the code, I'm actually not seeing *extreme* complexity.
There are a lot of ifdefs for things that Python probably doesn't
care about (e.g., worrying about whether the inexact and underflow
flags get set or not;  hexadecimal floats, which Python already has
its own code for; weird rounding modes, ...).  There's some pretty
standard looking bignum integer stuff (it might even be possible to
substitute PyLong arithmetic for this bit).  There's lots of boring
material for things like parsing numeric strings.  The really
interesting part of the code is probably only a few hundred lines.

I think it looks feasible to adapt Gay's code for Python.  I'm not sure
I'd want to adapt it without understanding it fully, but actually that
looks quite feasible too.

The x87 control word stuff isn't really a big problem:  it can be dealt
with.  The bit I don't know how to do here is using the autoconf
machinery to figure out whether the x87 FPU exists and is in use on a
particular machine.  (I put in an autoconf test for x87-related double
rounding recently, which should be a pretty reliable indicator, but that
still fails if someone's already set 53-bit rounding precision...).

A side-note:  for x86/x86_64, we should really be enabling SSE2 by
default whenever possible (e.g., on all machines newer than Pentium 3).

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Tim Peters

Tim Peters tim.pet...@gmail.com added the comment:

Mark, extreme complexity is relative to what's possible if you don't
care about speed; e.g., if you use only bigint operations very
straightforwardly, correct rounding amounts to a dozen lines of
obviously correct Python code.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

So is it worth trying to come up with a patch for this?  (Where this = 
making David Gay's code for strtod and dtoa usable from Python.)

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Tim Peters

Tim Peters tim.pet...@gmail.com added the comment:

Is it worth it?  To whom ;-) ?  It was discussed several times before on
various Python mailing lists, and nobody was willing to sign up for the
considerable effort required (both to update Gay's code and to fight
with shifting platform quirks ever after).

If /you/ have the time, interest, and commitment, then yes, it's a Very
Good Thing.  The only hangup has been that it's also a labor-intensive
thing, in an obscure area requiring true expertise.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

On Thu, Feb 26, 2009 at 1:01 PM, Tim Peters rep...@bugs.python.org wrote:
 Is it worth it?  To whom ;-) ?  It was discussed several times before on
 various Python mailing lists, and nobody was willing to sign up for the
 considerable effort required (both to update Gay's code and to fight
 with shifting platform quirks ever after).

 If /you/ have the time, interest, and commitment, then yes, it's a Very
 Good Thing.  The only hangup has been that it's also a labor-intensive
 thing, in an obscure area requiring true expertise.

I *think* I understood Preston as saying that he has the expertise and
is interested in doing this. He's just looking for guidance on how
we'd like the software engineering side of things to be done, and I
think he's getting good feedback here. (Preston, please butt in if I
misunderstood you.)

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Tim Peters

Tim Peters tim.pet...@gmail.com added the comment:

Huh.  I didn't see Preston volunteer to do anything here ;-)

One bit of software engineering for whoever does sign on:  nothing kills
porting a language to a new platform faster than needing to get an
obscure but core subsystem working.  So whatever is done here, to the
extent that it requires fiddling with obscure platform-specific gimmicks
for manipulating FPU state, to that extent also is it important to
retain a std C fallback (i.e., one requiring no porting effort).

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-26 Thread Preston Briggs

Preston Briggs prest...@google.com added the comment:

This all started with email to Guido that y'all didn't see,
wherein I wondered if Python was interested in such a thing.
Guido said: Sure, in principle, please see the discussion associated
with this change.

I probably don't have all the required expertise today,
but I am patient and persistent.  I wouldn't be very fast either,
'cause I have other things on my plate.  Nevertheless, it seems like
an interesting project and I'm happy to work on it.

If Mark or someone else wants to jump in, that's fine too.

Preston

Added file: http://bugs.python.org/file13199/unnamed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___This all started with email to Guido that y#39;all didn#39;t see,brwherein 
I wondered if Python was interested in such a thing.brGuido said: Sure, in 
principle, please see the discussion associatedbrwith this change.br
brI probably don#39;t have all the required expertise today,brbut I am 
patient and persistent.  I wouldn#39;t be very fast either,br#39;cause I 
have other things on my plate.  Nevertheless, it seems likebran interesting 
project and I#39;m happy to work on it.br
brIf Mark or someone else wants to jump in, that#39;s fine 
too.brbrPrestonbrbrbr
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Preston Briggs

Preston Briggs prest...@google.com added the comment:

In all this discussion, it seems that we have not discussed the
possibility of adapting David Gay's code, dtoa.c, which nicely handles
both halves of the problem.  It's also free and has been well exercised
over the years.

It's available here: http://www.netlib.org/fp/

It'd probably have to be touched up a bit.  Guido notes that, for
example, malloc is called without checking the return value. 
Nevertheless, the core is what we're looking for, I believe.

It's described in the paper: Correctly rounded binary-decimal and
decimal-binary conversions, David M. Gay

It comes recommended by Steele and White (output) and by Clinger
(input), in the retrospectives on their seminal papers.

--
nosy: +preston

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 It'd probably have to be touched up a bit.

This may be an understatement. :-)

In the first 50 lines of the 3897-line dtoa.c file, I see this warning:

/* On a machine with IEEE extended-precision registers, it is
 * necessary to specify double-precision (53-bit) rounding precision
 * before invoking strtod or dtoa.  [...] */

It seems to me that figuring out how and when to do this, and
writing and testing the appropriate configure/pyport/whatever
code, is already a nontrivial task.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Preston Briggs

Preston Briggs prest...@google.com added the comment:

 It'd probably have to be touched up a bit.

 This may be an understatement. :-)

Probably so.  Nevertheless, it's got to be easier
than approaching the problem from scratch.
And considering that this discussion has been
going on for over a year now, it might be a way to
move forward.

 In the first 50 lines of the 3897-line dtoa.c file, I see this warning:

 /* On a machine with IEEE extended-precision registers, it is
  * necessary to specify double-precision (53-bit) rounding precision
  * before invoking strtod or dtoa.  [...] */

 It seems to me that figuring out how and when to do this, and
 writing and testing the appropriate configure/pyport/whatever
 code, is already a nontrivial task.

I would consider compiling the library with flags appropriate to forcing
64-bit IEEE arithmetic if possible.  Another possibility is to explore
gdtoa.c
which is supposed to include code for extended precision, quad precision,
etc.

Added file: http://bugs.python.org/file13176/unnamed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___div class=gmail_quotegt;gt; It#39;d probably have to be touched up a 
bit.br
br
gt; This may be an understatement. :-)brbrProbably so.  Nevertheless, 
it#39;s got to be easierbrthan approaching the problem from scratch.brAnd 
considering that this discussion has beenbrgoing on for over a year now, it 
might be a way tobr
move forward.br
brgt; In the first 50 lines of the 3897-line dtoa.c file, I see this 
warning:br
gt; br
gt; /* On a machine with IEEE extended-precision registers, it isbr
gt;  * necessary to specify double-precision (53-bit) rounding precisionbr
gt;  * before invoking strtod or dtoa.  [...] */br
gt; br
gt; It seems to me that figuring out how and when to do this, andbrgt; 
writing and testing the appropriate configure/pyport/whateverbr
gt; code, is already a nontrivial task.br
brI would consider compiling the library with flags appropriate to 
forcingbr64-bit IEEE arithmetic if possible.  Another possibility is to 
explore gdtoa.cbrwhich is supposed to include code for extended precision, 
quad precision, etc.br
br/div
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Mark Dickinson

Mark Dickinson dicki...@gmail.com added the comment:

 I would consider compiling the library with flags appropriate to forcing
 64-bit IEEE arithmetic if possible.

Using the right compiler flags is only half the battle, though.  You 
should really be setting the rounding precision dynamically:  set it to 
53-bit precision before the call to strtod, and then restore it to 
whatever the original precision was afterwards.  And the magic formula for 
doing this is likely to vary from platform to platform.  As far as I can 
tell, the C99 standard doesn't suggest any standard spellings for changing 
the rounding precision.

It's not unknown for libm functions on x86 to blindly assume that the x87 
FPU is set to its default of extended-precision (64-bit) rounding, so it's 
risky to just set 53-bit rounding for the whole of Python's execution.

And contrary to popular belief, setting 53-bit rounding *still* doesn't 
give the x87 FPU IEEE 754 semantics:  there are issues at the extremes of 
the floating-point range, and I'm not 100% convinced that those issues 
wouldn't affect the dtoa.c strtod.

I'm not saying that all this is impossible;  just that it's subtle and 
would involve some work.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

Even if someone devoted the time to get possibly get this right, it
would be somewhat difficult to maintain.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Guido van Rossum

Guido van Rossum gu...@python.org added the comment:

What maintenance issues are you anticipating?

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Raymond Hettinger

Raymond Hettinger rhettin...@users.sourceforge.net added the comment:

Gay's code is 3800+ lines and includes many ifdef paths that we need to
get right.  Mark points out that the code itself needs additional work.
 The discussions so far also get into setting compiler flags on
different systems and I would imagine that that would need to be
maintained as those compilers change and as hardware gets updated.  The
main challenge for a maintainer is to be sure that any change doesn't
break something.  AFAICT, this code is very difficult to validate.

The comments is the first 180 lines of code give a good survey of the
issues a maintainer would face when given a bug report in the form of:
eval(repr(f)) != f for myhardware x compiled with mycompiler c with
myflags f and running on operating system z.  There reports would be
even more challenging for cross-platform comparisons.

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2009-02-25 Thread Tim Peters

Tim Peters tim.pet...@gmail.com added the comment:

The GNU library's float-string routines are based on David Gay's. 
Therefore you can compare those to Gay's originals to see how much
effort was required to make them mostly portable, and can look at the
history of those to get some feel for the maintenance burden.

There's no difficulty in doing this correctly except for denormals.  The
pain is almost entirely in doing it correctly /fast/ -- and that is
indeed hard (which accounts for the extreme length and complexity of
Gay's code).

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-11 Thread Mark Dickinson

Mark Dickinson [EMAIL PROTECTED] added the comment:

Mildly off-topic:  it seems that currently eval(repr(x)) == x isn't 
always true, anyway.  On OS X 10.5.4/Intel, I get:

 x = (2**52-1)*2.**(-1074)
 x
2.2250738585072009e-308
 y = eval(repr(x))
 y
2.2250738585072014e-308
 x == y
False

This is almost certainly an OS X bug...

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-11 Thread Tim Peters

Tim Peters [EMAIL PROTECTED] added the comment:

About (2**52-1)*2.**(-1074):  same outcome under Cygwin 2.5.1, which is
presumably based on David Gay's perfect rounding code.  Cool ;-)

Under the native Windows 2.5.1:

 x = (2**52-1)*2.**(-1074)
 x
2.2250738585072009e-308
 y = eval(repr(x))
 y
0.0

Hard to say whether that's actually worse :-(

About %.15g, good point!  It probably would make Guido happy.  It's
hard to know for sure, because despite what people say about this, the
real objection to the status quo is aesthetic, not technical, and
there's no disputing tastes.

My tastes agree that %.17g is a pain in the ass when using the
interactive shell -- I rarely want to see so many digits.  But %.15g,
and even %.12g, would /still/ be a pain in the ass, and for the same
reason.  Most times I'd be happiest with plain old %g (same as %.6g). 
OTOH, sometimes I do want to see all the bits, and sometimes I'd like a
fixed format (like %.2f), and so on.

Alas, all choices suck for strong reasons (either technical or aesthetic
 -- and, BTW, your lots of trailing zeroes idea won't fly because of
the latter).

The introduction of hex formats for floats should make this starkly
clear:  using hex I/O for float repr would be the ideal technical
choice:  easy to code, extremely fast, compact representation, and
highly portable.  Those are supposedly repr's goals, regardless of
whether the repr string is easily readable by humans.  But you know
without asking that using hex format for repr(float) has no chance of
being adopted -- because it's ugly :-)

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-08 Thread Mark Dickinson

Mark Dickinson [EMAIL PROTECTED] added the comment:

[Tim]

 If you think using 16 (when possible) will stop complaints, think again
 ;-)  For example, ...

Aha!  But using *15* digits would be enough to eliminate all 1, 2, 3, 4, 
..., 15 digit 'surprises', wouldn't it?!  16 digits doesn't quite work, 
essentially because 2**-53 is still a little larger than 10**-16, but 15 
digits ought to be okay.

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-08 Thread Mark Dickinson

Mark Dickinson [EMAIL PROTECTED] added the comment:

Here's the 'proof' that 15 digits should be enough:

Suppose that x is a positive (for simplicity) real number that's exactly 
representable as a decimal with = 15 digits.  We'd like to know that
'%.15g' % (nearest_float_to_x) recovers x.

There are integers k and m such that 2**(k-1) = x = 2**k, and 10**(m-1) 
 x = 10**m.  (e.g. k = ceiling(log_2(x)), m = ceiling(log_10(x)))

Let y be the closest floating-point number to x.  Then |y-x| = 2**(k-54).  
Hence |y-x| = x*2**-53  x/2 * 10**-15 = 10**(m-15)/2.  It follows that 
x is the closest 15-digit decimal to y, hence that applying '%.15g' to y 
should recover x exactly.

The key step here is in the middle inequality, which uses the fact that
2**-52  10**-15.  The above doesn't work for subnormals, but I'm guessing 
that people don't care too much about these.  The argument above also 
depends on correct rounding, but there's actually sufficient leeway here 
(that is, 2**-52 is quite substantially smaller than 10**-15) that the 
whole thing would probably work even in the absence of correct rounding.

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-08 Thread Mark Dickinson

Mark Dickinson [EMAIL PROTECTED] added the comment:

For what it's worth, I'm -0.1 (or should that be -0.10001?) on 
this change.  It seems better to leave the problems caused by binary 
floating-point out in the open than try to partially hide them, and the 
proposed change just seems a little bit too 'magic'.  The inconsistency in 
the following current behaviour *does* irk me slightly (but only 
slightly):

 0.13
0.13
 0.14
0.14001

But practical issues aside, my preference would be to go to the other 
extreme: fix this inconsistency by changing the first output to 
0.13000 rather than changing the second output to 0.14.  That 
is, have repr(float) *always* output 17 digits, possibly except when that 
float is *exactly* representable in 16 or fewer decimal digits (e.g. 3.5, 
0.125, ...).  All the zeros act as a subtle reminder that the stored value 
is not exactly 0.13.  But I suspect this, too, isn't very practical.

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-06 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Here's a fixed patch, float2.diff. (The previous one tasted of an
earlier attempt.)

Added file: http://bugs.python.org/file10840/float2.diff

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

I'd like to reopen this. I'm still in favor of something like to this
algorithm:

def float_repr(x):
  s = %.16g % x
  if float(s) != x:
s = %.17g % x
  s1 = s
  if s1.startswith('-'):
s1 = s[1:]
  if s1.isdigit():
s += '.0'  # Flag it as a float
  # XXX special case inf and nan, see floatobject.c::format_double
  return s

combined with explicitly using %.17g or the new hex notation (see issue
3008) in places where cross-platform correctness matters, like pickle
and marshal.

This will ensure float(repr(x)) == x on all platforms, but not cross
platforms -- and I don't care.

I'm fine with only doing this in 3.0.

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-03 Thread Tim Peters

Tim Peters [EMAIL PROTECTED] added the comment:

If you think using 16 (when possible) will stop complaints, think again
;-)  For example,

 for x in 0.07, 0.56:
... putatively_improved_repr = %.16g % x
... assert float(putatively_improved_repr) == x
... print putatively_improved_repr
0.07001
0.5601

(those aren't the only 2-digit examples, but I expect those specific
examples work the same under Linux and Windows).

IOW, 16 doesn't eliminate base-conversion surprises, except at the 8
single-digit surprises (i/10.0 for i in range(1, 5) + range(6, 10)).

To eliminate 2-digit surprises (like 0.07 and 0.56 above), and higher,
 in this way, you need to add a loop and keeping trying smaller
conversion widths so long as they convert back to the original input.

Keep it up, and you can make repr(float) so slow it would be faster to
do perfect rounding in Python ;-)

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-07-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

That is truly maddening! :-(

I guess Noam's proposal to return str(x) if float(str(x)) == x makes
more sense then.  I don't really care as much about 1.234567890123 vs.
1.234567890122 as I care about 1.2345 vs. 1.2344.

(This comment is supposed to *precede* the patch.)

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-06-29 Thread Mark Dickinson

Changes by Mark Dickinson [EMAIL PROTECTED]:


--
nosy: +marketdickinson

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2008-06-09 Thread Alexandre Vassalotti

Changes by Alexandre Vassalotti [EMAIL PROTECTED]:


--
nosy: +alexandre.vassalotti

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-30 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc added the comment:

 If someone has a more recent version of MS's compiler, 
 I'd be interested to know what this does:

Visual Studio 2008 Express Edition gives the same results:
['1024', '1024', '1024', '1024', '1024', '1024', '1024.0001']
(Tested release and debug builds, with trunk version and py3k branch)

--
nosy: +amaury.forgeotdarc

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-22 Thread Tim Peters

Tim Peters added the comment:

If someone has a more recent version of MS's compiler, I'd be interested
to know what this does:

inc = 2.0**-43
base = 1024.0
xs = ([base + i*inc for i in range(-4, 0)] +
  [base] +
  [base + 2*i*inc for i in (1, 2)])
print xs
print [%.16g % x for x in xs]

That creates 7 distinct doubles all of which map to 1024 when
correctly rounded to 16 significant digits.  And that's what the Cygwin
Python 2.5.1 (which presumably uses David Gay's correct-rounding
conversion routines from glibc) displays for the last line:

['1024', '1024', '1024', '1024', '1024', '1024', '1024']

The released Windows Python 2.5.1 displays this instead:

['1024', '1024', '1024', '1024', '1024', '1024', '1024.0001']

That's a pretty gross rounding error, since the exact value of the last
element is

1024.45474735088646411895751953125

and so the 16'th digit should indeed not round up to 1.  It's a pretty
gross error because the rounded-off part isn't particularly close to a
half unit in the last (16'th) place.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Noam Raphael

Noam Raphael added the comment:

I think that we can give up float(repr(x)) == x across different
platforms, since we don't guarantee something more basic: We don't
guarantee that the same program doing only floating point operations
will produce the same results across different 754 platforms, because
in the compilation process we rely on the system's decimal to binary
conversion. In other words, using the current repr(), one can pass a
value x from platform A platform B and be sure to get the same value.
But if he has a python function f, he can't be sure that f(x) on
platform A will result in the same value as f(x) on platform B. So the
cross-platform repr() doesn't really matter.

I like eval(repr(x)) == x because it means that repr(x) captures all
the information about x, not because it lets me pass x from one
platform to another. For communication, I use other methods.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Christian Heimes

Christian Heimes added the comment:

Tim Peters wrote:
 This has nothing to do with what will or won't satisfy me, either.  I'm
 happy with what Python currently does, which is to rely on #3 above. 
 That's explainable (what's hard about understanding %.17g?), and
 relies only on what the 754 standard requires.
 
 There is no easy way around this, period.  If you're willing to give up
 on float(repr(x)) == x cross-754-platform in some cases (and are willing
 to be unable to spell out /which/ cases), then many easy alternates exist.
 
 Maybe this is a missing fact:  I don't care whether float(repr(x)) == x.
  It would, for example, be fine by me if Python used a plain %.8g format
 for str and repr.  But I'm not indulging wishful thinking about the
 difficulties in ensuring float(repr(x)) == x either ;-)

Time is right.

The only way to ensure that it works on all platforms is a full and
mathematical correct proof of the algorithm and all functions used by
the algorithms. In the mean time *we* are going to release Python 40k
while *you* are still working on the proof. *scnr* :)

I like to propose a compromise. The patch is about making the
representation of a float nicer when users - newbies and ordinary people
w/o mathematical background - work on a shell or do some simple stuff.
They want a pretty and short representation of

2.2
 My value is %s % (11./5.)
'My value is 2.2'

The latter already prints 2.2 without the patch but the former prints
2.2002 on my machine. Now here comes the interesting part.
In 2.6 the shell uses PyObject_Print and the tp_print slot to print a
float. We can overwrite the str() and tp_print slot to use the new
algorithm while keeping the old algorithm for repr().

Unfortunately the tp_print slot is no longer used in Python 3.0. It
could be resurrected for this purpose (object.c:internal_print()).

str(1.) - new code
repr(1.) - old and good representation
 1. - new code

Christian

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

ISTM shorter repr's are inherently misleading and will make it more 
harder to diagnose why 1.1 * 3 != 3.3 or why round(1.0 % 0.1, 1) == 0.1 
which is *very* far-off of what you might expect. 

The 17 digit representation is useful in that it suggests where the 
problem lies.  In contrast, showing two numbers with reprs of different 
lengths will strongly suggest that the shorter one is exactly 
represented.  Currently, that is a useful suggestion, 10.25 shows as 
10.25 while 10.21 shows as 10.211 (indicating that the 
latter is not exactly represented).  If you start showing 1.1 as 1.1, 
then you've lost both benefits.

FWIW, I prefer the current 17 digit behavior, its speed, its cross-754 
round-trip guarantees, and it psychological cues about exactness.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Noam Raphael

Noam Raphael added the comment:

2007/12/18, Raymond Hettinger [EMAIL PROTECTED]:
 The 17 digit representation is useful in that it suggests where the
 problem lies.  In contrast, showing two numbers with reprs of different
 lengths will strongly suggest that the shorter one is exactly
 represented.  Currently, that is a useful suggestion, 10.25 shows as
 10.25 while 10.21 shows as 10.211 (indicating that the
 latter is not exactly represented).  If you start showing 1.1 as 1.1,
 then you've lost both benefits.

Currently, repr(1.3) == '1.3', suggesting that it is exactly
represented, which isn't true. I think that unless you use an
algorithm that will truncate zeros only if the decimal representation
is exact, the suggested algorithm is less confusing than the current
one, in that it doesn't suggest that 1.3 is exactly stored and 1.1
isn't.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Right, there are plenty of exceptions to the suggestion of exactness. 
Still, I find the current behavior to be more helpful than not 
(especially when trying to explain the examples I gave in the previous 
post).

I'm concerned that the tone of the recent discussion shows a strong 
desire to just get something checked-in despite dubious benefits and 
despite long standing invariants.

It is a bit cavalier to declare we can give up float(repr(x)) == x 
across different platforms.  Our NumPy friends might not be amused. If 
a PEP on this were circulated, I'm sure we would receive some frank and 
specific feedback.

Likewise, I find it questionable that shorter reprs are helpful. Is 
there any evidence supporting that assertion?  My arguments above 
suggest short reprs are counter-productive because they mask the cause 
of problems that might otherwise be self-evident.  Of course, it's 
possible that both could be true, short reprs may help in some cases 
(by masking oddities) and may hurt in other cases (by masking the 
underlying cause of an oddity or by making a false psychological 
suggestion).

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Noam Raphael

Noam Raphael added the comment:

About the educational problem. If someone is puzzled by 1.1*3 !=
3.3, you could always use '%50f' % 1.1 instead of repr(1.1). I don't
think that trying to teach people that floating points don't always do
what they expect them to do is a good reason to print uninteresting
and visually distracting digits when you don't have to.

About the compatibility problem: I don't see why it should matter to
the NumPy people if the repr() of some floats is made shorter. Anyway,
we can ask them, using a PEP or just the mailing list.

About the benefit: If I have data which contains floats, I'm usually
interested about their (physical) value, not about their last bits.
That's why str(f) does what it does. I like repr(x) to be one-to-one,
as I explained in the previous message, but if it can be made more
readable, why not make it so?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Guido van Rossum

Guido van Rossum added the comment:

[Tim: when I said bugs I just meant non-correct rounding. Sorry.]

On the educational issue: it's still embarrassingly easy to run into
situations where *arithmetic* using floats produces educational
results.  Simplest case I could find quickly: 0.1+0.2 != 0.3.

But I think this is separate from the more directly baffling

 0.1
0.10001
 

I propose the following set of changes:

(1) Get rid of doubledigits.c.  IMO it is evil because it uses static
global variables.

(2) Change pickle.py, cPickle.c and marshal.c to use %.17g instead of
repr.  This means that these serialization protocols are not affected by
incorrectly-rounding atof() implementations since the IEEE-754 standard
guarantees that 17 digits input will produce the correct value
regardless of rounding correctness.

(3) change repr(float) so as to use %.16g if the result roundtrips
using atof() and %.17g if not.  (*)

This means that platforms where atof() is not correctly rounding may
produce incorrect results when the input was produced by repr() -- even
when the repr() was run on a platform whose atof() rounds correctly, IIU
Tim correctly.  I'm okay with that.  It's still better than changing
repr() to always use %.16g or even %.12g, which would really turn the
educational issue into an educational nightmare, trying to explain why
two numbers both print the same but compare unequal (and God forbid
anyone proposes adding fuzz to float ==).

(*) If atof() is much slower than %.17g, we could compute %.17g and
%.16g first and if they are the same string avoid the atof() call
altogether.  We could even avoid the %.16g if the %17g step produced
at least one (suppressed) trailing zero.  OTOH if atof() is about the
same speed as %.17g or faster, it makes more sense to try %.16g and
atof() first, and if this doesn't roundtrip, fall back to %.17g.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Skip Montanaro

Skip Montanaro added the comment:

Guido ... trying to explain why two numbers both print the same but
Guido compare unequal ...

This is not a Python-specific issue.  The notion of limited precision was
pounded into our heads in the numerical analysis class I took in college,
1980-ish.  I'm pretty sure that was before both Python and IEEE-754.  I'm
pretty sure I programmed for that class in Fortran.

Guido ... (and God forbid anyone proposes adding fuzz to float ==).

Is it possible to produce a FuzzyFloat class which is a subclass of float
that people could use where they needed it?  Its constructor could take a
tolerance arg, or it could be initialized from a global default.  I'm not
suggesting that suchh a class replace float in the core, but that it might
be useful to have it available in some contexts (available via PyPI, of
course).  I'm offline at the moment or I'd poke around before asking.

Skip

--
nosy: +skip.montanaro

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-18 Thread Tim Peters

Tim Peters added the comment:

Guido, right, for that to work reliably, double-str and str-double
must both round correctly on the platform doing the repr(), and
str-double must round correctly on the platform reading the string.

It's quite easy to understand why at a high level:  a simple (but
tedious) pigeonhole argument shows that there are more representable
doubles than there are possible 16-digit decimal outputs.  Therefore
there must exist distinct doubles (say, x and y) that map to the same
16-digit decimal output (in fact, there are cases where 7 consecutive
doubles all map to the same correctly-rounded 16-digit decimal string).
 When reading that back in, there's no room for error in deciding which
of {x, y} was /intended/ -- the str-double conversion has to be
flawless in deciding which of {x, y} is closest to the 16-digit decimal
input, and in case x and y are equally close, has to apply 754's
round-to-nearest/even rule correctly.  This can require exact
multi-thousand bit arithmetic in some cases, which is why the 754
standard doesn't require it (thought to be impractically expensive at
the time).

 which would really turn the educational issue into an educational
 nightmare, trying to explain why two numbers both print the same
 but compare unequal

Well, newbies see that in, e.g., spreadsheets every day.  You generally
have to work to get a spreadsheet to show you all the digits; in some
spreadsheets it's not even possible.  Not all digits are displayed
should be within the grasp of must students without suffering nightmares ;-)

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-17 Thread Noam Raphael

Noam Raphael added the comment:

Ok, I think I have a solution!

We don't really need always the shortest decimal representation. We just
want that for most floats which have a nice decimal representation, that
representation will be used. 

Why not do something like that:

def newrepr(f):
r = str(f)
if eval(r) == f:
return r
else:
return repr(f)

Or, in more words:

1. Calculate the decimal representation of f with 17 precision digits,
s1, using the system's routines.
2. Create a new string, s2, by rounding the resulting string to 12
precision digits.
3. Convert the resulting rounded string to a new double, g, using the
system's routines.
4. If f==g, return s2. Otherwise, return s1.

It will take some more time than the current repr(), because of the
additional decimal to binary conversion, but we already said that if
speed is extremely important one can use '%f.17' % f. It will
obviously preserve the eval(repr(f)) == f property. And it will return a
short representation for almost any float that has a short representation.

This algorithm I will be glad to implement.

What do you think?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-17 Thread Guido van Rossum

Guido van Rossum added the comment:

This is what I was thinking of before, although I'd use %.16g%f and
%.17g%f instead of str(f) and repr(f), and I'd use float() instead
of eval().

I suspect that it doesn't satisfy Tim Peters though, because this may
depend on a rounding bug in the local platform's input function.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-17 Thread Tim Peters

Tim Peters added the comment:

It's not a question of bugs.  Call the machine writing the string W and
the machine reading the string R.  Then there are 4 ways R can get back
the double W started with when using the suggested algorithm:

1. W and R are the same machine.  This is the way that's most obvious.

The rest apply when W and R are different machines (by which I really
mean both use 754 doubles, but use different C libraries for
double-string conversions):

2. W and R both do correct-rounding string-double.  This is obvious too
 (although it may not be obvious that it's /not/ also necessary that W
do correct-rounding double-string).

3. W and R both conform to the 754 standard, and W did produce 17
significant digits (or fewer if trailing zeroes were chopped).  This one
is neither obvious nor non-obvious:  that it works is something the
754 standard requires.

4. W produced fewer than 17 significant digits, and W and/or R do not do
correct rounding.  Then it works, or fails, by luck, and this has
nothing to do whether the double-string algorithms have bugs.  There
are countless ways to code such routines that will fail in some cases --
doing correct rounding in all cases is the only way that won't fail in
some cases of producing fewer than 17 significant digits (and exactly
which cases depend on all the details of the specific
non-correct-rounding algorithms W and/or R implement).

This has nothing to do with what will or won't satisfy me, either.  I'm
happy with what Python currently does, which is to rely on #3 above. 
That's explainable (what's hard about understanding %.17g?), and
relies only on what the 754 standard requires.

There is no easy way around this, period.  If you're willing to give up
on float(repr(x)) == x cross-754-platform in some cases (and are willing
to be unable to spell out /which/ cases), then many easy alternates exist.

Maybe this is a missing fact:  I don't care whether float(repr(x)) == x.
 It would, for example, be fine by me if Python used a plain %.8g format
for str and repr.  But I'm not indulging wishful thinking about the
difficulties in ensuring float(repr(x)) == x either ;-)

BTW, about http://www.cant.ua.ac.be/ieeecc754.html:  passing that test
does not guarantee the platform does correct rounding.  It uses a fixed
set of test cases that were in fact obtained from running the algorithm
I invented for efficiently finding hard cases in the early 90s (and
implemented by Vern Paxson, and then picked up by the authors of the
test suite you found).  It's very likely that an implementation that
doesn't do correct rounding will fail on at least one of those cases,
but not certain.  Exhaustive testing is impossible (too many cases).

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-13 Thread Noam Raphael

Noam Raphael added the comment:

2007/12/13, Guido van Rossum [EMAIL PROTECTED]:

  Ok, so if I understand correctly, the ideal thing would be to
  implement decimal to binary conversion by ourselves. This would make
  str - float conversion do the same thing on all platforms, and would
  make repr(1.1)=='1.1'. This would also allow us to define exactly how
  floats operate, with regard to infinities and NaNs. All this is for
  IEEE-754 platforms -- for the rare platforms which don't support it,
  the current state remains.

 Does doubledigits.c not work for non-754 platforms?

No. It may be a kind of an oops, but currently it just won't compile
on platforms which it doesn't recognize, and it only recognizes 754
platforms.

  2. Keep the binary to shortest decimal routine and use it only when we
  know that the system's decimal to binary routine is correctly rounding
  (we can check - perhaps Microsoft has changed theirs?)

 Tim says you can't check (test) for this -- you have to prove it from
 source, or trust the vendor's documentation. I would have no idea
 where to find this documented.

The program for testing floating point compatibility is in
http://www.cant.ua.ac.be/ieeecc754.html

To run it, on my computer, I used:
./configure -target Conversions -platform IntelPentium_cpp
make
./IeeeCC754 -d -r n -n x Conversion/testsets/d2bconvd
less ieee.log

This tests only doubles, round to nearest, and ignores flags which
should be raised to signal inexact conversion. You can use any file in
Conversions/testsets/d2b* - I chose this one pretty randomly.

It turns out that even on my gcc 4.1.3 it finds a few floats not
correctly rounded. :(

Anyway, it can be used to test other platforms. If not by the
executable itself, we can pretty easily write a python program which
uses the test data.

I don't know what exactly the errors with gcc 4.1.3 mean - is there a
problem with the algorithm of glibc, or perhaps the testing program
didn't set some flag?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-12 Thread Noam Raphael

Noam Raphael added the comment:

Ok, so if I understand correctly, the ideal thing would be to
implement decimal to binary conversion by ourselves. This would make
str - float conversion do the same thing on all platforms, and would
make repr(1.1)=='1.1'. This would also allow us to define exactly how
floats operate, with regard to infinities and NaNs. All this is for
IEEE-754 platforms -- for the rare platforms which don't support it,
the current state remains.

However, I don't think I'm going, in the near future, to add a decimal
to binary implementation -- the Tcl code looks very nice, but it's
quite complicated and I don't want to fiddle with it right now.

If nobody is going to implement the correctly rounding decimal to
binary conversion, then I see three options:
1. Revert to previous situation
2. Keep the binary to shortest decimal routine and use it only when we
know that the system's decimal to binary routine is correctly rounding
(we can check - perhaps Microsoft has changed theirs?)
3. Keep the binary to shortest decimal routine and drop repr(f) == f
(I don't like that option).

If options 2 or 3 are chosen, we can check the 1e5 bug.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-12 Thread Guido van Rossum

Guido van Rossum added the comment:

 Ok, so if I understand correctly, the ideal thing would be to
 implement decimal to binary conversion by ourselves. This would make
 str - float conversion do the same thing on all platforms, and would
 make repr(1.1)=='1.1'. This would also allow us to define exactly how
 floats operate, with regard to infinities and NaNs. All this is for
 IEEE-754 platforms -- for the rare platforms which don't support it,
 the current state remains.

Does doubledigits.c not work for non-754 platforms?

 However, I don't think I'm going, in the near future, to add a decimal
 to binary implementation -- the Tcl code looks very nice, but it's
 quite complicated and I don't want to fiddle with it right now.

Fair enough. And the glibc code is better anyway, according to Tim.
(Although probably even more complex.)

 If nobody is going to implement the correctly rounding decimal to
 binary conversion, then I see three options:
 1. Revert to previous situation

That would be sad since we're so close.

 2. Keep the binary to shortest decimal routine and use it only when we
 know that the system's decimal to binary routine is correctly rounding
 (we can check - perhaps Microsoft has changed theirs?)

Tim says you can't check (test) for this -- you have to prove it from
source, or trust the vendor's documentation. I would have no idea
where to find this documented.

 3. Keep the binary to shortest decimal routine and drop repr(f) == f
 (I don't like that option).

It would mean that on platforms with unknown-correct rounding *input*
the *output* would be ugly again. I wouldn't mind so much if this was
only VAX or Cray; but if we had to do this on Windows it'd be a real
shame.

 If options 2 or 3 are chosen, we can check the 1e5 bug.

First you'll have to reproduce it -- I can't afford to spend much
quality time with gdb, especially since I don't understand the source
code at all. Though you may be able to pinpoint a possible candidate
based on the characteristics of the error -- it seems to have to do
with not placing the decimal point correctly when there are trailing
zeros.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

The Tcl code can be fonund here:
http://tcl.cvs.sourceforge.net/tcl/tcl/generic/tclStrToD.c?view=markup

What Tim says gives another reason for using that code - it means that
currently, the compilation of the same source code on two platforms can
result in a code which does different things.

Just to make sure - IEEE does require that operations on doubles will do
the same thing on different platforms, right?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Christian Heimes

Christian Heimes added the comment:

It's really a shame. It was a nice idea ...

Could we at least use the new formating for str(float) and the display
of floats? In Python 2.6 floats are not displayed with repr(). They seem
to use yet another hook.

 repr(11./5)
'2.2'
 11./5
2.2002
 str(11./5)
'2.2'

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

I think that for str(), the current method is better - using the new
repr() method will make str(1.1*3) == '3.3003', instead of
'3.3'. (The repr is right - you can check, and 1.1*3 != 3.3. But for
str() purposes it's fine.)

But I actually think that we should also use Tcl's decimal to binary
conversion - otherwise, a .pyc file created by python compiled with
Microsoft will cause a different behaviour from a .pyc file created by
python compiled with Gnu, which is quite strange.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

If I think about it some more, why not get rid of all the float
platform-dependencies and define how +inf, -inf and nan behave?

I think that it means:
* inf and -inf are legitimate floats just like any other float.
Perhaps there should be a builtin Inf, or at least math.inf.
* nan is an object of type float, which behaves like None, that is:
nan == nan is true, but nan  nan and nan  3 will raise an
exception. Mathematical operations which used to return nan will raise
an exception (division by zero does this already, but inf + -inf
will do that too, instead of returning nan.) Again, there should be a
builtin NaN, or math.nan. The reason for having a special nan object
is compatibility with IEEE floats - I want to be able to pass around
IEEE floats easily even if they happen to be nan.

This is basically what Tcl did, if I understand correctly - see item 6
in http://www.tcl.tk/cgi-bin/tct/tip/132.html .

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

‎That's right, but the standard also defines that 0.0/0 - nan, and
1.0/0 - inf, but instead we raise an exception. It's just that in
Python, every object is expected to be equal to itself. Otherwise, how
can I check if a number is nan?

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Christian Heimes

Christian Heimes added the comment:

Noam Raphael wrote:
 * nan is an object of type float, which behaves like None, that is:
 nan == nan is true, but nan  nan and nan  3 will raise an
 exception. 

No, that's not correct. The standard defines that nan is always unequal
to nan.

False
 float(inf) == float(inf)
True
 float(inf) == -1*float(-inf)
True

The float module could gain three singletons nan, inf an neginf so you
can do

True

Christian

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Christian Heimes

Christian Heimes added the comment:

I propose that we add three singletons to the float implementation:

PyFloat_NaN
PyFloat_Inf
PyFloat_NegInf

The singletons are returned from PyFloat_FromString() for nan, inf
and -inf. The other PyFloat_ method must return the singletons, too.
It's easy to check the signum, special and nan/inf status of an IEEE
double with a struct.

I've already started to work on a way to create nan, inf and -inf cross
platform conform. Right now float(nan), float(inf) and float(-inf)
works on Linux but doesn't work on Windows. We could also add four
methods to math:

signum(float) - 1/-1, finity(float) - bool, infinite(float), nan(float)

Here is my work in progress:
#define Py_IEEE_DBL *(double*)(ieee_double)
#if defined(LITTLE_ENDIAN_IEEE_DOUBLE)  !defined(BIG_ENDIAN_IEEE_DOUBLE)
typedef struct {
unsigned intm4 : 16;
unsigned intm3 : 16;
unsigned intm2 : 16;
unsigned intm1 :  4;
unsigned int   exp : 11;
unsigned int  sign :  1;
} ieee_double;
#define Py_IEEE_NAN Py_IEEE_DBL{0x, 0x, 0x, 0xf, 0x7ff, 0}
#define Py_IEEE_INF Py_IEEE_DBL{0, 0, 0, 0, 0x7ff, 0}
#define Py_IEEE_NEGINF Py_IEEE_DBL{0, 0, 0, 0, 0x7ff, 1}
#elif !defined(LITTLE_ENDIAN_IEEE_DOUBLE)  defined(BIG_ENDIAN_IEEE_DOUBLE)
typedef struct {
unsigned int  sign :  1;
unsigned int   exp : 11;
unsigned intm1 :  4;
unsigned intm2 : 16;
unsigned intm3 : 16;
unsigned intm4 : 16;
} ieee_double;
#define Py_IEEE_NAN Py_IEEE_DBL{0, 0x7ff, 0xf, 0x, 0x, 0x}
#define Py_IEEE_INF Py_IEEE_DBL{0, 0x7ff, 0, 0, 0, 0}
#define Py_IEEE_NEGINF Py_IEEE_DBL{1, 0x7ff, 0, 0, 0, 0}
#else
#error Unknown or no IEEE double implementation
#endif

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

(1) Despite Tim's grave language, I don't think we'll need to write our
own correctly-rounding float input routine.  We can just say that Python
won't work correctly unless your float input routine is rounding
correctly; a unittest should detect whether this is the case.  I expect
that more recent C runtimes for Windows are correct.

(1a) Perhaps it's better to only do this for Python 3.0, which has a
smaller set of platforms to support.

(2) Have you two (Christian and Noam) figured out yet why repr(1e5) is
'1.0' on some platforms?  If it's not byte order (and I no longer
believe it is) what could it be?  An uninitialized variable?  It doesn't
seem to vary with the processor, but it could vary with the C runtime or OS.

(3) Detecting NaNs and infs in a platform-independent way is tricky,
probably beyond you (I know it's beyond me).  (Of course, producing
standardized output for them and recognizing these as input is easy, and
that should definitely be done.)

(4) Looks like we've been a bit too hasty checking this in.  Let's be
extra careful for round #2.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Christian Heimes

Christian Heimes added the comment:

Guido van Rossum wrote:
 (1a) Perhaps it's better to only do this for Python 3.0, which has a
 smaller set of platforms to support.

+1
Does Python depend on a working, valid and non-broken IEEE 754 floating
point arithmetic? Could we state the Python's float type depends on
IEEE754-1985 conform 64bit double precision format. Developers for a
platform with no IEEE754 conform double must role their own floatobject
implementation.

 (2) Have you two (Christian and Noam) figured out yet why repr(1e5) is
 '1.0' on some platforms?  If it's not byte order (and I no longer
 believe it is) what could it be?  An uninitialized variable?  It doesn't
 seem to vary with the processor, but it could vary with the C runtime or OS.

I wasn't able to reproduce the problem on my work stations. Can you give
us more information about the computer with failing tests? CPU, OS,
Kernel, libc etc.

 (3) Detecting NaNs and infs in a platform-independent way is tricky,
 probably beyond you (I know it's beyond me).  (Of course, producing
 standardized output for them and recognizing these as input is easy, and
 that should definitely be done.)

In fact detecting NaNs and Infs is platform-independent is dead easy IFF
 the platform has IEEE 754 support. A double uses 64bit to store the
data, 1 bit signum, 11 bit exponent and 53bit fraction part (aka
mantissa). When all bits of the exponent are set (0x7ff) the number is
either a NaN or +/-Inf. For Infs the fraction part has no bits set (0)
and the signum bit signals +Inf (sig: 0) or -Inf (sig: 1). If one to all
bits of the mantissa is set, it's a NaN.

The form (normalized or denormalized) is regardless for the detection of
NaN and Inf. An autoconf macro can easily detect the endianness of the
double implementation.

We can also use the much slower version and check if x  DBL_MAX for
+Inf, x  -DBL_MAX for -Inf and x != x for NaN.

Christian

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

  (1a) Perhaps it's better to only do this for Python 3.0, which has a
  smaller set of platforms to support.

 +1
 Does Python depend on a working, valid and non-broken IEEE 754 floating
 point arithmetic? Could we state the Python's float type depends on
 IEEE754-1985 conform 64bit double precision format. Developers for a
 platform with no IEEE754 conform double must role their own floatobject
 implementation.

No, traditionally Python has just used whatever C's double provides.

There are some places that benefit from IEEE 754, but few that require
it (dunno about optional extension modules).

  (2) Have you two (Christian and Noam) figured out yet why repr(1e5) is
  '1.0' on some platforms?  If it's not byte order (and I no longer
  believe it is) what could it be?  An uninitialized variable?  It doesn't
  seem to vary with the processor, but it could vary with the C runtime or OS.

 I wasn't able to reproduce the problem on my work stations. Can you give
 us more information about the computer with failing tests? CPU, OS,
 Kernel, libc etc.

So far I have only one box where it is broken (even after make clobber
and ./configure). I cannot reproduce it with a debug build.

It's an x86 Linux box running in mixed 64-32 bit mode.

From /etc/lsb-release: Ubuntu 6.06 LTS
uname -a: Linux pythonic.corp.google.com 2.6.18.5-gg16-mixed64-32 #1
SMP Wed Oct 10 17:45:08 PDT 2007 x86_64 GNU/Linux
gcc -v: gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)

I'm afraid I'll have to debug this myself, but not today.

  (3) Detecting NaNs and infs in a platform-independent way is tricky,
  probably beyond you (I know it's beyond me).  (Of course, producing
  standardized output for them and recognizing these as input is easy, and
  that should definitely be done.)

 In fact detecting NaNs and Infs is platform-independent is dead easy IFF
  the platform has IEEE 754 support. A double uses 64bit to store the
 data, 1 bit signum, 11 bit exponent and 53bit fraction part (aka
 mantissa). When all bits of the exponent are set (0x7ff) the number is
 either a NaN or +/-Inf. For Infs the fraction part has no bits set (0)
 and the signum bit signals +Inf (sig: 0) or -Inf (sig: 1). If one to all
 bits of the mantissa is set, it's a NaN.

 The form (normalized or denormalized) is regardless for the detection of
 NaN and Inf. An autoconf macro can easily detect the endianness of the
 double implementation.

Only for IEEE 754 though.

 We can also use the much slower version and check if x  DBL_MAX for
 +Inf, x  -DBL_MAX for -Inf and x != x for NaN.

Of course the latter isn't guaranteed to help for non-IEEE-754
platforms -- some platforms don't have NaNs at all!

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Raymond Hettinger

Raymond Hettinger added the comment:

 Of course the latter isn't guaranteed to help for 
 non-IEEE-754 platforms -- some platforms don't have
 NaNs at all! 

ISTM, that years of toying with Infs and Nans has not
yielded a portable, workable solution.  I'm concerned
that further efforts will contort and slow-down our
code without benefit.  NaNs in particular are a really
difficult case because our equality testing routines
have a fast path where identity implies equality.  I don't
think NaNs can be correctly implemented without breaking
that fast path which appears throughout the code base
(even inside the code for dictobject.c; otherwise, how
could you lookup a NaN value).

I recommend punting on the subject of NaNs.  Attempting
to fix it will necessarily entail introducing a 
number of little atrocities that just aren't worth it.
IMO, Practicality beats purity on this one.

The decimal module provides full support for NaNs.
While it is slow, it may be sufficient for someone who
isn't happy with the way float handles them.

--
nosy: +rhettinger

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Tim Peters

Tim Peters added the comment:

[Raymond]
 ...
 NaNs in particular are a really
 difficult case because our equality testing routines
 have a fast path where identity implies equality.

Works as intended in 2.5; this is Windows output:

1.#INF
 nan = inf - inf
 nan  # really is a NaN
-1.#IND
 nan is nan  # of course this is true
True
 nan == nan  # but not equal anyway
False

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Tim Peters

Tim Peters added the comment:

[Guido]
 ...  We can just say that Python
 won't work correctly unless your float input routine is rounding
 correctly; a unittest should detect whether this is the case.

Sorry, but that's intractable.  Correct rounding is a property that
needs to be proved, not tested.  Processors aren't fast enough to test
all (roughly) 2**64 possible doubles in reasonable time, and there's
only one way to get all of them right (correct rounding).  There are
many ways to get at least one of them wrong, and short of proof only
exhaustive testing can determine whether at least one is wrong.  There
is no smoking gun specific test case, either -- different flawed
routines will fail on different inputs.

 I expect that more recent C runtimes for Windows are correct.

Based on what?  Unless MS says so somewhere, it's very hard to know.

Vern Paxson and David Hough developed a test program in the early
90's, based on an excruciatingly difficult algorithm I invented for
finding hard cases, and the code they wrote still appears to be
available from the places Vern mentions in this brief msg:

http://www.netlib.org/fp/testbase

That code doesn't guarantee to find a failing correct-rounding case if
one exists, but did find failures reasonably efficiently on many
platforms tested at the time.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Christian Heimes

Christian Heimes added the comment:

Guido van Rossum wrote:
 No, traditionally Python has just used whatever C's double provides.
 
 There are some places that benefit from IEEE 754, but few that require
 it (dunno about optional extension modules).

I asked Thomas Wouter about IEEE 754:

I don't know of any system with FPU that does not support 754 semantics.
and all intel CPUs since the 486 have FPUs.


During the discussion we found only one system that has supported Python
in the past but doesn't have IEEE 754 doubles. Pippy (Python for PalmOS)
has no floats at all since the hardware doesn't haven a FPU.

 So far I have only one box where it is broken (even after make clobber
 and ./configure). I cannot reproduce it with a debug build.
 
 It's an x86 Linux box running in mixed 64-32 bit mode.
 
 From /etc/lsb-release: Ubuntu 6.06 LTS
 uname -a: Linux pythonic.corp.google.com 2.6.18.5-gg16-mixed64-32 #1
 SMP Wed Oct 10 17:45:08 PDT 2007 x86_64 GNU/Linux
 gcc -v: gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
 
 I'm afraid I'll have to debug this myself, but not today.

The problem could either be related to SMP or to the mixed mode. Please
try to bind the Python process to one CPU with schedtool or schedutils.

 Only for IEEE 754 though.
 Of course the latter isn't guaranteed to help for non-IEEE-754
 platforms -- some platforms don't have NaNs at all!

Do you know of any system that supports Python and floats but doesn't
have IEEE 753 semantics? I'm asking out of curiosity.

Christian

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

 Do you know of any system that supports Python and floats but doesn't
have IEEE 753 semantics?

(Assuming you meant 754.)

I'm pretty sure the VAX doesn't have IEEE FP, and it used to run Unix
and Python.  Ditto for Crays -- unsure if we still support that though.
 What about Nokia S60?

Tim probably knows more.

--
priority: high - normal

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

 Correct rounding is a property that needs to be proved, not tested.

I take it your position is that this can never be done 100% correctly so
it shouldn't go in?  That's disappointing, because the stream of
complaints that round is broken won't stop (we had two of these in the
bug tracker just this month).

I really want to address this without having to either change
interactive mode to use something besides repr() or changing repr() to
drop precision (assuming correctly rounding input).

We can fix the roundtripping in marshal and pickle and the like by
explicitly using %.17g % x instead of repr(x).  Scientists who worry
about roundtripping can also do this (though are there any scientists
who work on platforms that don't have correctly-rounding input routines?).

We can also fix it by adopting the Tcl string-to-float code.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Tim Peters

Tim Peters added the comment:

[Guido]
 I take it your position is that this can never be done 100% correctly

No.  David Gay's code is believed to be 100% correctly-rounded and is
also reasonably fast in most cases.  I don't know of any other open
string-float code that achieves both ( expect that's why glibc
adopted Gay's code as its base).  I don't know anything about Tcl's
code here, except that, unlike Gay's code, it appears that Tcl's code
also tries to cater to non-754 platforms.  It's also easy to be 100%
correct if you don't care about speed (but that's factors of 1000s
slower than Gay's code).

If you care about guaranteeing eval(repr(x)) == x for binary floats,
sorry, there's no way to do it that satisfies all of {cheap, easy,
fast, correct}.  If you don't care about guaranteeing eval(repr(x)) ==
x, then it's dead easy to do any number of wrong things instead.
For example, much easier and faster than anything discussed here would
be to use the float str() implementation instead.  Produce the
shortest possible output that reproduces the same input is a
genuinely difficult task (for both output and input) when speed
matters, but it's not really what people are asking for when they see
0.10001.  They would be just as happy with str(x)'s output
instead:  it's not people who know a lot about floats who are asking
to begin with, it's people who don't know much at all.

It's not a coincidence that the whole idea of shortest possible
output came from the Scheme world ;-)  That is, formally correct at
any price.

 so it shouldn't go in?

What shouldn't go in is something that doesn't solve the problem,
whatever that's thought to be today.  Changing repr() to do
shortest-possible output is not, by itself, a solution /if/ the
problem is thought to be getting rid of lots of zeroes or nines while
preserving eval(repr(x)) == x.  Slows things down yet isn't enough to
guarantee eval(repr(x)) == x.  This is just going in circles.

 That's disappointing, because the stream of complaints that round is broken 
 won't stop (we
 had two of these in the bug tracker just this month).

I don't care about complaints from people who use binary floating
point without understanding it; they should use the decimal module
instead if they're unwilling or unable to learn.  Or give up on
eval(repr(x)) == x for binary floats and then it's all much easier --
although people using binary floats naively will get burned by
something subtler then.

A sufficient response to round() is broken complaints is to point to
the Tutorial appendix that specifically discusses round() is broken
complaints.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

I'd be willing to require eval(repr(x)) == x only for platforms whose
float input routine is correctly rounding.  That would make the current
patch acceptable I believe -- but I believe you think there's a better
way in that case too?  What way is that?

Also, I'd rather not lock out non-IEEE-754 platforms, but I'm OK with
dropping the above roundtripping invariant there.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

If I understand correctly, there are two main concerns: speed and
portability. I think that they are both not that terrible.

How about this:
* For IEEE-754 hardware, we implement decimal/binary conversions, and
define the exact behaviour of floats.
* For non-IEEE-754 hardware, we keep the current method of relying on
the system libraries.

About speed, perhaps it's not such a big problem, since decimal/binary
conversions are usually related to I/O, and this is relatively slow
anyway. I think that usually a program does a relatively few
decimal/binary conversions.
About portability, I think (from a small research I just made) that
S90 supports IEEE-754. This leaves VAX and cray users, which will have
to live with a non-perfect floating-point behaviour.

If I am correct, it will let 99.9% of the users get a deterministic
floating-point behaviour, where eval(repr(f)) == f and
repr(1.1)=='1.1', with a speed penalty they won't notice.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

Sounds okay, except that I think that for some folks (e.g. numeric
Python users) I/O speed *does* matter, as their matrices are very
large, and their disks and networks are very fast.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Noam Raphael

Noam Raphael added the comment:

If I were in that situation I would prefer to store the binary
representation. But if someone really needs to store decimal floats,
we can add a method fast_repr which always calculates 17 decimal
digits.

Decimal to binary conversion, in any case, shouldn't be slower than it
is now, since on Gnu it is done anyway, and I don't think that our
implementation should be much slower.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

 If I were in that situation I would prefer to store the binary
 representation. But if someone really needs to store decimal floats,
 we can add a method fast_repr which always calculates 17 decimal
 digits.

They can just use %.17g % x

 Decimal to binary conversion, in any case, shouldn't be slower than it
 is now, since on Gnu it is done anyway, and I don't think that our
 implementation should be much slower.

Sure.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-11 Thread Guido van Rossum

Guido van Rossum added the comment:

I've tracked my problem to the GCC optimizer. The default optimizer
setting is -O3. When I edit the Makefile to change this to -O1 or -O0
and recompile (only) doubledigits.c, repr(1e5) starts returning
'10.0' again. -O2 behaves the same as -O3.

Now, don't immediately start pointing fingers to the optimizer; it's
quite possible that the code depends on behavior that the C standard
doesn't actually guarantee. The code looks quote complicated.

This is with GCC 4.0.3.

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1580] Use shorter float repr when possible

2007-12-10 Thread Guido van Rossum

Guido van Rossum added the comment:

I like this; but I don't have time for a complete thourough review. 
Maybe Tim can lend a hand?

If Tim has no time, I propose that if it works correctly without leaks
on at least Windows, OSX and Linux, we check it in, and worry about more
review later.

Crys, can you test this on Windows and see if it needs any project file
updates?  I'll check OSX.  Linux works and I see no leaks.

--
assignee:  - tiran
keywords: +patch
nosy: +gvanrossum, tim_one, tiran

__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue1580
__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



  1   2   >