Re: isnanl, printf, and non-IEEE values

2012-06-19 Thread Paul Eggert
On 06/19/2012 05:51 AM, John Spencer wrote:
> #ifdef SYS_USES_LD80
> x = get_valid_ld80_or_zero(x);
> #endif
> 
> OSLT. why should i care ? go figure it out yourself.

As I understand it, it was your assertion that code like
GNU od's could be written portably, using standard C.
I was merely trying to check that assertion.  As the above
snippet does not appear to be portable standard C, we have not
been able to verify the assertion in question.
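
As an illustration of the kind of check under discussion (not code posted by anyone in this thread), a sanitizer in the spirit of the quoted get_valid_ld80_or_zero() placeholder might look like the sketch below. It assumes the x87 80-bit extended format stored little-endian, with 8 significand bytes followed by 2 sign/exponent bytes. That assumption is precisely what keeps it from being portable standard C. The names ld80_bytes_valid and ld80_bytes_sanitize are hypothetical and are not taken from gnulib, coreutils, or musl.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper.  Assumes REC holds the 10 value bytes of an x87
   extended-precision number in little-endian order: 8 bytes of
   significand, then 2 bytes holding the sign and 15-bit exponent.  */
static int
ld80_bytes_valid (const unsigned char rec[10])
{
  unsigned int exponent = (rec[8] | ((unsigned int) rec[9] << 8)) & 0x7fff;
  int integer_bit = rec[7] >> 7;   /* bit 63 of the significand */

  /* Zero and denormals have exponent 0 and integer bit 0; every other
     valid value (normal, infinity, NaN) has the integer bit set.  */
  return integer_bit == (exponent != 0);
}

/* Overwrite an invalid record with +0.0.  Returns 1 if the record was
   rewritten, 0 if it was already valid and left untouched.  */
static int
ld80_bytes_sanitize (unsigned char rec[10])
{
  if (ld80_bytes_valid (rec))
    return 0;
  memset (rec, 0, 10);
  return 1;
}
```

The sketch works on the raw bytes, before anything is loaded into a long double object, so it never evaluates a possibly invalid value.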

> you found one guy that wants to write floats into a file

Lots of people write floats into a file.  I've done it
myself.  That "one guy" was just one example.  If people
didn't need to read floats from files, 'od' wouldn't have
options to do exactly that.

>> I was talking about printf.
> not in that paragraph.

My paragraph was talking about GNU od, which uses
snprintf to do the conversion.  Presumably snprintf uses
the same float-conversion machinery that printf does, so
if snprintf has undefined behavior on floats, then
GNU od will have undefined behavior as well, as will
any other program that reads floats from files and
prints them with printf or similar functions.

> seriously, with your attitude

On the contrary, we've all been remarkably polite,
considering the circumstances.  Are you accustomed
to winning arguments via vituperation?  Perhaps that
strategy works well in musl circles, but it's
counterproductive here.



Re: isnanl, printf, and non-IEEE values

2012-06-19 Thread Jim Meyering
Eric Blake wrote:
> On 06/19/2012 06:51 AM, John Spencer wrote:
>> *sigh*.
>> talking to you guys is like talking to a wall.
>
> Please don't swear.  This is a publicly archived list, and you are
> coming across rather offensive.  A positive attitude is more likely to
> foster cooperation than berating others.
>
>>> There's no force here.  The process is entirely voluntary.
>>>
>> ah perfect then, so please educate me where i can find the hidden switch
>> to tell GNULIB:
>>
>> "NO I DONT WANT YOUR F** BROKEN REPLACEMENT FUNCTIONS, THAT EVEN
>> FAIL TO COMPILE WITH AN #ERROR, BECAUSE ITS AUTHORS ARE MORONS THAT
>> DISABLED THE EXISTING PORTABLE FALLBACK CODE" ?
>
> This is your complaint about 'closein', and I think we are making
> progress here.
>
> The problem is that the existing fallback code is not perfect - if you
> would help us come up with a portable replacement that works
> _efficiently_, then we could remove the #error everywhere.  In the
> meantime, the #error continues to serve its purpose - it has let us
> improve both DragonFly and musl (thanks to recent commits adding
> stdioext functions) and gnulib (to use those functions instead of poking
> at musl FILE* internals or falling back to the #error), and the end
> result will be that programs released against the latest version of
> gnulib should now compile without error on musl, with no further effort
> on your part, and without the speed penalty of the fallback code.

Thank you, Eric, for countering the cursing and abusive tone
with calmly-delivered tips that should help John.



Re: isnanl, printf, and non-IEEE values

2012-06-19 Thread Eric Blake
On 06/19/2012 06:51 AM, John Spencer wrote:
> *sigh*.
> talking to you guys is like talking to a wall.

Please don't swear.  This is a publicly archived list, and you are
coming across rather offensive.  A positive attitude is more likely to
foster cooperation than berating others.

>> There's no force here.  The process is entirely voluntary.
>>
> ah perfect then, so please educate me where i can find the hidden switch
> to tell GNULIB:
> 
> "NO I DONT WANT YOUR F** BROKEN REPLACEMENT FUNCTIONS, THAT EVEN
> FAIL TO COMPILE WITH AN #ERROR, BECAUSE ITS AUTHORS ARE MORONS THAT
> DISABLED THE EXISTING PORTABLE FALLBACK CODE" ?

This is your complaint about 'closein', and I think we are making
progress here.

The problem is that the existing fallback code is not perfect - if you
would help us come up with a portable replacement that works
_efficiently_, then we could remove the #error everywhere.  In the
meantime, the #error continues to serve its purpose - it has let us
improve both DragonFly and musl (thanks to recent commits adding
stdioext functions) and gnulib (to use those functions instead of poking
at musl FILE* internals or falling back to the #error), and the end
result will be that programs released against the latest version of
gnulib should now compile without error on musl, with no further effort
on your part, and without the speed penalty of the fallback code.

> 
> or
> 
> "YES, I AM FINE WITH A PRINTF THAT BAILS OUT ON INVALID BIT PATTERNS,
> BECAUSE THAT'S UB AND HOW C WORKS.

What's wrong if gnulib decides to throw another layer on top of your
libc?  Yes, the binaries compiled that way will be larger than they
could have been, but that's the choice of the program author to use its
own format routines instead of deferring to libc.

But if you absolutely want to guarantee that programs compiled against
musl will pick up the musl printf instead of the gnulib replacement,
then install a config.site that primes the autoconf cache to override
the decisions and claim that printf matches the features gnulib was
looking for even without running the configure tests.  For example,
priming gl_cv_func_printf_infinite_long_double=yes in your config.site
file would skip gnulib's m4 test of LD80 behavior, and as long as you
then ignore the testsuite failures, the rest of your program will
happily use the UB of passing an out-of-range LD80 number to musl's
printf instead of gnulib's replacement.

> you know, reading 20.000 lines of ugly generated autoconf script code is
> pretty tough; i can't spot the "voluntary" option myself.

No need to go through that many lines; instead, look at the gl_cv_*
cache values in config.cache for the shorter list of the variables that
you might want to prime, or read the source code (the .m4 files) instead
of the generated output code (the configure file).

-- 
Eric Blake   [email protected]    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: isnanl, printf, and non-IEEE values

2012-06-19 Thread John Spencer

*sigh*.
talking to you guys is like talking to a wall.
seriously, with your attitude you will render GNU a complete joke in 
less than a decade, even moreso than it is already.

unfortunately that will also affect the FSF.

On 06/19/2012 04:37 AM, Paul Eggert wrote:
> On 06/18/2012 06:27 PM, John Spencer wrote:
>> easy: add a check for the invalid LD bit representations
>
> How does one do that, exactly?  I thought the
> whole point of the proposed exercise was that code must
> be portable to any standard C implementation.
> So, where's the portable code to do what you're
> proposing?

#ifdef SYS_USES_LD80
x = get_valid_ld80_or_zero(x);
#endif

OSLT. why should i care ? go figure it out yourself. it is your (GNU)
freaking octaldump program that is broken (unless it uses a specially
patched libc), not mine.

>> who uses LD80 values in files anyway ?
>
> Lots of people.  A quick Google search found
> "I have a binary file of long double values created with fwrite in C"
>
> I'm sure one can find more examples.

wow. in the depths of the internets you found one guy that wants to
write floats into a file. amazing.

and from that you judge that lots of people do.

...

anyway this is irrelevant, it is valid for a program to desire reading
and writing long double values into a file,
and it is valid for a program to desire that invalid bit patterns don't
crash a program.

*it is not valid though to expect that the LIBC will do "The Right
Thing" magically*

if you want this behaviour in a program, code it into the program, for
fucks sake.

>> neither me nor paul were talking about printf().
>
> No, I was talking about printf.

not in that paragraph.

>> you're forcing your printf replacement on implementations
>
> There's no force here.  The process is entirely voluntary.

ah perfect then, so please educate me where i can find the hidden switch
to tell GNULIB:

"NO I DONT WANT YOUR F** BROKEN REPLACEMENT FUNCTIONS, THAT EVEN
FAIL TO COMPILE WITH AN #ERROR, BECAUSE ITS AUTHORS ARE MORONS THAT
DISABLED THE EXISTING PORTABLE FALLBACK CODE" ?

or

"YES, I AM FINE WITH A PRINTF THAT BAILS OUT ON INVALID BIT PATTERNS,
BECAUSE THAT'S UB AND HOW C WORKS. DO NOT IMPOSE YOUR SHITTY REPLACEMENT
ON ME" ?

you know, reading 20.000 lines of ugly generated autoconf script code is
pretty tough; i can't spot the "voluntary" option myself.







Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Paul Eggert
On 06/18/2012 06:27 PM, John Spencer wrote:
> easy: add a check for the invalid LD bit representations

How does one do that, exactly?  I thought the
whole point of the proposed exercise was that code must
be portable to any standard C implementation.
So, where's the portable code to do what you're
proposing?

> who uses LD80 values in files anyway ? 

Lots of people.  A quick Google search found
"I have a binary file of long double values created with fwrite in C"

I'm sure one can find more examples.

> neither me nor paul were talking about printf(). 

No, I was talking about printf.

> you're forcing your printf replacement on implementations

There's no force here.  The process is entirely voluntary.



Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread John Spencer

On 06/19/2012 04:00 AM, Eric Blake wrote:
> On 06/18/2012 07:27 PM, John Spencer wrote:
>>> It's nice to be
>>> able to print a floating point value retrieved from a file,
>>> even if the file got messed up somehow,
>> ack.
>
> That argues that printf() should be robust to _everything_ you throw at
> it, even bit patterns that are not produced by arithmetic.

neither me nor paul were talking about printf().
and apparently the standard differs from your interpretation.
those bit patterns are UB and therefore can be left unhandled.

remember: gnulib checks for "POSIX compliant printf", then produces said
UB and declares a perfectly conformant implementation broken.

>>> and to have that
>>> work reliably.  I'm willing to have my programs be
>>> 0.0001% slower if I can get that extra reliability.
>>
>> maybe you, but not me.
>> and i definitely don't want 2 different printf implementations in my
>> binary because of this silly cornercase.
>
> You don't need 2 different printf implementations, just one that
> robustly handles all possible input.

no, i don't need that. if the guys that wrote the C99 spec were fine
with UB, i'm fine with it as well.

>> who uses LD80 values in files anyway ?
>
> Who cares?  POSIX requires od to support it, and that should be enough
> for libc to support it.

if POSIX requires od to support it, od should handle it.
where did you get the impression that the libc should do this for you ?
that's completely bogus.





Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Eric Blake
On 06/18/2012 07:27 PM, John Spencer wrote:

>> It's nice to be
>> able to print a floating point value retrieved from a file,
>> even if the file got messed up somehow,
> ack.

That argues that printf() should be robust to _everything_ you throw at
it, even bit patterns that are not produced by arithmetic.

>> and to have that
>> work reliably.  I'm willing to have my programs be
>> 0.0001% slower if I can get that extra reliability.
>>
> maybe you, but not me.
> and i definitely don't want 2 different printf implementations in my
> binary because of this silly cornercase.

You don't need 2 different printf implementations, just one that
robustly handles all possible input.

> who uses LD80 values in files anyway ?

Who cares?  POSIX requires od to support it, and that should be enough
for libc to support it.

-- 
Eric Blake   [email protected]    +1-919-301-3266
Libvirt virtualization library http://libvirt.org







Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread John Spencer

On 06/19/2012 01:15 AM, Paul Eggert wrote:
> On 06/18/2012 05:07 AM, John Spencer wrote:
>> it was already discussed that "GNU od" does,
>> but it can be easily fixed there.
>
> I don't see how.

easy: add a check for the invalid LD bit representations in ldtoastr()
in gnulibs ftoastr.c.

since it was written by you, this shouldn't be a problem.
(this is the code od uses to handle the long double).

> And even if it could be, why not
> just leave things as they are?

because you're forcing your printf replacement on implementations that
don't handle invalid long double representations via the gnulib tests
just to have this one odd cornercase handled. fix the one cornercase,
and let the libc in question decide for itself if it wants to handle
this UB.

this test needs to go away.

> It's nice to be
> able to print a floating point value retrieved from a file,
> even if the file got messed up somehow,

ack.

> and to have that
> work reliably.  I'm willing to have my programs be
> 0.0001% slower if I can get that extra reliability.

maybe you, but not me.
and i definitely don't want 2 different printf implementations in my
binary because of this silly cornercase.

who uses LD80 values in files anyway ?





Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Paul Eggert
On 06/18/2012 05:07 AM, John Spencer wrote:
> it was already discussed that "GNU od" does,
> but it can be easily fixed there.

I don't see how.  And even if it could be, why not
just leave things as they are?  It's nice to be
able to print a floating point value retrieved from a file,
even if the file got messed up somehow, and to have that
work reliably.  I'm willing to have my programs be
0.0001% slower if I can get that extra reliability.



Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Ben Pfaff
Rich Felker writes:

> On Mon, Jun 18, 2012 at 10:21:34AM -0700, Ben Pfaff wrote:
>> Bruno Haible  writes:
>> 
>> > In theory you would be right that data should be validated at the boundaries
>> > of the program, that is, when they are read from outside sources. But no
>> > program I know of does this for unconstrained floating-point numbers.
>> 
>> That's an interesting point.  GNU PSPP reads unconstrained
>> "double"s from SPSS data files without validating them[*], since
>
> This is not a problem. As long as you're assuming IEEE, all 64-bit
> patterns are valid double values. The issue only occurs for long
> double, a type that's very different in both arithmetic properties and
> representation between systems. On typical systems, it has padding
> bits (which might or might not be required to be all-zero) as well as
> bit combinations that are completely invalid.

OK, thanks for the information.  Never mind, then.



Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Rich Felker
On Mon, Jun 18, 2012 at 10:21:34AM -0700, Ben Pfaff wrote:
> Bruno Haible  writes:
> 
> > In theory you would be right that data should be validated at the boundaries
> > of the program, that is, when they are read from outside sources. But no
> > program I know of does this for unconstrained floating-point numbers.
> 
> That's an interesting point.  GNU PSPP reads unconstrained
> "double"s from SPSS data files without validating them[*], since

This is not a problem. As long as you're assuming IEEE, all 64-bit
patterns are valid double values. The issue only occurs for long
double, a type that's very different in both arithmetic properties and
representation between systems. On typical systems, it has padding
bits (which might or might not be required to be all-zero) as well as
bit combinations that are completely invalid.

> it never occurred to me that this was a bad idea.  A function to
> validate a floating-point number would therefore be useful in
> PSPP.

It's only as useful as long doubles, which is, well, not very useful
to most people.

Rich
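
The point about IEEE double can be made concrete with a small sketch. Assuming the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, 52 fraction bits), a decoder that looks only at the integer bit pattern always lands in one of five classes and has no "invalid" case to report. The enum and the function name b64_classify are hypothetical, chosen for this illustration.

```c
#include <stdint.h>

enum b64_class { B64_ZERO, B64_SUBNORMAL, B64_NORMAL, B64_INFINITE, B64_NAN };

/* Classify an IEEE 754 binary64 bit pattern using integer tests only.
   Every 64-bit input maps to one of the five classes above.  */
static enum b64_class
b64_classify (uint64_t bits)
{
  uint64_t exponent = (bits >> 52) & 0x7ff;
  uint64_t fraction = bits & UINT64_C (0x000fffffffffffff);

  if (exponent == 0)
    return fraction == 0 ? B64_ZERO : B64_SUBNORMAL;
  if (exponent == 0x7ff)
    return fraction == 0 ? B64_INFINITE : B64_NAN;
  return B64_NORMAL;
}
```

The x87 80-bit format differs because it stores the leading significand bit explicitly, so some combinations of that bit and the exponent field correspond to no valid value.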



Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Ben Pfaff
Bruno Haible writes:

> In theory you would be right that data should be validated at the boundaries
> of the program, that is, when they are read from outside sources. But no
> program I know of does this for unconstrained floating-point numbers.

That's an interesting point.  GNU PSPP reads unconstrained
"double"s from SPSS data files without validating them[*], since
it never occurred to me that this was a bad idea.  A function to
validate a floating-point number would therefore be useful in
PSPP.

[*] Except in the case where it detects that the data file uses a
nonnative floating point format such as one of the old IBM or
DEC floating-point formats, in which case it does a format
conversion.



Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread John Spencer

On 06/18/2012 03:06 PM, Bruno Haible wrote:
> John Spencer wrote:
>> its not the job of the libc to make broken code happy.
>>
>> i dont think its a good idea to make thousands of correct programs slower,
>> just that GNU guys dont have to fix one program.
>
> Following your argumentation, we don't need
>    - W^X protection in the x86 hardware,
>    - address space layout randomization in the kernel,
>    - support for -fstack-protector, -fmudflap, and -fbounds-check in gcc
>      and libc,
>    - double-free checks in libc,
>    - function pointer encryption in libc.

where is the relation ? you are comparing apples and oranges.

--JS




Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread Bruno Haible
John Spencer wrote:
> its not the job of the libc to make broken code happy.
> 
> i dont think its a good idea to make thousands of correct programs slower,
> just that GNU guys dont have to fix one program.

Following your argumentation, we don't need
  - W^X protection in the x86 hardware,
  - address space layout randomization in the kernel,
  - support for -fstack-protector, -fmudflap, and -fbounds-check in gcc
and libc,
  - double-free checks in libc,
  - function pointer encryption in libc.

We don't need all this, because broken programs are easily identified
and all other programs are correct, right?

Read .

Bruno




Re: isnanl, printf, and non-IEEE values

2012-06-18 Thread John Spencer

On Sun, 17 Jun 2012 16:00:13 -0700, Paul Eggert wrote:
> On 06/17/2012 03:41 PM, Rich Felker wrote:
>> No program I know of reads long double directly from binary files.
>
> 'od -tfL' does.

it was already discussed that "GNU od" does, but it can be easily fixed there.

its not the job of the libc to make broken code happy.

i dont think its a good idea to make thousands of correct programs slower, just
that GNU guys dont have to fix one program.

> I'm sure there are others.  It's pretty
> common to save binary data into files and restore it later.

there's no problem in saving a correct float into a file and restore it.
floats generated in a correct program will have a correct float representation,
and thus not invoke UB when restoring the value.

--JS






Re: isnanl, printf, and non-IEEE values

2012-06-17 Thread Bruno Haible
Rich Felker wrote:
> > > So isnanl is expected to be slower in every program that's using it
> > > for legitimate arithmetic purposes
> > 
> > Yes. But it will not be slower by much. The CPUs have an instruction for
> > 'fpclassify'; you just need to pass the right bitmask to that instruction.
> 
> Are you sure that's faster than avoiding loading the value into the
> fpu at all and doing integer arithmetic/bit tests? I have my doubts.

Maybe integer arithmetic is faster than the 'fpclassify' instruction.
Either way, an isnanl() implementation can be written that is not
terribly expensive.

Bruno
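
To make the cost question concrete, here is a sketch of the integer-test approach for the x87 80-bit format. It is not gnulib's or musl's actual isnanl(). It assumes the caller has already split the value into its 16-bit sign/exponent field and 64-bit significand, and the names ld80_isnan_strict and ld80_isnan_lenient are hypothetical. The strict version recognizes only genuine NaNs. The lenient version also reports non-IEEE patterns as NaN, which is the behaviour the gnulib tests are described as expecting.

```c
#include <stdint.h>

/* Strict test: true only for genuine x87 NaNs, i.e. exponent field all
   ones, explicit integer bit set, and a nonzero fraction.  */
static int
ld80_isnan_strict (uint16_t sign_exp, uint64_t significand)
{
  return (sign_exp & 0x7fff) == 0x7fff
         && (significand >> 63) == 1
         && (significand << 1) != 0;
}

/* Lenient test: also true for patterns that are not valid values at all
   (pseudo-denormal, unnormal, pseudo-infinity, pseudo-NaN).  */
static int
ld80_isnan_lenient (uint16_t sign_exp, uint64_t significand)
{
  unsigned int exponent = sign_exp & 0x7fff;
  int integer_bit = (int) (significand >> 63);

  /* A valid value has the integer bit set exactly when the exponent
     field is nonzero.  */
  if (integer_bit != (exponent != 0))
    return 1;
  return exponent == 0x7fff && (significand << 1) != 0;
}
```

In this form the lenient check differs from the strict one by a single extra comparison, and neither version loads the value into the FPU.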




Re: isnanl, printf, and non-IEEE values

2012-06-17 Thread Paul Eggert
On 06/17/2012 03:41 PM, Rich Felker wrote:
> No program I know of reads long double directly from binary files.

'od -tfL' does.  I'm sure there are others.  It's pretty
common to save binary data into files and restore it later.



Re: isnanl, printf, and non-IEEE values

2012-06-17 Thread Rich Felker
On Sun, Jun 17, 2012 at 11:33:45PM +0200, Bruno Haible wrote:
> The other justification for handling these representations was brought up
> by Jim in the long thread that surrounded this glibc bug:
> 
> 
> See also the summary in
> .
> 
> Namely, glibc was not only producing wrong output. glibc *crashed* when
> you passed some floating-point value outside the IEEE range. Jim's

This happened to our implementation too at one point, and it
definitely has me considering just supporting the damn things... I'm
not all that opposed to it, but my main interest in having the
discussion over gnulib is that I believe gnulib should not demand that
implementations make this choice and consider failure to do so grounds
for replacing printf (which could have lots of unwanted side effects).

> argument: A program can produce floating-point values from a multitude of
> sources; one way is to read them binary-encoded from files.
> 
> In theory you would be right that data should be validated at the boundaries
> of the program, that is, when they are read from outside sources. But no
> program I know of does this for unconstrained floating-point numbers. Hence,

No program I know of reads long double directly from binary files.
Doing so would be inadvisable for many reasons - the existence of
invalid bit patterns, the inherent non-portability of the format, and
the possibility of information leakage from storing a data type with
padding bits (which ld80 has on x86 and x86_64) directly to disk.

> the easiest way to avoid programs from crashing or producing senseless
> output is to treat non-IEEE values like NaNs. This is what we're doing in
> gnulib.
> 
> > So isnanl is expected to be slower in every program that's using it
> > for legitimate arithmetic purposes
> 
> Yes. But it will not be slower by much. The CPUs have an instruction for
> 'fpclassify'; you just need to pass the right bitmask to that instruction.

Are you sure that's faster than avoiding loading the value into the
fpu at all and doing integer arithmetic/bit tests? I have my doubts.

> > for the sake of one program's ease
> > of implementing a non-standard and mostly useless feature?
> 
> The ability to call printf on a 'long double' argument without risking a
> crash is a "mostly useless feature"? Some people see it differently.

I see coddling hypothetical incorrect programs that probably don't
even exist as a mostly-useless feature.

> > Attempting to replace "big" functions like printf should be avoided
> > unless absolutely necessary.
> 
> Have you seen how broken printf is on many systems?
> 

Please add glibc to the list:
http://sourceware.org/bugzilla/show_bug.cgi?id=6530

Another report of the issue is here (#1 under glibc bugs):
http://www.kernel.org/pub/linux/libs/uclibc/Glibc_vs_uClibc_Differences.txt

Rich



Re: isnanl, printf, and non-IEEE values

2012-06-17 Thread Bruno Haible
Rich Felker wrote:
> > > 2. Several tests for isnanl and printf long double support are
> > > invalid. They are generating invalid LD80 representations that cannot
> > > occur as values ("pseudo-denormal", for example) and testing that
> > > isnanl and printf treat these as NAN. Per the C standard, there is no
> > > need to handle these bit patterns (attempting to use them as floating
> > > point values results in UB); all it does is make isnanl() slightly
> > > slower and larger, so I'm reluctant to change our isnanl to match
> > > gnulib's expectations.

Eric Blake replied:
> > Actually, there IS a need to handle these representations.  The 'od'
> > program in coreutils is an example of where POSIX requires us to handle
> > ANY bit pattern as given in an arbitrary input file as ANY other type of
> > number, including long doubles.  And that means that all possible bit
> > patterns, even the invalid LD80 representations that cannot occur as a
> > result of arithmetic, CAN occur via memory aliasing, and we really do
> > desire to output those as NaN via the use of isnanl().

The other justification for handling these representations was brought up
by Jim in the long thread that surrounded this glibc bug:


See also the summary in
.

Namely, glibc was not only producing wrong output. glibc *crashed* when
you passed some floating-point value outside the IEEE range. Jim's
argument: A program can produce floating-point values from a multitude of
sources; one way is to read them binary-encoded from files.

In theory you would be right that data should be validated at the boundaries
of the program, that is, when they are read from outside sources. But no
program I know of does this for unconstrained floating-point numbers. Hence,
the easiest way to avoid programs from crashing or producing senseless
output is to treat non-IEEE values like NaNs. This is what we're doing in
gnulib.

> So isnanl is expected to be slower in every program that's using it
> for legitimate arithmetic purposes

Yes. But it will not be slower by much. The CPUs have an instruction for
'fpclassify'; you just need to pass the right bitmask to that instruction.

> for the sake of one program's ease
> of implementing a non-standard and mostly useless feature?

The ability to call printf on a 'long double' argument without risking a
crash is a "mostly useless feature"? Some people see it differently.

> considering printf broken, and replacing printf
> because of this, is a big issue. Replacing printf is non-trivial

Sure. That's why gnulib does it only after the autoconf test has determined
that there really is a problem with the system's printf implementation.

> Attempting to replace "big" functions like printf should be avoided
> unless absolutely necessary.

Have you seen how broken printf is on many systems?


Bruno