Re: Ada front-end depends on signed overflow
From: Georg Bauhaus [EMAIL PROTECTED] Paul Schlie wrote: - How is it necessary or desirable to define that the result is undefined vs. being target defined? What does C say about how a target performs an instruction? And why shouldn't GCC take advantage of this? - In essence I believe the difference is consistency. Although a target implementation-defined behavior seems no clearer, it implies that the semantics of an operation in question should directly result from its target implementation, and that any corresponding compile-time constant-propagation optimizations should be consistent with their otherwise run-time computed counterparts; otherwise they will be needlessly and arguably erroneously inconsistent, which an undefined behavior would seem to allow without providing any tangible benefit.
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: - yes, it certainly enables an implementation to generate more efficient code which has no required behavior; so in effect basically produce more efficient programs which don't reliably do anything in particular; which doesn't seem particularly useful? You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined is to ensure that efficient code can be generated for well defined constructs. - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? - Essentially yes; as FP is an approximate, not absolute, representation of a value, it seems reasonable to accept optimizations which may result in some least significant bits of ambiguity. Rubbish, this shows a real misunderstanding of floating-point. FP values are not approximations, they are well defined values in a system of arithmetic with precisely defined semantics, just as well defined as integer operations. Any compiler that followed your misguided ideas above would be a real menace and completely useless for any serious fpt work. - I'm sorry, I wasn't aware that C/C++ for example specified the bit-exact representation and semantic requirements for FP (possibly because it's not defined as being so?). What's silly is claiming that such operations are bit exact when even something as basic as their representational base radix number system isn't even defined by the standard, nor need it necessarily be the same between different FP types; thereby an arbitrary value is never guaranteed to be exactly representable as an FP value in all implementations (therefore a test for equivalence with an arbitrary value is equally ambiguous, as would be any operations on that value, unless it is known that within a particular implementation its value and any resulting intermediate operation values are correspondingly precisely representable, which is both implementation and target specific, although hopefully constrained to be as closely approximated as possible within its representational constraints). As it is, the actual situation is that most serious fpt programmers find themselves in the same position you are with integer arithmetic. They often don't like the freedom given by the language to e.g. allow extra precision (although they tend to be efficiency hungry, so one doesn't know in practice that this is what they really want, since they want it without paying for it, and they don't know how much they would have to pay :-) - Agreed; because FP is inherently an imprecise representation, and bit-exact equivalence between arbitrary real numbers and their representational form is not warranted, it should never be relied upon; therefore it seems reasonable to enable optimizations which may alter the computed results as long as they are reasonably known to constrain the result's divergence to some few least significant bits of precision (as no arbitrary value is warranted to be representable, with the possible exception of some implementation/target specific whole-number integer values, but whose overflow semantics are also correspondingly undefined). Whereas integer operations are relied upon for state representations, which in general must remain precisely and deterministically calculated, as otherwise catastrophic semantic divergences may result.
Nonsense, losing the last bit in an FP value can be fatal to many algorithms. Indeed, some languages allow what seems to FP programmers to be too much freedom, but not for a moment can a compiler writer contemplate doing an optimization which is not allowed. For instance, in general replacing (a+b)+c by a+(b+c) is an absolute no-no in most languages. - only if it's naively relied upon to be precise to some arbitrary precision, which as above is not warranted in general, so an algorithm's implementation should not assume it to be in general, as, given your example, neither operation is warranted to compute to an equivalent value in any two arbitrary implementations (although hopefully consistent within their respective implementations). - No, exactly the opposite, the definition of an order of evaluation eliminates ambiguities, it does not prohibit anything other than the compiler applying optimizations which would otherwise alter the meaning of the specified expression. No, the optimizations do not alter the meaning of any C expression. If the meaning is undefined, then - yes, I understand C/C++ etc. has chosen to define overflow and evaluation order (among a few other things) as being undefined. a) the programmer should not have written this rubbish. - or the language need not have enabled a potentially well defined expression to be turned into rubbish by enabling an implementation to do things like arbitrarily evaluate interdependent sub-expressions in arbitrary orders, or not require an implementation to at least optimize expressions consistently with their target's native semantics.
RE: Ada front-end depends on signed overflow
Original Message From: Paul Schlie Sent: 08 June 2005 14:40 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? cheers, DaveK -- Can't think of a witty .sigline today
Re: Ada front-end depends on signed overflow
From: Dave Korn [EMAIL PROTECTED] Original Message From: Paul Schlie Sent: 08 June 2005 14:40 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? - which, if algorithmically computed, relies on deterministic semantics, as otherwise the pseudo-random distribution the algorithm is relying on may break.
RE: Ada front-end depends on signed overflow
Original Message From: Paul Schlie Sent: 08 June 2005 14:49 From: Dave Korn [EMAIL PROTECTED] Original Message From: Paul Schlie Sent: 08 June 2005 14:40 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? - which if algorithmically computed relies on deterministic semantics. as otherwise the pseudo random distribution the algorithm is relying on may break. I didn't say Pseudo random number generation. I said Random number generation. cheers, DaveK -- Can't think of a witty .sigline today
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: From: Dave Korn [EMAIL PROTECTED] Original Message From: Paul Schlie Sent: 08 June 2005 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? randomness has nothing whatever to do with non-determinism. They are completely different concepts. But there are of course many examples. There are many examples in concurrent programming where non-determinism is useful, and in set programming, arbitrary non-deterministic selection from a set is fundamental. But this is a complete red herring in this discussion. The reason that, for example, in Ada we say that a+b means non-deterministically either compute a then b, or b then a, is not that it is useful for these results to be different, but precisely because we expect NO reasonable program to ever write an expression a+b in which the two semantic meanings that are possible are different, and we want the compiler to take advantage of this to generate better code. For example, in a + f(b) we typically expect f(b) to be called first, even though formally it might be the case that f(b) modifies a, so this choice could have some effect from the formal non-determinism of the semantics. Our actual attitude is that if anyone writes code like this in which f(b) modifies a, they are highly incompetent and we don't care what happens.
Re: Ada front-end depends on signed overflow
Dave Korn wrote: I didn't say Pseudo random number generation. I said Random number generation. which once again has nothing whatever to do with non-determinism. To illustrate this, suppose I have a language which has sets of integers. I have an operator ARB whose semantics is to select an element of such a set non-deterministically. It is a fine implementation to always return the smallest number in the set (of course no legitimate program can rely on this artifact of the implementation). It would not be at all fine to use this implementation for the RAND operator that selects a random element from the set. Many people mix these concepts up, and it always causes trouble. I remember once a very senior member of the CS community (I will keep the name to myself), during a discussion of Ada semantics, being dismayed at the overhead required for non-deterministic selection of an open SELECT alternative, since he assumed it meant that a random number generator would have to be used :-)
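For illustration, a minimal sketch of the ARB/RAND distinction in C (the helper names and the array representation of a set are hypothetical, added only for this example):

    #include <stdlib.h>

    /* Non-deterministic choice: any element is a legal answer, so an
       implementation may legitimately just return the first (smallest) one. */
    int arb_element(const int *set, size_t n)
    {
        return set[0];
    }

    /* Random choice: the distribution itself is the point, so the trick
       above would be a wrong implementation here. */
    int rand_element(const int *set, size_t n)
    {
        return set[rand() % n];
    }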
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: What's silly is claiming that such operations are bit exact when even something as basic as their representational base radix number system isn't even defined by the standard, nor need it necessarily be the same between different FP types; thereby an arbitrary value is never guaranteed to be exactly representable as an FP value in all implementations (therefore a test for equivalence with an arbitrary value is equally ambiguous, as would be any operations on that value, unless it is known that within a particular implementation its value and any resulting intermediate operation values are correspondingly precisely representable, which is both implementation and target specific, although hopefully constrained to be as closely approximated as possible within its representational constraints). You are really just digging yourself into a hole here. It is clear that you know very little about floating-point arithmetic. If you are interested in learning, there are quite a lot of good references. I would suggest Michael Overton's new book as a good starting point. - Agreed; because FP is inherently an imprecise representation, and bit-exact equivalence between arbitrary real numbers and their representational form is not warranted, it should never be relied upon; therefore it seems reasonable to enable optimizations which may alter the computed results as long as they are reasonably known to constrain the result's divergence to some few least significant bits of precision (as no arbitrary value is warranted to be representable, with the possible exception of some implementation/target specific whole-number integer values, but whose overflow semantics are also correspondingly undefined). There is nothing imprecise about IEEE floating-point operations - only if it's naively relied upon to be precise to some arbitrary precision, which as above is not warranted in general, so an algorithm's implementation should not assume it to be in general, as, given your example, neither operation is warranted to compute to an equivalent value in any two arbitrary implementations (although hopefully consistent within their respective implementations). More complete nonsense. Of course we do not rely on fpt operations being precise to arbitrary precision, we just expect well defined IEEE results which are defined to the last bit, and all modern hardware provides this capability. - yes, I understand C/C++ etc. has chosen to define overflow and evaluation order (among a few other things) as being undefined. a) the programmer should not have written this rubbish. - or the language need not have enabled a potentially well defined expression to be turned into rubbish by enabling an implementation to do things like arbitrarily evaluate interdependent sub-expressions in arbitrary orders, or not require an implementation to at least optimize expressions consistently with their target's native semantics. Well it is clear that language designers generally disagree with you. Are you saying they are all idiots and you know better, or are you willing to try to learn why they disagree with you? - agreed, an operation defined as being undefined enables an implementation to produce an arbitrary result (which therefore is reliably useless). Distressing nonsense, sounds like you have learned nothing from this thread. Well hopefully others have, but anyway, last contribution from me, I think everything has been said that is useful.
Re: Ada front-end depends on signed overflow
- Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Hm. int foo (const char *x, int y) { return printf (x, y); } Lassi -- If you would know the value of money, go try to borrow some. --Ben Franklin
Re: Ada front-end depends on signed overflow
Robert Dewar [EMAIL PROTECTED] writes: [...] | You are really just digging yourself into a hole here. It is clear | that you know very little about floating-point arithmetic. [...] | More complete nonsense. [...] | Are you saying they are all idiots and you know better, or are you | willing to try to learn why they disagree with you? [...] | Distressing nonsense, sounds like you have learned nothing from this | thread. Well hopefully others have, but anyway, last contribution from | me, I think everything has been said that is useful. Ahem. -- Gaby
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Date: Wed, 08 Jun 2005 10:16:23 -0400 To: Paul Schlie [EMAIL PROTECTED] Cc: Florian Weimer [EMAIL PROTECTED], Andrew Pinski [EMAIL PROTECTED], GCC List gcc@gcc.gnu.org, [EMAIL PROTECTED] Subject: Re: Ada front-end depends on signed overflow You are really just digging yourself into a hole here. It is clear that you know very little about floating-point arithmetic. If you are interested in learning, there are quite a lot of good references. I would suggest Michael Overton's new book as a good starting point. - Agreed; because FP is inherently an imprecise representation, and bit-exact equivalence between arbitrary real numbers and their representational form is not warranted, it should never be relied upon; therefore it seems reasonable to enable optimizations which may alter the computed results as long as they are reasonably known to constrain the result's divergence to some few least significant bits of precision (as no arbitrary value is warranted to be representable, with the possible exception of some implementation/target specific whole-number integer values, but whose overflow semantics are also correspondingly undefined). There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. - only if it's naively relied upon to be precise to some arbitrary precision, which as above is not warranted in general, so an algorithm's implementation should not assume it to be in general, as, given your example, neither operation is warranted to compute to an equivalent value in any two arbitrary implementations (although hopefully consistent within their respective implementations). More complete nonsense. Of course we do not rely on fpt operations being precise to arbitrary precision, we just expect well defined IEEE results which are defined to the last bit, and all modern hardware provides this capability. - as above (actually most, if inclusive of all processors in production, don't directly implement fully compliant IEEE FP math, although many closely approximate it, or simply provide no FP support at all; and as an aside: far more processors implement wrapping signed overflow semantics than provide IEEE fp support, as most do not differentiate between basic signed and unsigned 2's complement integer operations, so if expectations are based on the likelihood of an arbitrary production processor supporting one vs. the other, one would expect wrapped overflow with a high likelihood, and fully compliant IEEE support with a lower likelihood). a) the programmer should not have written this rubbish. - or the language need not have enabled a potentially well defined expression to be turned into rubbish by enabling an implementation to do things like arbitrarily evaluate interdependent sub-expressions in arbitrary orders, or not require an implementation to at least optimize expressions consistently with their target's native semantics. Well it is clear that language designers generally disagree with you. Are you saying they are all idiots and you know better, or are you willing to try to learn why they disagree with you?
- I'm saying/implying nothing of that sort; as I happen to believe that the reason things are the way they are, for the most part, is that although most knew better, the committees needed to politically accommodate varied implementation practices and assumptions, as otherwise they would end up forcing some companies to invest a great deal of time and money to re-implement their existing compilers, processor implementations, or application programs to accommodate a stricter set of specifications (which most commercial organizations would lobby strongly against); this is one of the things that Java had the luxury of being able to somewhat side-step. - agreed, an operation defined as being undefined enables an implementation to produce an arbitrary result (which therefore is reliably useless). Distressing nonsense, sounds like you have learned nothing from this thread. Well hopefully others have, but anyway, last contribution from me, I think everything has been said that is useful. - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result.
RE: Ada front-end depends on signed overflow
Original Message From: Paul Schlie Sent: 08 June 2005 15:53 From: Robert Dewar There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. I refer you to Annex F (normative) IEC 60559 floating-point arithmetic of the C language spec. Normative implies a mandate, does it not? F.1 Introduction 1 This annex specifies C language support for the IEC 60559 floating-point standard. The IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for microprocessor systems, second edition (IEC 60559:1989), previously designated IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE 754-1985). IEEE Standard for Radix-Independent Floating-Point Arithmetic (ANSI/IEEE 854-1987) generalizes the binary standard to remove dependencies on radix and word length. IEC 60559 generally refers to the floating-point standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that defines __STDC_IEC_559__ conforms to the specifications in this annex. Where a binding between the C language and IEC 60559 is indicated, the IEC 60559-specified behavior is adopted by reference, unless stated otherwise. - as above (actually most, if inclusive of all processors in production, don't directly implement fully compliant IEEE FP math, although many closely approximate it, or simply provide no FP support at all; Pretty much every single ix86 and rs6000, and many m68 arch CPUs provide last-bit-exact IEEE implementations in hardware these days. Your statement is simply factually incorrect. cheers, DaveK -- Can't think of a witty .sigline today
Re: Ada front-end depends on signed overflow
Paul Schlie wrote on 08/06/2005 17:53:04: - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result. Well this is a simple hackery quiz, which is irrelevant to GCC.

1: int a, b;
2: int f() { return b++; }
3: int main(int argc)
4: {
5:   b = argc;
6:   a = b + f(); /* a==b*2 or a==b*2+1 */
7:   a = a/2;     /* a==b */
8:   return a;
9: }

If one would claim that a is totally unconstrained at line 6, then this example will be invalid. In that case, I can give a more restricted example, where 'a' is computed speculatively and is discarded in exactly the same cases when it is undefined. Oh, well here it is.

1: int a, b, c;
2: int f() { return a ? b++ : b; }
3: int main()
4: {
5:   scanf("%d %d", &a, &b);
6:   c = b + f(); /* c is undefined if a != 0 */
7:   if (a) c = b;
8:   return c;
9: }

This example is predictable. I argue that it may be also useful performance-wise to do speculative computations.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined, is to ensure that efficient code can be generated for well defined constructs. - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? The simplest example is a shift operation, x << n, where the compiler may assume that the shift count is smaller than the width of the type. All sane machines agree on shift behaviour for 0 <= n < width, but there are differences between machines for n >= width. Since this case is undefined by the language, it is ensured that shifts take only a single instruction on essentially all machines. Bernd
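For illustration (an added sketch, with the per-machine behaviors stated as typical rather than guaranteed): the function

    unsigned int shl(unsigned int x, int n)
    {
        return x << n;   /* the compiler may assume 0 <= n < width of the type */
    }

can be compiled to a single shift instruction everywhere, even though the hardware disagrees about out-of-range counts -- 32-bit x86 shifts typically mask the count to its low 5 bits (so a count of 32 behaves like a count of 0), while PowerPC's slw yields 0 for counts of 32..63. If the language defined the n >= width case, one of these machines would need extra masking or compare-and-branch code around every variable-count shift.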
Re: Ada front-end depends on signed overflow
On Wed, Jun 08, 2005 at 10:53:04AM -0400, Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. In real life, there are no longer any significant non-embedded architectures out there that don't use IEEE floating point, so it is a widely used practice to assume it and document the requirement. The resulting programs might not work on a Vax, Cray, or IBM 370. C'est la vie.
Re: Ada front-end depends on signed overflow
From: Joe Buck [EMAIL PROTECTED] On Wed, Jun 08, 2005 at 10:53:04AM -0400, Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. In real life, there are no longer any significant non-embedded architectures out there that don't use IEEE floating point, so it is a widely used practice to assume it and document the requirement. The resulting programs might not work on a Vax, Cray, or IBM 370. C'est la vie. - With the not so minor exception that IEEE strict semantics are basically counterproductive for most real-time fp signal processing tasks, as simple saturation and consistent reciprocal semantics tend to be preferred, since they lose precision gracefully, not catastrophically as sticky inf and nan semantics do at the limits of the representational bounds; which is why fp signal processor architectures tend not to implement IEEE semantics, and arguably rely on its imprecision in lieu of failure at its bounds. (Also most embedded processors do not have any FP support, and often benefit from looser soft implementations of IEEE, as strict bit-exact behavior is typically of less significance when all that may occasionally be required is just a bit more dynamic range than a fixed-point representation may reasonably provide, and they will often gladly trade a few bits of precision for less bulky implementations and never need to fool with nan or inf semantics.)
Re: Ada front-end depends on signed overflow
From: Bernd Schmidt [EMAIL PROTECTED] Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined, is to ensure that efficient code can be generated for well defined constructs. - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? The simplest example is a shift operation, x << n, where the compiler may assume that the shift count is smaller than the width of the type. All sane machines agree on shift behaviour for 0 <= n < width, but there are differences between machines for n >= width. Since this case is undefined by the language, it is ensured that shifts take only a single instruction on essentially all machines. - How is it necessary or desirable to define that the result is undefined vs. being target defined? As undefined seems to be presumed to give an implementation the license to presume it will yield any specific value it chooses (such as 0, for example, and assume that value in subsequent optimizations), where the target implementation may actually yield something reliably different. For example, given that an implementation knows how a runtime shift value will be treated, it seems much more reasonable to treat compile-time constant shift values consistently, as otherwise constant values will have different semantic effects than non-constant values may. (Which doesn't seem like a good thing, or strictly necessary to enable an arbitrary implementation to efficiently implement consistent bit-shift semantics, although its behavior beyond implementation-independent bounds may differ between differing implementations.)
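A sketch of the inconsistency being described (the concrete numbers assume a 32-bit int target whose shift instruction masks the count, e.g. typical x86; since the behavior is undefined, actual results may be anything):

    int shift_var(int x, int n) { return x << n; }   /* count known only at run time */
    int shift_const(int x)      { return x << 40; }  /* count known at compile time  */

On such a target, shift_var(1, 40) executed with the native shift instruction typically yields 1 << (40 & 31) == 256, whereas the compiler folding shift_const(1) at compile time may well produce 0 (or anything else); the two otherwise identical expressions need not agree.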
Re: Ada front-end depends on signed overflow
From: Michael Veksler [EMAIL PROTECTED] Paul Schlie wrote on 08/06/2005 17:53:04: - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result. Well this is a simple hackery quiz, which is irrelevant to GCC. 1: int a, b; 2: int f() { return b++; } 3: int main(int argc) 4: { 5: b = argc; 6: a = b + f(); /* a==b*2 or a==b*2+1 */ 7: a = a/2; /* a==b */ 8: return a; 9: } If one would claim that a is totally unconstrained at line 6, then this example will be invalid. In that case, I can give a more restricted example, where 'a' is computed speculatively and is discarded in exactly the same cases when it is undefined. - unless I misunderstand, it actually is undefined if the value of argc may be as large as the largest representable positive int value, if both signed int overflows and evaluation order are undefined/unspecified. (Although I confess to missing your point, as it's not clear how an undefined behavior is productively contributing to the observation that a == argc if argc++ doesn't overflow or is evaluated l-r; as opposed to observing that if the evaluation order were fixed l-r and not undefined, then the optimization may be derived?) Oh, well here it is. 1: int a, b, c; 2: int f() { return a ? b++ : b; } 3: int main() 4: { 5: scanf("%d %d", &a, &b); 6: c = b + f(); /* c is undefined if a != 0 */ 7: if (a) c = b; 8: return c; 9: } This example is predictable. I argue that it may be also useful performance-wise to do speculative computations. - I believe it's subject to the same problem?
Re: Ada front-end depends on signed overflow
Paul Schlie [EMAIL PROTECTED] wrote on 08/06/2005 21:16:46: From: Michael Veksler [EMAIL PROTECTED] Paul Schlie wrote on 08/06/2005 17:53:04: - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result. Well this is a simple hackery quiz, which is irrelevant to GCC. 1: int a, b; 2: int f() { return b++; } 3: int main(int argc) 4: { 5: b = argc; 6: a = b + f(); /* a==b*2 or a==b*2+1 */ 7: a = a/2; /* a==b */ 8: return a; 9: } If one would claim that a is totally unconstrained at line 6, then this example will be invalid. In that case, I can give a more restricted example, where 'a' is computed speculatively and is discarded in exactly the same cases when it is undefined. - unless I misunderstand, it actually is undefined if the value of argc may be as large as the largest representable positive int value, if both signed int overflows and evaluation order are undefined/unspecified. It is undefined due to evaluation order. Is it:

tmp = b;         // b is fetched first,
a = tmp + b++;   // then f() is called and added

or is it:

tmp = b++;       // f() is called first,
a = b + tmp;     // then b is fetched and added

Please note that this has nothing to do with overflow. I argue that although 'a' is undefined due to evaluation order, it may still be used as long as its undefined-ness is constrained (which the std may or may not guarantee). Anyway, as I said, it is more of a quiz than a practical question and example. You may come up with a bizarre example where it makes some sense. (Although I confess to missing your point, as it's not clear how an undefined behavior is productively contributing to the observation that a == argc if argc++ doesn't overflow or is evaluated l-r; as opposed to observing that if the evaluation order were fixed l-r and not undefined, then the optimization may be derived?) Now it is me who missed the above point. What is l-r? Oh, well here it is. 1: int a, b, c; 2: int f() { return a ? b++ : b; } 3: int main() 4: { 5: scanf("%d %d", &a, &b); 6: c = b + f(); /* c is undefined if a != 0 */ 7: if (a) c = b; 8: return c; 9: } This example is predictable. I argue that it may be also useful performance-wise to do speculative computations. - I believe it's subject to the same problem? No, here also the result is undefined due to evaluation order. My point in this example is that you may operate on 'c', which is defined for some inputs and undefined for others, and still get observably consistent results. If you know that the result may be undefined only when you do not intend to use it, then you don't care about it (if guaranteed not to get SIGBUS). As others have noted, by constraining the compiler to platform-defined behavior you lose performance. For example:

double a, b;
int f();
int main()
{
    int i;
    for (i = 0; i < 100; ++i)
        b += (double)f() + sin(a);
    return foo(b);
}

Let's assume that on a given platform the FPU can calculate sin(a) in parallel to f(). The compiler cannot prove that f() does not modify a, because f() is not given here or because it is intractable. If calls to f() and sin(a) take 10 ns each, then calculating them in parallel will save 50% of the total run-time. To do that, the compiler will have to change the order of function calls. You may argue that even a x100 speed up is not worth an undefined behavior. I will argue that in this case most programmers will avoid modifying 'a' inside f() anyway (due to maintainability) and will avoid the undefined result.
I don't think that normative code should be penalized for the possibility of bizarre code. This bizarre code should be fixed anyway, to avoid maintenance nightmares.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - How is it necessary or desirable to define that the result is undefined vs. being target defined? What does C say about how a target performs an instruction? And why shouldn't GCC take advantage of this?
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. But this predicate cannot be evaluated at compile time! Now you seem to suggest that the optimizer should simply avoid optimizing in such cases (where it matters). - No, I simply assert that if an expression is unambiguous (assuming my definition above for the sake of discussion), then the compiler may choose to order the evaluation in any way it desires as long as it does not introduce such an ambiguity by doing so. But this predicate cannot be evaluated at compile time! - I fully agree that if a compiler does not maintain records of the program state which a function may alter or be dependent on, as would be required to determine whether any resulting operand/side-effect interdependences may exist upon its subsequent use as an operand within an expression, then the compiler would have no choice but to maintain its relative order of evaluation as hypothetically specified, as it may otherwise introduce an ambiguity. Fine, but then you are designing a different language from C. It is fine to argue this language point in the context of language design, but this is irrelevant to the discussion of the implementation of C, since the language is already defined, and the design decision is contrary to what you want. Any C programmer who programs with the conceptions you suggest is simply not a competent C programmer. Note also the critical word in your above paragraph: may. That's the whole point, the compiler can't tell, and if it has to make worst case assumptions, the impact on code efficiency is significant. So it is no problem for the compiler to maintain records ..., but it is not good enough (please reread my examples in the previous message!) Although I believe I appreciate the relative complexity this introduces to both the compiler, as well as to the requirements imposed on pre-compiled libraries, etc., I don't believe that it justifies a language definition legitimizing the specification of otherwise non-deterministic programs. Fine, as I say, an argument for some other forum. - As you've specified the operations as distinct statements, I would argue that such an optimization would only be legitimate if the result were known to produce the same result as if the statements were evaluated in sequence as specified by the standard (which of course would be target specific). You can argue this all you like, but it is just a plain wrong argument for C, which is a language defined by the ANSI/ISO standard, not by Paul Schlie. d = (a + b) / 8; would be ambiguous if the compiler were able to restructure evaluation of the expression in any way which may alter its effective result for a given target. As a program which has non-deterministic behavior doesn't seem very useful... That's a judgment with which I, many others, and the designers of C disagree, so too bad for you! Now it is legitimate to argue about how much quality is hurt, and whether the resulting non-determinism is worth the efficiency hit. - Or rather, is non-determinism ever legitimately acceptable? (as I can't honestly think of a single instance where it would be, except if it may only result in the loss of a measurable few bits of fp precision, which are imprecise representations to begin with, but not otherwise?)
If you can't appreciate the argument on the other side, you can't very effectively argue your own position. Most language designers are quite ready to accept non-determinate behavior in peculiar cases to ensure that the common cases can be compiled efficiently. The basic conclusion in the design of such languages (which include C, C++, Ada, Fortran, Pascal, PL/1, etc.) is that no reasonable programmer writes the kind of side effects that cause trouble, so why cripple efficiency for all reasonable programmers. Really the only language in common use that follows your line of thinking is Java, and that is a case where a very conscious decision was made to sacrifice efficiency for reproducibility and portability of peculiar cases. But overall I do agree with your earlier statement, that each language has made a choice, for better or worse. Yes, of course, and the job of the compiler writer is to implement the language with the choice it made, not second guess it. Interesting postscript. Even APL as originally designed had undefined order of operations. However, early implementations were naive interpreters which not only associated right to left (as required by the language) but also executed in this order (which was not required). Programmers got used to expecting this order and relying on it (partly because it worked and most of them did not even know it was wrong to rely on it, and partly because people got confused
Re: Ada front-end depends on signed overflow
* Robert Dewar: Definitely not, integer overflow generates async traps on very few architectures. Certainly synchronous traps can be generated (e.g. with into on the x86). Or the JO jump instruction. Modern CPUs choke on the INTO instruction.
Re: Ada front-end depends on signed overflow
Florian Weimer wrote: * Robert Dewar: Definitely not, integer overflow generates async traps on very few architectures. Certainly synchronous traps can be generated (e.g. with into on the x86). Or the JO jump instruction. Modern CPUs choke on the INTO instruction. Well INTO is still the best choice if you worry about code size. Furthermore, it would be interesting to know what the actual pragmatic effect of choke is, by comparing:
1. no checks
2. current software checks
3. into style checks
4. checks using jo
on real applications (not an easy test to do unfortunately)
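To make the comparison concrete, here is a sketch of what a "current software checks" style test can look like at the C level (a minimal example, not the actual Ada run-time code; raise_overflow_error is a hypothetical stand-in for raising Constraint_Error). The into/jo styles instead perform the addition and test the CPU's overflow flag directly:

    #include <limits.h>

    extern void raise_overflow_error(void);

    /* Checked signed addition, software-check style: test the operands
       before adding, so the addition itself can never overflow. */
    int add_checked(int a, int b)
    {
        if (b > 0 ? a > INT_MAX - b : a < INT_MIN - b)
            raise_overflow_error();
        return a + b;
    }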
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: Fine, but then you are designing a different language from C. - I'm not attempting to design a language, but just defend the statement that I made earlier; which was in effect that I contest the assertion that undefined evaluation semantics enable compilers to generate more efficient useful code by enabling them to arbitrarily destructively alter the evaluation order of interdependent sub-expressions, and/or base the optimizations on behaviors which are not representative of their target machines. Because I simply observe that since an undefined behavior may also be non-deterministic even within a single program, it can't be relied upon; therefore enabling a compiler to produce garbage more efficiently seems basically worthless, and actually even dangerous when the compiler can't even warn about resulting potentially non-deterministic ambiguities, because it can't differentiate between garbage and reliably deterministic useful code, as it considers them equivalently legitimate. (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. But this predicate cannot be evaluated at compile time! - Why not? The compiler should be able to statically determine if an expression's operands are interdependent, by determining if any of its operands' sub-expressions are themselves dependent on a variable value potentially modifiable by any of the other operands' sub-expressions. (Which is basically the same constraint imposed when rescheduling instructions, as an assignment can not be moved past a reference to the same variable value without potentially corrupting the effective semantics of the specified program, but one may freely re-schedule assignments and references to distinct variable values past each other safely, without any restrictions.)
Re: Ada front-end depends on signed overflow
* Paul Schlie: - I'm not attempting to design a language, but just defend the statement that I made earlier; which was in effect that I contest the assertion that undefined evaluation semantics enable compilers to generate more efficient useful code by enabling them to arbitrarily destructively alter the evaluation order of interdependent sub-expressions, and/or base the optimizations on behaviors which are not representative of their target machines. But the assertion is trivially true. If you impose fewer constraints on an implementation by leaving some cases undefined, it always has got more choices when generating code, and some choices might yield better code. So code generation never gets worse. Whether an implementation should exercise the liberties granted by the standard in a particular case is a different question, and has to be decided on a case-by-case basis. (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) FP has well-defined semantics, and it's absolutely required for compilers to implement them correctly because otherwise a lot of real-world code will break. Actually, this is a very interesting example. You don't care about proper floating point arithmetic and are willing to sacrifice obvious behavior for a speed or code size gain. Others feel the same about signed integer arithmetic. - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. But this predicate cannot be evaluated at compile time! - Why not? In general, it's undecidable. The compiler should be able to statically determine if an expression's operands are interdependent, by determining if any of its operands' sub-expressions are themselves dependent on a variable value potentially modifiable by any of the other operands' sub-expressions. Phrased this way, you make a lot of code illegal. I doubt this is feasible.
Re: Ada front-end depends on signed overflow
* Paul Schlie: But the assertion is trivially true. If you impose fewer constraints on an implementation by leaving some cases undefined, it always has got more choices when generating code, and some choices might yield better code. So code generation never gets worse. - yes, it certainly enables an implementation to generate more efficient code which has no required behavior; so in effect basically produce more efficient programs which don't reliably do anything in particular; which doesn't seem particularly useful? The quality of an implementation can't be judged only based on its conformance to the standard, but this does not mean that the implementation gets better if you introduce additional constraints which the standard doesn't impose. Some people want faster code, others want better debugging information. A few people only want optimizations which do not change anything which is practically observable but execution time (which is a contradiction), and so on. - Essentially yes; as FP is an approximate not absolute representation of a value, therefore seems reasonable to accept optimizations which may result in some least significant bits of ambiguity. But the same is true for C's integers, they do not behave like the real thing. Actually, without this discrepancy, we wouldn't have to worry about overflow semantics, which once was the topic of this thread! The compiler should be able to statically determine if an expression's operands are interdependent, by determining if any of its operands' sub-expressions are themselves dependent on a variable value potentially modifiable by any of the other operands' sub-expressions. Phrased this way, you make a lot of code illegal. I doubt this is feasible. - No, exactly the opposite, the definition of an order of evaluation eliminates ambiguities, it does not prohibit anything other than the compiler applying optimizations which would otherwise alter the meaning of the specified expression. Ah, so you want to prescribe the evaluation order and allow reordering under the as-if rule. This wasn't clear to me, sorry. It shouldn't be too hard to implement this (especially if your order matches the Java order), so you could create a switch to fit your needs. I don't think it should be enabled by default because it encourages developers to write non-portable code which breaks when compiled with older GCC versions, and it inevitably introduces a performance regression on some targets.
Re: Ada front-end depends on signed overflow
* Paul Schlie: (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) Floating-point arithmetic is not simply some inaccurate representation of real arithmetic. It can be used this way by the programmer, but in fact fpt operations have very well defined semantics, and compiler writers have to be very careful not to interfere with these semantics beyond the level permitted by the language. Certainly the above quoted attitude would be pretty deadly if held by a compiler optimizer writer!
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - yes, it certainly enables an implementation to generate more efficient code which has no required behavior; so in effect basically produce more efficient programs which don't reliably do anything in particular; which doesn't seem particularly useful? You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined, is to ensure that efficient code can be generated for well defined constructs. - Essentially yes; as FP is an approximate not absolute representation of a value, therefore seems reasonable to accept optimizations which may result in some least significant bits of ambiguity. Rubbish, this shows a real misunderstanding of floating-point. FP values are not approximations, they are well defined values in a system of arithmetic with precisely defined semantics, just as well defined as integer operations. Any compiler that followed your misguided ideas above would be a real menace and completely useless for any serious fpt work. As it is, the actual situation is that most serious fpt programmers find themselves in the same position you are with integer arithmetic. They often don't like the freedom given by the language to e.g. allow extra precision (although they tend to be efficiency hungry, so one doesn't know in practice that this is what they really want, since they want it without paying for it, and they don't know how much they would have to pay :-) Where integer operations are relied upon for state representations, which in general must remain precisely and deterministically calculated, as otherwise catastrophic semantic divergences may result. Right, and please, if you want to write integer operations, you must ensure that you write only defined operations. If you write a+b and it overflows, then you have written a junk C program, and you have no right to expect anything whatsoever from the result. This is just another case of writing a bug in your program, and consequently getting results you don't expect. By the way, I know we are hammering this stuff (and Paul) a bit continuously here, but these kinds of misconceptions are very common among programmers who do not understand as much as they should about language design and compilers. I find I have to spend quite a bit of time in a graduate compiler course to make sure everyone understands what undefined semantics are all about. (i.e. a single lsb divergence in an address calculation is not acceptable, although a similar divergence in an FP value is likely harmless.) Nonsense, losing the last bit in an FP value can be fatal to many algorithms. Indeed, some languages allow what seems to FP programmers to be too much freedom, but not for a moment can a compiler writer contemplate doing an optimization which is not allowed. For instance, in general replacing (a+b)+c by a+(b+c) is an absolute no-no in most languages. - No, exactly the opposite, the definition of an order of evaluation eliminates ambiguities, it does not prohibit anything other than the compiler applying optimizations which would otherwise alter the meaning of the specified expression. No, the optimizations do not alter the meaning of any C expression. If the meaning is undefined, then a) the programmer should not have written this rubbish b) any optimization leaves the semantics undefined, and hence unchanged.
Furthermore, the optimizations are not about undefined expressions at all, they are about generating efficient code for cases where the expression has a well defined value, but the compiler cannot prove an as-if relation true if the notion of undefined expressions is not present.
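A small illustration of the reassociation point (an added sketch; the values assume IEEE double):

    double a = 1.0e20, b = -1.0e20, c = 1.0;
    /* (a + b) + c evaluates to 1.0, since a + b is exactly 0.0.
       a + (b + c) evaluates to 0.0, since c is absorbed when added to b
       (1.0 is far below the spacing between adjacent doubles near 1e20).
       The two groupings differ by far more than a last bit, which is why a
       compiler may not silently reassociate floating-point expressions. */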
Re: Ada front-end depends on signed overflow
On Tue, Jun 07, 2005 at 05:49:54PM -0400, Robert Dewar wrote: * Paul Schlie: (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) Floating-point arithmetic is not simply some inaccurate representation of real arithmetic. It can be used this way by the programmer, but in fact fpt operations have very well defined semantics, and compiler writers have to be very careful not to intefere with these semantics beyond the level permitted by the language. Certainly the above quoted attitude would be pretty deadly if held by a compiler optimizer writer! Exactly. I have written fixed-point packages as well as expression manipulation packages that are based on the exact behavior of IEEE floating point. I have had running battles in the past (not recently) with people who think that GCC should warn whenever == is applied to float or double expressions. There are some faults that we just have to live with (like the uncontrolled extra precision on the x86, depending on whether a temporary is spilled to memory or not), but programs can and do depend on the fact that certain values and computations are represented precisely by floating point arithmetic.
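As a small example of the point about exact values (a sketch assuming IEEE double): some values and computations are represented precisely and may safely be compared with ==, while others may not:

    #include <assert.h>

    int main(void)
    {
        assert(0.5 + 0.25 == 0.75);   /* sums of powers of two are exact */
        assert(2.0 + 3.0 == 5.0);     /* modest whole numbers are exact too */
        assert(0.1 + 0.2 != 0.3);     /* none of these decimals is exactly representable */
        return 0;
    }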
Re: Ada front-end depends on signed overflow
There's also a fair amount of code which relies on -1 == (int)0xffffffff. Or is there any truly portable and efficient way to convert a sequence of bytes (in big-endian order) to a signed integer? Of course there is. Assuming no padding bits:

int conv(unsigned char *c)
{
	unsigned int i, u, hibit;

	hibit = ~0U;
	hibit ^= (hibit >> 1);

	u = 0;
	for (i = 0; i < sizeof u; i++)
		u = (u << 8) + c[i];

	if ((u & hibit) == 0U)
		return u;

	u -= hibit;

	if ((unsigned int)-1 == (hibit | 1U))
		return -(int)u;

	return (int)u - (int)(~hibit) - ((unsigned int)-1 & 1U);
}

which generates

_conv:
	li r2,4
	li r0,0
	mtctr r2
L11:
	lbz r2,0(r3)
	slwi r0,r0,8
	addi r3,r3,1
	add r0,r0,r2
	bdnz L11
	mr r3,r0
	blr

with GCC 3.3 on Darwin, and

.conv:
	li 9,4
	li 0,0
	li 11,0
	mtctr 9
	.p2align 4,,15
.L2:
	lbzx 9,3,11
	slwi 0,0,8
	nop
	nop
	addi 11,11,1
	add 0,0,9
	nop
	nop
	rldicl 0,0,0,32
	nop
	nop
	nop
	nop
	bdnz .L2
	extsw 3,0
	nop
	nop
	nop
	nop
	blr

with a GCC-4.1.0 snapshot on powerpc64-unknown-linux-gnu (lots of inefficiencies here, but nothing to do with the conversion itself). Sorry, I couldn't test it on a ones' complement or sign-magnitude machine -- just trust me it works (or embarrass me in public, if a bug sneaked in while converting this to C) ;-) Segher
Re: Ada front-end depends on signed overflow
Once again, have you actually examined how awful the code we generate now is? Yes, I have. Indeed not pretty, but suppose that we managed to cut the overhead in half, would that make -gnato really more attractive? Well of course that's just a plain bug, should be addressed as such. Obviously no one is using -ftrapv, so it will expose lots of bugs I would guess. Yes, that's my impression too. Clearly the status quo is entirely unacceptable, so what's your recommendation of how we generate at least vaguely respectable code for overflow checking? Tough question, but I'm almost certain of something: since we need to recover from overflow checks, we have to expose them to the EH machinery, which nowadays means in the GENERIC tree IL. From that, we have 2 alternatives: synchronous or asynchronous exceptions. The current implementation is synchronous, as we explicitly raise exceptions in the code by calling a routine. I guess you're pushing for asynchronous exceptions since this is probably the only efficient approach, i.e. we rely on the hardware to trap and try to recover from that. If so, currently tree_could_trap_p will return true for all arithmetic operations whose type is TYPE_TRAP_SIGNED, with

#define TYPE_TRAP_SIGNED(NODE) \
  (flag_trapv && ! TYPE_UNSIGNED (NODE))

That seems far too broad. We could instead flag expressions on a case-by-case basis, but I guess that would put too much burden on the tree optimizers. So a compromise could be to use a specific bit for TYPE_TRAP_SIGNED and instruct Gigi to use the TYPE_TRAP_SIGNED-variant of a given type, when overflow checks are requested for an operation. Then we would need to implement the overflow-aware instruction patterns for every architecture. AFAICS only Alpha has (some of) them. And fix the RTL optimizers in the process. -- Eric Botcazou
Re: Ada front-end depends on signed overflow
On 6/6/05, Segher Boessenkool [EMAIL PROTECTED] wrote: There's also a fair amount of code which relies on -1 == (int)0xffffffff. Or is there any truly portable and efficient way to convert a sequence of bytes (in big-endian order) to a signed integer? Of course there is. Assuming no padding bits: [snip complicated stuff] Better use a union for the (final) conversion, i.e.

int conv(unsigned char *c)
{
  unsigned int i;
  union { unsigned int u; int i; } u;

  u.u = 0;
  for (i = 0; i < sizeof u; i++)
    u.u = (u.u << 8) + c[i];

  return u.i;
}

or even (if you can determine native byte-order and size at compile time)

int conv(unsigned char *c)
{
  union { unsigned char c[4]; int i; } x;
  int i;

  for (int i = 0; i < 4; ++i)
    x.c[3-i] = c[i];

  return x.i;
}

which generates only slightly worse code than above. Richard.
Re: Ada front-end depends on signed overflow
Better use a union for the (final) conversion, i.e.

int conv(unsigned char *c)
{
  unsigned int i;
  union { unsigned int u; int i; } u;

  u.u = 0;
  for (i = 0; i < sizeof u; i++)
    u.u = (u.u << 8) + c[i];

  return u.i;
}

This is not portable, though; accessing a union member other than the member last stored into is unspecified behaviour (see J.1 and 6.2.6.1). This is allowed (and defined behaviour) as a GCC extension, though. Segher
Re: Ada front-end depends on signed overflow
From: Andrew Pinski [EMAIL PROTECTED] No, they should be using -ftrapv instead, which traps on overflow, and then make sure they are not trapping when testing. - why? what language or whose code/target ever expects such a behavior? Everyone who writes C/C++ should know that overflow of signed is undefined. Now in Java it is defined, which is the reason why -fwrapv exists in the first place, since GCC has a Java compiler. I think you need to go back in the archives and read the discussions about when -fwrapv was added and see why it is not turned on by default for C. http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00850.html http://gcc.gnu.org/ml/gcc-patches/2003-03/msg02126.html http://gcc.gnu.org/ml/gcc-patches/2003-03/msg01727.html Thanks again; upon fully reviewing the threads I still conclude: - C/C++ defines integer overflow as undefined because it's a target specific behavior, just as dereferencing a NULL is (although a large majority of targets factually do wrap overflow, and don't terminally trap NULL dereferences; so GCC's got it backwards in both cases). - So technically, as such semantics are undefined, attempting to track and identify such ambiguities is helpful; however the compiler should always optimize based on the true semantics of the target, which is what the undefined semantics truly enable (as pretending a target's semantics are different than the optimization assumptions, or forcing post-fact run-time trapping semantics, are both useless and potentially worse, inefficient and/or erroneous otherwise).
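A sketch of the kind of code this affects (assuming a typical 2's complement target): a common after-the-fact overflow test that is reliable only if signed overflow wraps, i.e. under -fwrapv; with overflow left undefined, the compiler is entitled to decide the condition can never be true and drop the test:

    #include <limits.h>

    int add_saturating(int a, int b)
    {
        int s = a + b;            /* may overflow */
        if (b > 0 && s < a)       /* holds only if the overflow wrapped */
            return INT_MAX;
        if (b < 0 && s > a)
            return INT_MIN;
        return s;
    }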
Re: Ada front-end depends on signed overflow
On 6/6/05, Segher Boessenkool [EMAIL PROTECTED] wrote: Better use a union for the (final) conversion, i.e int conv(unsigned char *c) { unsigned int i; union { unsigned int u; int i; } u; u.u = 0; for (i = 0; i sizeof u; i++) u.u = (u.u 8) + c[i]; return u.i; } This is not portable, though; accessing a union member other than the member last stored into is unspecified behaviour (see J.1 and 6.2.6.1). This is allowed (and defined behaviour) as a GCC extension, though. I guess this invokes well-defined behavior on any sane implementation, as otherwise treatment of a pointer to such union would be different to that of a pointer to a member of such union. Aka you would get undefined behavior once you read bytes from a file and access them as different types. Also this technique is cited to circumvent type aliasing issues for f.i. doing bit-twiddling of floats on its integer representation. But I guess following the standard, you are right :( Richard.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - So technically as such semantics are undefined, attempting to track and identify such ambiguities is helpful; however the compiler should always optimize based on the true semantics of the target, which is what the undefined semantics truly enable (as pretending a target's semantics are different than the optimization assumptions, or forcing post-fact run-time trapping semantics, are both useless and potentially worse, inefficient and/or erroneous otherwise). The first part of this is contentious, but arguable certainly (what is useful behavior). There is certainly no requirement that the semantics should match that of the target, especially since that's ill-defined anyway (for targets that have many different kinds of arithmetic instructions). The second part is wrong, it is clear that there are cases where the quality of code can be improved by really taking advantage of the undefinedness of integer overflow.
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: - So technically as such semantics are undefined, attempting to track and identify such ambiguities is helpful; however the compiler should always optimize based on the true semantics of the target, which is what the undefined semantics truly enable (as pretending a target's semantics are different than the optimization assumptions, or forcing post-fact run-time trapping semantics, are both useless and potentially worse, inefficient and/or erroneous otherwise). The first part of this is contentious, but arguable certainly (what is useful behavior). There is certainly no requirement that the semantics should match that of the target, especially since that's ill-defined anyway (for targets that have many different kinds of arithmetic instructions). - I don't mean to contest the standard which specifies the behavior is undefined (regardless of how useless I perceive that to be), but merely observe that in fact as most targets do implement 2's complement modulo 2^N integer arithmetic, and given that overflow behavior is undefined, it makes no sense to presume otherwise (as such a behavior is both fully compliant and factually typical of most, if not near all, targets). The second part is wrong, it is clear that there are cases where the quality of code can be improved by really taking advantage of the undefinedness of integer overflow. - As above; but to whom is it useful to compute an undefined result more efficiently, especially if the premise of an optimization is not factually consistent with the target's behavior (which will surely result in an incorrectly predicted, therefore likely computationally ambiguous/useless behavior)? Similar arguments have been given in support of an undefined order of evaluation; which is absurd, as the specification of a semantic order of evaluation only constrains the evaluation of expressions which would otherwise be ambiguous, as expressions which are insensitive to their order of evaluation may always be evaluated in any order regardless of a specified semantic order of evaluation and yield the same result; so in effect, defining an order of evaluation only disambiguates expression evaluation, and does not constrain the optimization of otherwise unambiguous expressions.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: Similar arguments have been given in support of an undefined order of evaluation; which is absurd, as the specification of a semantic order of evaluation only constrains the evaluation of expressions which would otherwise be ambiguous, as expressions which are insensitive to their order of evaluation may always be evaluated in any order regardless of a specified semantic order of evaluation and yield the same result; so in effect, defining an order of evaluation only disambiguates expression evaluation, and does not constrain the optimization of otherwise unambiguous expressions. I think perhaps you have not really looked at the issues that are raised in optimizing compilers. Let me address this misconception first, similar considerations apply to the overflow case. The point is that at compile time you cannot tell what expressions are "ambiguous". I use quotes here since of course there is no ambiguity involved: suppose you write (a * f(b)) * (c * g(d)) where f(b) modifies c and g(d) modifies a. You would call this "ambiguous" but in fact the semantics is undefined, so it is not ambiguous at all, it has quite unambiguous semantics, namely undefined. The notion of non-deterministic behavior is of course quite familiar, and what undefined means here is that the behavior is non-deterministic from among the set of all possible behaviors. Now you seem to suggest that the optimizer should simply avoid optimizing in such cases (where it matters). But the whole point is that of course at compile time in general you can't tell whether f(b) and g(d) have (horrible) side effects of modifying global variables a and c. If we don't allow undefined behavior, the optimizer would have to assume the worst, and refrain from optimizing the above expression just in case some idiot had written side effects. The decision of the standards committee is to make the side effect case undefined PRECISELY so that the optimizer can go ahead and choose the optimal order of evaluation without worrying about the side effect case. Going back to the overflow case, consider: a = (some expression the compiler can tell is positive) b = (some expression the compiler can tell is positive) c = a + b; d = c / 8; Here the compiler can do a right shift for the last statement. This would of course be wrong in the general case (assuming c and d are int), since the right shift does not work right if c is negative. But the compiler is allowed to assume that a+b does not overflow and hence the result is positive, and hence the shift is fine. So you see that guaranteeing twos complement wrap can hurt the quality of the code in a case like this. Now it is legitimate to argue about how much quality is hurt, and whether the resulting non-determinism is worth the efficiency hit. But to conduct this argument, you have to start off with an understanding that there really is a trade-off and it is not clear what the decision should be (C makes one decision, Java another, and Ada a third). The three languages also differ in the handling of a+b. C says that this is undefined if there are side effects that cause what you call ambiguity. Java specifies left to right evaluation (evaluate a before b always). Ada specifies that the result is non-deterministic, you either evaluate a before b or b before a (not some mixture of the two), and there may be (at most) two possible results.
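A concrete C rendering of that example may help (the function and the masking used to make the operands provably non-negative are invented for illustration, not taken from the thread):

    #include <limits.h>

    int scale(int x, int y)
    {
        int a = x & INT_MAX;   /* provably non-negative, but can be as large as INT_MAX */
        int b = y & INT_MAX;   /* likewise */
        int c = a + b;         /* mathematically this can exceed INT_MAX */
        /* If the compiler may assume the addition did not overflow, c is
           known to be non-negative and c / 8 can be emitted as a plain right
           shift.  Under wrapping semantics c could be negative, and then the
           shift is not equivalent: -1 / 8 == 0, while -1 >> 3 is -1 on a
           two's complement machine. */
        return c / 8;
    }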
Re: Ada front-end depends on signed overflow
Eric Botcazou wrote: Once again, have you actually examined how awful the code we generate now is? Yes, I have. Indeed not pretty, but suppose that we managed to cut the overhead in half, would that make -gnato really more attractive? Yes, it would definitely make the difference, given the figures we have seen. From that, we have 2 alternatives: synchronous or asynchronous exceptions. The current implementation is synchronous, as we explicitly raise exceptions in the code by calling a routine. I guess you're pushing for asynchronous exceptions since this is probably the only efficient approach, i.e. we rely on the hardware to trap and try to recover from that. Definitely not, integer overflow generates async traps on very few architectures. Certainly synchronous traps can be generated (e.g. with "into" on the x86).
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: Similar arguments have been given in support of an undefined order of evaluation; which is absurd, as the specification of a semantic order of evaluation only constrains the evaluation of expressions which would otherwise be ambiguous, as expressions which are insensitive to their order of evaluation may always be evaluated in any order regardless of a specified semantic order of evaluation and yield the same result; so in effect, defining an order of evaluation only disambiguates expression evaluation, and does not constrain the optimization of otherwise unambiguous expressions. I think perhaps you have not really looked at the issues that are raised in optimizing compilers. Let me address this misconception first, similar considerations apply to the overflow case. The point is that at compile time you cannot tell what expressions are "ambiguous". I use quotes here since of course there is no ambiguity involved: suppose you write (a * f(b)) * (c * g(d)) where f(b) modifies c and g(d) modifies a. You would call this "ambiguous" but in fact the semantics is undefined, so it is not ambiguous at all, it has quite unambiguous semantics, namely undefined. The notion of non-deterministic behavior is of course quite familiar, and what undefined means here is that the behavior is non-deterministic from among the set of all possible behaviors. - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. Now you seem to suggest that the optimizer should simply avoid optimizing in such cases (where it matters). - No, I simply assert that if an expression is unambiguous (assuming my definition above for the sake of discussion), then the compiler may choose to order the evaluation in any way it desires as long as it does not introduce such an ambiguity by doing so. But the whole point is that of course at compile time in general you can't tell whether f(b) and g(d) have (horrible) side effects of modifying global variables a and c. If we don't allow undefined behavior, the optimizer would have to assume the worst, and refrain from optimizing the above expression just in case some idiot had written side effects. The decision of the standards committee is to make the side effect case undefined PRECISELY so that the optimizer can go ahead and choose the optimal order of evaluation without worrying about the side effect case. - I fully agree that if a compiler does not maintain records of the program state which a function may alter or be dependent on, as would be required to determine whether any resulting operand/side-effect interdependences may exist upon its subsequent use as an operand within an expression itself, then the compiler would have no choice but to maintain its relative order of evaluation as hypothetically specified, as it may otherwise introduce an ambiguity. Although I believe I appreciate the relative complexity this introduces to both the compiler, as well as the requirements imposed on pre-compiled libraries, etc., I don't believe that it justifies a language definition legitimizing the specification of otherwise non-deterministic programs. Going back to the overflow case, consider: a = (some expression the compiler can tell is positive) b = (some expression the compiler can tell is positive) c = a + b; d = c / 8; Here the compiler can do a right shift for the last statement.
This would of course be wrong in the general case (assuming c and d are int), since the right shift does not work right if c is negative. But the compiler is allowed to assume that a+b does not overflow and hence the result is positive, and hence the shift is fine. So you see that guaranteeing twos complement wrap can hurt the quality of the code in a case like this. - As you've specified the operations as distinct statements, I would argue that such an optimization would only be legitimate if it were known to produce the same result as if the statements were evaluated in sequence as specified by the standard (which of course would be target specific). Correspondingly I would assert that: d = (a + b) / 8; would be ambiguous if the compiler were able to restructure evaluation of the expression in any way which may alter its effective result for a given target, as a program which has non-deterministic behavior doesn't seem very useful, regardless of whether or not it's allowed by a standard. (Although I concede that some optimizations have more benign worst-case effects than others, and may be reasonable if explicitly enabled (aka unsafe fp math); however as described above, this one seems likely more
Re: Ada front-end depends on signed overflow
Eric Botcazou wrote: -ftrapv is not practically usable because (1) it generates awful code and (2) it badly interacts with the RTL optimizers. please before you say this compare it with the truly awful front end code we generate, which for sure interferes badly with the optimizers. Right, the code generated by the front end is not pretty either, but at least the front-end knows what it is doing Ada-wise. -ftrapv is so dumb at the moment that it emits checks for virtually anything. No, it just emits it for signed operations. If we use -ftrapv, then we have to make sure that operations labeled by the front end as not requiring an overflow check are transformed into unsigned operations by gigi. As for the interaction with the optimizers: int foo(int a, int b) { return a + b; } gcc -S -O -ftrapv on SPARC: foo: jmp %o7+8 add %o0, %o1, %o0 .size foo, .-foo .ident "GCC: (GNU) 4.1.0 20050604 (experimental)" And I have a slightly more contrived example with the same problem at -O0! I don't see that's so terrible, the jmp will be free in practice anyway so I don't think you will find this slows things down. I think we cannot use -ftrapv alone for Ada because it is too low-level. Its general mechanism certainly can help (once it is fixed) but it must be driven by something more Ada-aware. Again, please compare it with the simply awful mechanism used by the front end right now.
Re: Ada front-end depends on signed overflow
Right, the code generated by the front end is not pretty either, but at least the front-end knows what it is doing Ada-wise. -ftrapv is so dumb at the moment that it emits checks for virtually anything. No, it just emits it for signed operations. Of course, it is not so dumb as to blatantly violate its specification. If we use -ftrapv, then we have to make sure that operations labeled by the front end as not requiring an overflow check are transformed into unsigned operations by gigi. I think it would be dangerous to take that path. I don't see that's so terrible, the jmp will be free in practice anyway so I don't think you will find this slows things down. You missed the point; the overflow check has been optimized away by one of the RTL optimization passes. -- Eric Botcazou
Re: Ada front-end depends on signed overflow
Eric Botcazou wrote: Of course, it is not so dumb as to blatantly violate its specification. If we use -ftrapv, then we have to make sure that operations labeled by the front end as not requiring an overflow check are transformed into unsigned operations by gigi. I think it would be dangerous to take that path. It is unhelpful to make such statements without justification. Once again, have you actually examined how awful the code we generate now is? I don't see that's so terrible, the jmp will be free in practice anyway so I don't think you will find this slows things down. You missed the point; the overflow check has been optimized away by one of the RTL optimization passes. Well of course that's just a plain bug, should be addressed as such. Obviously no one is using -ftrapv, so it will expose lots of bugs I would guess. Clearly the status quo is entirely unacceptable, so what's your recommendation of how we generate at least vaguely respectable code for overflow checking?
Re: Ada front-end depends on signed overflow
* Andrew Pinski: The Ada front-end is still being miscompiled by VRP but VRP is doing the correct thing as the type is signed and overflow on signed is undefined (-fwrapv is not turned on by default for Ada). It probably makes sense to turn on -fwrapv for Ada because even without -gnato, the behavior is not really undefined: | The reason that we distinguish overflow checking from other kinds of | range constraint checking is that a failure of an overflow check can | generate an incorrect value, but cannot cause erroneous behavior. http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gnat_ugn_unw/Run_002dTime-Checks.html (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.)
Re: Ada front-end depends on signed overflow
* Paul Schlie: (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.) - Since for all practical purposes most (if not all) targets use 2's complement integer representations which naturally wrap, might it be simply best to presume that all do wrap by default, but allow -fno-wrapv to disable it if ever required by the odd target/language? Enabling -fwrapv disables quite a few optimizations on signed integer types in C code. OTOH, you should compile most real-world C code with -fwrapv anyway. See my security advisory on incorrect overflow checking in C; this is a rather widespread issue, even in new code.
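The kind of code such an advisory concerns looks roughly like the following sketch (the function and parameter names are made up here); the check itself depends on wrapping, so a compiler that assumes signed overflow cannot happen is entitled to delete it:

    int checked_add(int len, int extra)   /* assumes extra >= 0 */
    {
        if (len + extra < len)   /* intended overflow check -- relies on wraparound,
                                    which the C standard does not guarantee */
            return -1;           /* reject the request */
        return len + extra;
    }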
Re: Ada front-end depends on signed overflow
* Paul Schlie: (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.) - Since for all practical purposes most (if not all) targets use 2's complement integer representations which naturally wrap, might it be simply best to presume that all do wrap by default, but allow -fno-wrapv to disable it if ever required by the odd target/language? Enabling -fwrapv disables quite a few optimizations on signed integer types in C code. OTOH, you should compile most real-world C code with -fwrapv anyway. See my security advisory on incorrect overflow checking in C; this is a rather widespread issue, even in new code. No they should be using -ftrapv instead which traps on overflow and then make sure they are not trapping when testing. Thanks, Andrew Pinski
Re: Ada front-end depends on signed overflow
* Paul Schlie: No they should be using -ftrapv instead which traps on overflow and then make sure they are not trapping when testing. - why? what language or whose code/target ever expects such a behavior? I think Andrew wants programmers to fix their code, instead of papering over problems. 8-) All code has seen wide testing essentially with -fwrapv enabled because in previous GCC versions, -fwrapv had only a limited effect, especially across multiple statements. That's why I don't prefer the -ftrapv approach, even though it's technically the correct one. It's a real pity that we have to trust so much C code which has been written and reviewed by developers who aren't aware that signed integer overflow is undefined.
Re: Ada front-end depends on signed overflow
On Fri, Jun 03, 2005 at 11:43:32AM -0400, Andrew Pinski wrote: Everyone who writes C/C++ should know that overflow of signed is undefined. In practice, however, this issue is commonly ignored, because people code in a hurry, then test the behavior of the executable code, and if on all platforms tested there are overflows, but the overflows wrap and as a result the expected thing happens, the problem will not be noticed. I'm sure there are plenty of production codes that assume signed integer overflow wraps, or at least make the weaker assumption that in a = b + c + d; where all variables are integers, if one of the intermediate terms can't be represented in an integer, we still get the correct result if the final result is representable in an integer.
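A made-up numeric instance of that weaker assumption, just to make it concrete:

    #include <limits.h>

    int demo(void)
    {
        int b = INT_MAX, c = 5, d = -10;
        /* Evaluated left to right, b + c is not representable; on a target
           that wraps, adding d brings the value back into range and the
           final result is INT_MAX - 5, the mathematically correct answer.
           As far as the C standard is concerned, behavior is already
           undefined at b + c. */
        return b + c + d;
    }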
Re: Ada front-end depends on signed overflow
Everyone who writes C/C++ should know that overflow of signed is undefined. Now in Java it is defined, which is the reason why -fwrapv exists in the first place since GCC has a Java compiler. Right. I also believe that it is conceptually wrong to enable this for Ada, so until this issue is analyzed by an Ada-knowledgeable person (Richard Kenner is looking at this issue), no such change should be made for Ada, since this would likely be an incorrect fix, masking the real underlying issue. Arno
Re: Ada front-end depends on signed overflow
From: Joe Buck [EMAIL PROTECTED] On Fri, Jun 03, 2005 at 11:43:32AM -0400, Andrew Pinski wrote: Everyone who writes C/C++ should know that overflow of signed is undefined. In practice, however, this issue is commonly ignored, because people code in a hurry, then test the behavior of the executable code, and if on all platforms tested there are overflows, but the overflows wrap and as a result the expected thing happens, the problem will not be noticed. I'm sure there are plenty of production codes that assume signed integer overflow wraps, or at least make the weaker assumption that in a = b + c + d; where all variables are integers, if one of the intermediate terms can't be represented in an integer, we still get the correct result if the final result is representable in an integer. Agreed: - which is why it should be assumed to wrap, especially given that most/all current targets, and likely most/all future targets, do wrap (not trap) signed overflows. (so would expect the compiler to ideally reflect reality when analyzing code for optimization, not assume that undefined means ignore reality)?
Re: Ada front-end depends on signed overflow
On Fri, 2005-06-03 at 13:03 -0400, Paul Schlie wrote: From: Joe Buck [EMAIL PROTECTED] On Fri, Jun 03, 2005 at 11:43:32AM -0400, Andrew Pinski wrote: Everyone who writes C/C++ should know that overflow of signed is undefined. In practice, however, this issue is commonly ignored, because people code in a hurry, then test the behavior of the executable code, and if on all platforms tested there are overflows, but the overflows wrap and as a result the expected thing happens, the problem will not be noticed. I'm sure there are plenty of production codes that assume signed integer overflow wraps, or at least make the weaker assumption that in a = b + c + d; where all variables are integers, if one of the intermediate terms can't be represented in an integer, we still get the correct result if the final result is representable in an integer. Agreed: - which is why it should be assumed to wrap, especially given that most/all current targets, and likely most/all future targets, do wrap (not trap) signed overflows. (so would expect the compiler to ideally reflect reality when analyzing code for optimization, not assume that undefined means ignore reality)? In every case where practical, we build compilers to do what standards say we should do, not what people do to abuse those standards. There are places where the standard is murky, and we make hard decisions based on what people do in the field. This is not one of them.
Re: Ada front-end depends on signed overflow
On Jun 3, 2005, at 09:02, Florian Weimer wrote: It probably makes sense to turn on -fwrapv for Ada because even without -gnato, the behavior is not really undefined: | The reason that we distinguish overflow checking from other kinds of | range constraint checking is that a failure of an overflow check can | generate an incorrect value, but cannot cause erroneous behavior. http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gnat_ugn_unw/Run_002dTime-Checks.html (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.) This is the strongest argument I have seen so far for defaulting to either -ftrapv or -fwrapv. Both the example Victor Skinner sent, and range checking in Ada, are cases of reasonable code where reasoning from undefined behavior breaks code in unexpected ways. Essentially a class of programs that checks for errors and would not otherwise be able to cause a SIGSEGV or similar with previous versions of GCC, would now be able to cause arbitrary mayhem. Integer expressions such as X + Y - Z will give the correct mathematical result, as long as that result is a representable integer. Without wrap-around semantics however, one would have to prove no intermediate result may overflow. Also, addition and subtraction are no longer associative. So, from a quality-of-implementation point of view, I believe we should always default to -fwrapv. For Ada, I propose we make the following changes: - by default, enable overflow checks using -ftrapv (yay, we should be able to get rid of -gnato finally!) - with checks suppressed, use -fwrapv.
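As an illustration of the associativity point (the values are invented for the example):

    #include <limits.h>

    /* X = INT_MAX, Y = 1, Z = 1.  Grouped as (X + Y) - Z the intermediate
       sum is not representable; grouped as X + (Y - Z) every intermediate
       value fits.  With -fwrapv both groupings yield INT_MAX; without it
       only the second grouping has defined behavior, so + and - are no
       longer associative in practice. */
    int xyz(void)
    {
        int x = INT_MAX, y = 1, z = 1;
        return x + (y - z);   /* the grouping that stays in range */
    }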
Re: Ada front-end depends on signed overflow
Geert Bosch wrote: For Ada, I propose we make the following changes: - by default, enable overflow checks using -ftrapv This won't work, we generate plenty of cases of operations where we definitely don't want a check, and we don't set the flag, but currently the flag is ignored by gcc/gigi, we would have to adjust gigi to switch to unsigned for cases where the overflow check flag was not set (yay, we should be able to get rid of -gnato finally!) - with checks suppressed, use -fwrapv. We really need to do this on an operation by operation basis, since in Ada checks can be suppressed on an operation by operation basis!
Re: Ada front-end depends on signed overflow
For Ada, I propose we make the following changes: - by default, enable overflow checks using -ftrapv -ftrapv is not practically usable because (1) it generates awful code and (2) it badly interacts with the RTL optimizers. -- Eric Botcazou