Re: Ada front-end depends on signed overflow
From: Georg Bauhaus [EMAIL PROTECTED] Paul Schlie wrote: - How is it necessary or desirable to define that the result is undefined vs. being target defined? What does C say about how a target performs an instruction? And why shouldn't GCC take advantage of this? - In essence I believe the difference is consistency. Although a target implementation-defined behavior seems no clearer, it implies that the semantics of an operation in question should directly result from its target implementation, and that any corresponding compile-time constant-propagation optimizations should be consistent with their otherwise run-time computed counterparts; otherwise they will be needlessly and arguably erroneously inconsistent, which an undefined behavior would seem to allow without providing any tangible benefit.
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: - yes, it certainly enables an implementation to generate more efficient code which has no required behavior; so in effect basically produce more efficient programs which don't reliably do anything in particular; which doesn't seem particularly useful? You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined is to ensure that efficient code can be generated for well defined constructs. - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? - Essentially yes; as FP is an approximate, not absolute, representation of a value, it seems reasonable to accept optimizations which may result in some least significant bits of ambiguity. Rubbish, this shows a real misunderstanding of floating-point. FP values are not approximations, they are well defined values in a system of arithmetic with precisely defined semantics, just as well defined as integer operations. Any compiler that followed your misguided ideas above would be a real menace and completely useless for any serious fpt work. - I'm sorry, I wasn't aware that C/C++ for example specified the bit-exact representation and semantic requirements for FP (possibly because it's not defined as being so?). What's silly is claiming that such operations are bit exact when even something as basic as their representational base radix number system isn't even defined by the standard, nor need it necessarily be the same between different FP types; thereby an arbitrary value is never guaranteed to be exactly representable as an FP value in all implementations (therefore a test for equivalence with an arbitrary value is equally ambiguous, as would be any operations on that value, unless it is known that within a particular implementation its value and any resulting intermediate operation values are correspondingly precisely representable, which is both implementation and target specific, although hopefully constrained to be as closely approximated as possible within its representational constraints). As it is, the actual situation is that most serious fpt programmers find themselves in the same position you are with integer arithmetic. They often don't like the freedom given by the language to e.g. allow extra precision (although they tend to be efficiency hungry, so one doesn't know in practice that this is what they really want, since they want it without paying for it, and they don't know how much they would have to pay :-) - Agreed; because FP is inherently an imprecise representation, and bit-exact equivalence between arbitrary real numbers and their representational form is not warranted, it should never be relied upon; therefore it seems reasonable to enable optimizations which may alter the computed results as long as they are reasonably known to constrain the result's divergence to some few least significant bits of precision (as no arbitrary value is warranted to be representable, with the possible exception of some implementation/target specific whole-number integer values, but whose overflow semantics are also correspondingly undefined). Whereas integer operations are relied upon for state representations, which in general must remain precisely and deterministically calculated, as otherwise catastrophic semantic divergences may result.
Nonsense, losing the last bit in an FP value can be fatal to many algorithms. Indeed, some languages allow what seems to FP programmers to be too much freedom, but not for a moment can a compiler writer contemplate doing an optimization which is not allowed. For instance, in general replacing (a+b)+c by a+(b+c) is an absolute no-no in most languages. - only if it's naively relied upon to be precise to some arbitrary precision, which as above is not warranted in general, so an algorithm's implementation should not assume it to be in general, as, given your example, neither operation is warranted to compute to an equivalent value in any two arbitrary implementations (although hopefully consistent within their respective implementations). - No, exactly the opposite, the definition of an order of evaluation eliminates ambiguities, it does not prohibit anything other than the compiler applying optimizations which would otherwise alter the meaning of the specified expression. No, the optimizations do not alter the meaning of any C expression. If the meaning is undefined, then - yes, I understand C/C++ etc. has chosen to define overflow and evaluation order (among a few other things) as being undefined. a) the programmer should not have written this rubbish. - or the language need not have enabled a potentially well defined expression to be turned into rubbish by enabling an implementation to do things like arbitrarily evaluate interdependent sub-expressions in arbitrary orders, or not require an implementation to at least optimize expressions consistently with their target's native semantics.
RE: Ada front-end depends on signed overflow
Original Message From: Paul Schlie Sent: 08 June 2005 14:40 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? cheers, DaveK -- Can't think of a witty .sigline today
Re: Ada front-end depends on signed overflow
From: Dave Korn [EMAIL PROTECTED] Original Message From: Paul Schlie Sent: 08 June 2005 14:40 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? - which, if algorithmically computed, relies on deterministic semantics, as otherwise the pseudo-random distribution the algorithm is relying on may break.
RE: Ada front-end depends on signed overflow
Original Message From: Paul Schlie Sent: 08 June 2005 14:49 From: Dave Korn [EMAIL PROTECTED] Original Message From: Paul Schlie Sent: 08 June 2005 14:40 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? - which if algorithmically computed relies on deterministic semantics. as otherwise the pseudo random distribution the algorithm is relying on may break. I didn't say Pseudo random number generation. I said Random number generation. cheers, DaveK -- Can't think of a witty .sigline today
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: From: Dave Korn [EMAIL PROTECTED] Original Message From: Paul Schlie Sent: 08 June 2005 - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Random number generation? randomness has nothing whatever to do with non-determinism. They are completely different concepts. But there are of course many examples. There are many examples in concurrent programming where non-determinism is useful, and in set programming, arbitrary non-deterministic selection from a set is fundamental. But this is a complete red herring in this discussion. The reason that, for example, in Ada we say that a+b means non-deterministically either compute a then b, or b then a, is not that it is useful for these results to be different, but precisely because we expect NO reasonable program to ever write an expression a+b in which the two semantic meanings that are possible are different, and we want the compiler to take advantage of this to generate better code. For example, in a + f(b) we typically expect f(b) to be called first, even though formally it might be the case that f(b) modifies a, so this choice could have some effect from the formal non-determinism of the semantics. Our actual attitude is that if anyone writes code like this in which f(b) modifies a, they are highly incompetent and we don't care what happens.
Re: Ada front-end depends on signed overflow
Dave Korn wrote: I didn't say Pseudo random number generation. I said Random number generation. which once again has nothing whatever to do with non-determinism. To illustrate this, suppose I have a language which has sets of integers. I have an operator ARB whose semantics is to select an element of such a set non-deterministically. It is a fine implementation to always return the smallest number in the set (of course no legitimate program can rely on this artifact of the implementation). It would not be at all fine to use this implementation for the RAND operator that selects a random element from the set. Many people mix these concepts up, and it always causes trouble. I remember once a very senior member of the CS community (I will keep the name to myself), during a discussion of Ada semantics, being dismayed at the overhead required for non-deterministic selection of an open SELECT alternative, since he assumed it meant that a random number generator would have to be used :-)
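For illustration, a minimal sketch of the ARB/RAND distinction in C (the helper names and the array representation of a set are hypothetical, added only for this example):

    #include <stdlib.h>

    /* Non-deterministic choice: any element is a legal answer, so an
       implementation may legitimately just return the first (smallest) one. */
    int arb_element(const int *set, size_t n)
    {
        return set[0];
    }

    /* Random choice: the distribution itself is the point, so the trick
       above would be a wrong implementation here. */
    int rand_element(const int *set, size_t n)
    {
        return set[rand() % n];
    }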
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: What's silly is claiming that such operations are bit exact when even something as basic as their representational base radix number system isn't even defined by the standard, nor need it necessarily be the same between different FP types; thereby an arbitrary value is never guaranteed to be exactly representable as an FP value in all implementations (therefore a test for equivalence with an arbitrary value is equally ambiguous, as would be any operations on that value, unless it is known that within a particular implementation its value and any resulting intermediate operation values are correspondingly precisely representable, which is both implementation and target specific, although hopefully constrained to be as closely approximated as possible within its representational constraints). You are really just digging yourself into a hole here. It is clear that you know very little about floating-point arithmetic. If you are interested in learning, there are quite a lot of good references. I would suggest Michael Overton's new book as a good starting point. - Agreed; because FP is inherently an imprecise representation, and bit-exact equivalence between arbitrary real numbers and their representational form is not warranted, it should never be relied upon; therefore it seems reasonable to enable optimizations which may alter the computed results as long as they are reasonably known to constrain the result's divergence to some few least significant bits of precision (as no arbitrary value is warranted to be representable, with the possible exception of some implementation/target specific whole-number integer values, but whose overflow semantics are also correspondingly undefined). There is nothing imprecise about IEEE floating-point operations - only if it's naively relied upon to be precise to some arbitrary precision, which as above is not warranted in general, so an algorithm's implementation should not assume it to be in general, as, given your example, neither operation is warranted to compute to an equivalent value in any two arbitrary implementations (although hopefully consistent within their respective implementations). More complete nonsense. Of course we do not rely on fpt operations being precise to arbitrary precision, we just expect well defined IEEE results which are defined to the last bit, and all modern hardware provides this capability. - yes, I understand C/C++ etc. has chosen to define overflow and evaluation order (among a few other things) as being undefined. a) the programmer should not have written this rubbish. - or the language need not have enabled a potentially well defined expression to be turned into rubbish by enabling an implementation to do things like arbitrarily evaluate interdependent sub-expressions in arbitrary orders, or not require an implementation to at least optimize expressions consistently with their target's native semantics. Well it is clear that language designers generally disagree with you. Are you saying they are all idiots and you know better, or are you willing to try to learn why they disagree with you? - agreed, an operation defined as being undefined enables an implementation to produce an arbitrary result (which therefore is reliably useless). Distressing nonsense, sounds like you have learned nothing from this thread. Well hopefully others have, but anyway, last contribution from me, I think everything has been said that is useful.
Re: Ada front-end depends on signed overflow
- Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? Hm. int foo (const char *x, int y) { return printf (x, y); } Lassi -- If you would know the value of money, go try to borrow some. --Ben Franklin
Re: Ada front-end depends on signed overflow
Robert Dewar [EMAIL PROTECTED] writes: [...] | You are really just digging yourself into a hole here. It is clear | that you know very little about floating-point arithmetic. [...] | More complete nonsense. [...] | Are you saying they are all idiots and you know better, or are you | willing to try to learn why they disagree with you? [...] | Distressing nonsense, sounds like you have learned nothing from this | thread. Well hopefully others have, but anyway, last contribution from | me, I think everything has been said that is useful. Ahem. -- Gaby
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Date: Wed, 08 Jun 2005 10:16:23 -0400 To: Paul Schlie [EMAIL PROTECTED] Cc: Florian Weimer [EMAIL PROTECTED], Andrew Pinski [EMAIL PROTECTED], GCC List gcc@gcc.gnu.org, [EMAIL PROTECTED] Subject: Re: Ada front-end depends on signed overflow You are really just digging yourself into a hole here. It is clear that you know very little about floating-point arithmetic. If you are interested in learning, there are quite a lot of good references. I would suggest Michael Overton's new book as a good starting point. - Agreed; because FP is inherently an imprecise representation, and bit-exact equivalence between arbitrary real numbers and their representational form is not warranted, it should never be relied upon; therefore it seems reasonable to enable optimizations which may alter the computed results as long as they are reasonably known to constrain the result's divergence to some few least significant bits of precision (as no arbitrary value is warranted to be representable, with the possible exception of some implementation/target specific whole-number integer values, but whose overflow semantics are also correspondingly undefined). There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. - only if it's naively relied upon to be precise to some arbitrary precision, which as above is not warranted in general, so an algorithm's implementation should not assume it to be in general, as, given your example, neither operation is warranted to compute to an equivalent value in any two arbitrary implementations (although hopefully consistent within their respective implementations). More complete nonsense. Of course we do not rely on fpt operations being precise to arbitrary precision, we just expect well defined IEEE results which are defined to the last bit, and all modern hardware provides this capability. - as above (actually most, if inclusive of all processors in production, don't directly implement fully compliant IEEE FP math, although many closely approximate it, or simply provide no FP support at all; and as an aside: far more processors implement wrapping signed overflow semantics than provide IEEE fp support, as most do not differentiate between basic signed and unsigned 2's complement integer operations, so if expectations are based on the likelihood of an arbitrary production processor supporting one vs. the other, one would expect wrapped overflow with a high likelihood, and fully compliant IEEE support with a lower likelihood). a) the programmer should not have written this rubbish. - or the language need not have enabled a potentially well defined expression to be turned into rubbish by enabling an implementation to do things like arbitrarily evaluate interdependent sub-expressions in arbitrary orders, or not require an implementation to at least optimize expressions consistently with their target's native semantics. Well it is clear that language designers generally disagree with you. Are you saying they are all idiots and you know better, or are you willing to try to learn why they disagree with you?
- I'm saying/implying nothing of that sort; as I happen to believe that the reason things are the way they are, for the most part, is that although most knew better, the committees needed to politically accommodate varied implementation practices and assumptions, as otherwise they would end up forcing some companies to invest a great deal of time and money to re-implement their existing compilers, processor implementations, or application programs to accommodate a stricter set of specifications (which most commercial organizations would lobby strongly against); this is one of the things that Java had the luxury of being able to somewhat side-step. - agreed, an operation defined as being undefined enables an implementation to produce an arbitrary result (which therefore is reliably useless). Distressing nonsense, sounds like you have learned nothing from this thread. Well hopefully others have, but anyway, last contribution from me, I think everything has been said that is useful. - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result.
RE: Ada front-end depends on signed overflow
Original Message From: Paul Schlie Sent: 08 June 2005 15:53 From: Robert Dewar There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. I refer you to Annex F (normative) IEC 60559 floating-point arithmetic of the C language spec. Normative implies a mandate, does it not? F.1 Introduction 1 This annex specifies C language support for the IEC 60559 floating-point standard. The IEC 60559 floating-point standard is specifically Binary floating-point arithmetic for microprocessor systems, second edition (IEC 60559:1989), previously designated IEC 559:1989 and as IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE 754-1985). IEEE Standard for Radix-Independent Floating-Point Arithmetic (ANSI/IEEE 854-1987) generalizes the binary standard to remove dependencies on radix and word length. IEC 60559 generally refers to the floating-point standard, as in IEC 60559 operation, IEC 60559 format, etc. An implementation that defines __STDC_IEC_559__ conforms to the specifications in this annex. Where a binding between the C language and IEC 60559 is indicated, the IEC 60559-specified behavior is adopted by reference, unless stated otherwise. - as above (actually most, if inclusive of all processors in production, don't directly implement fully compliant IEEE FP math, although many closely approximate it, or simply provide no FP support at all; Pretty much every single ix86 and rs6000, and many m68 arch CPUs provide last-bit-exact IEEE implementations in hardware these days. Your statement is simply factually incorrect. cheers, DaveK -- Can't think of a witty .sigline today
Re: Ada front-end depends on signed overflow
Paul Schlie wrote on 08/06/2005 17:53:04: - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result. Well this is a simple hackery quiz, which is irrelevant to GCC.

1: int a, b;
2: int f() { return b++; }
3: int main(int argc)
4: {
5:   b = argc;
6:   a = b + f(); /* a==b*2 or a==b*2+1 */
7:   a = a/2;     /* a==b */
8:   return a;
9: }

If one would claim that a is totally unconstrained at line 6, then this example will be invalid. In that case, I can give a more restricted example, where 'a' is computed speculatively and is discarded in exactly the same cases when it is undefined. Oh, well here it is.

1: int a, b, c;
2: int f() { return a ? b++ : b; }
3: int main()
4: {
5:   scanf("%d %d", &a, &b);
6:   c = b + f(); /* c is undefined if a != 0 */
7:   if (a) c = b;
8:   return c;
9: }

This example is predictable. I argue that it may be also useful performance-wise to do speculative computations.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined, is to ensure that efficient code can be generated for well defined constructs. - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? The simplest example is a shift operation, x << n, where the compiler may assume that the shift count is smaller than the width of the type. All sane machines agree on shift behaviour for 0 <= n < width, but there are differences between machines for n >= width. Since this case is undefined by the language, it is ensured that shifts take only a single instruction on essentially all machines. Bernd
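For illustration (an added sketch, with the per-machine behaviors stated as typical rather than guaranteed): the function

    unsigned int shl(unsigned int x, int n)
    {
        return x << n;   /* the compiler may assume 0 <= n < width of the type */
    }

can be compiled to a single shift instruction everywhere, even though the hardware disagrees about out-of-range counts -- 32-bit x86 shifts typically mask the count to its low 5 bits (so a count of 32 behaves like a count of 0), while PowerPC's slw yields 0 for counts of 32..63. If the language defined the n >= width case, one of these machines would need extra masking or compare-and-branch code around every variable-count shift.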
Re: Ada front-end depends on signed overflow
On Wed, Jun 08, 2005 at 10:53:04AM -0400, Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. In real life, there are no longer any significant non-embedded architectures out there that don't use IEEE floating point, so it is a widely used practice to assume it and document the requirement. The resulting programs might not work on a Vax, Cray, or IBM 370. C'est la vie.
Re: Ada front-end depends on signed overflow
From: Joe Buck [EMAIL PROTECTED] On Wed, Jun 08, 2005 at 10:53:04AM -0400, Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] There is nothing imprecise about IEEE floating-point operations - agreed, however nor is it mandated by most language specifications, so seemingly irrelevant. In real life, there are no longer any significant non-embedded architectures out there that don't use IEEE floating point, so it is a widely used practice to assume it and document the requirement. The resulting programs might not work on a Vax, Cray, or IBM 370. C'est la vie. - With the not so minor exception that IEEE strict semantics are basically counterproductive for most real-time fp signal processing tasks, as simple saturation and consistent reciprocal semantics tend to be preferred, since they lose precision gracefully, not catastrophically as sticky inf and nan semantics do at the limits of the representational bounds; which is why fp signal processor architectures tend not to implement IEEE semantics, and arguably rely on its imprecision in lieu of failure at its bounds. (Also most embedded processors do not have any FP support, and often benefit from looser soft implementations of IEEE, as strict bit-exact behavior is typically of less significance when all that may occasionally be required is just a bit more dynamic range than a fixed-point representation may reasonably provide, and they will often gladly trade a few bits of precision for less bulky implementations and never need to fool with nan or inf semantics.)
Re: Ada front-end depends on signed overflow
From: Bernd Schmidt [EMAIL PROTECTED] Paul Schlie wrote: From: Robert Dewar [EMAIL PROTECTED] You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined, is to ensure that efficient code can be generated for well defined constructs. - Can you give an example of an operation which may yield an undefined non-deterministic result which is reliably useful for anything? The simplest example is a shift operation, x << n, where the compiler may assume that the shift count is smaller than the width of the type. All sane machines agree on shift behaviour for 0 <= n < width, but there are differences between machines for n >= width. Since this case is undefined by the language, it is ensured that shifts take only a single instruction on essentially all machines. - How is it necessary or desirable to define that the result is undefined vs. being target defined? As undefined seems to be presumed to give an implementation the license to presume it will yield any specific value it chooses (such as 0, for example, and assume that value in subsequent optimizations), where the target implementation may actually yield something reliably different. For example, given that an implementation knows how a runtime shift value will be treated, it seems much more reasonable to treat compile-time constant shift values consistently, as otherwise constant values will have different semantic effects than non-constant values may. (Which doesn't seem like a good thing, or strictly necessary to enable an arbitrary implementation to efficiently implement consistent bit-shift semantics, although its behavior beyond implementation-independent bounds may differ between differing implementations.)
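A sketch of the inconsistency being described (the concrete numbers assume a 32-bit int target whose shift instruction masks the count, e.g. typical x86; since the behavior is undefined, actual results may be anything):

    int shift_var(int x, int n) { return x << n; }   /* count known only at run time */
    int shift_const(int x)      { return x << 40; }  /* count known at compile time  */

On such a target, shift_var(1, 40) executed with the native shift instruction typically yields 1 << (40 & 31) == 256, whereas the compiler folding shift_const(1) at compile time may well produce 0 (or anything else); the two otherwise identical expressions need not agree.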
Re: Ada front-end depends on signed overflow
From: Michael Veksler [EMAIL PROTECTED] Paul Schlie wrote on 08/06/2005 17:53:04: - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result. Well this is a simple hackery quiz, which is irrelevant to GCC. 1: int a, b; 2: int f() { return b++; } 3: int main(int argc) 4: { 5: b = argc; 6: a = b + f(); /* a==b*2 or a==b*2+1 */ 7: a = a/2; /* a==b */ 8: return a; 9: } If one would claim that a is totally unconstrained at line 6, then this example will be invalid. In that case, I can give a more restricted example, where 'a' is computed speculatively and is discarded in exactly the same cases when it is undefined. - unless I misunderstand, it actually is undefined if the value of argc may be as large as the largest representable positive int value, if both signed int overflows and evaluation order are undefined/unspecified. (Although I confess to missing your point, as it's not clear how an undefined behavior is productively contributing to the observation that a == argc if argc++ doesn't overflow or is evaluated l-r; as opposed to observing that if the evaluation order were fixed l-r and not undefined, then the optimization may be derived?) Oh, well here it is. 1: int a, b, c; 2: int f() { return a ? b++ : b; } 3: int main() 4: { 5: scanf("%d %d", &a, &b); 6: c = b + f(); /* c is undefined if a != 0 */ 7: if (a) c = b; 8: return c; 9: } This example is predictable. I argue that it may be also useful performance-wise to do speculative computations. - I believe it's subject to the same problem?
Re: Ada front-end depends on signed overflow
Paul Schlie [EMAIL PROTECTED] wrote on 08/06/2005 21:16:46: From: Michael Veksler [EMAIL PROTECTED] Paul Schlie wrote on 08/06/2005 17:53:04: - I would have if someone could provide a concrete example of an undefined behavior which produces a reliably useful/predictable result. Well this is a simple hackery quiz, which is irrelevant to GCC. 1: int a, b; 2: int f() { return b++; } 3: int main(int argc) 4: { 5: b = argc; 6: a = b + f(); /* a==b*2 or a==b*2+1 */ 7: a = a/2; /* a==b */ 8: return a; 9: } If one would claim that a is totally unconstrained at line 6, then this example will be invalid. In that case, I can give a more restricted example, where 'a' is computed speculatively and is discarded in exactly the same cases when it is undefined. - unless I misunderstand, it actually is undefined if the value of argc may be as large as the largest representable positive int value, if both signed int overflows and evaluation order are undefined/unspecified. It is undefined due to evaluation order. Is it:

tmp = b;         // b is fetched first,
a = tmp + b++;   // then f() is called and added

or is it:

tmp = b++;       // f() is called first,
a = b + tmp;     // then b is fetched and added

Please note that this has nothing to do with overflow. I argue that although 'a' is undefined due to evaluation order, it may still be used as long as its undefined-ness is constrained (which the std may or may not guarantee). Anyway, as I said, it is more of a quiz than a practical question and example. You may come up with a bizarre example where it makes some sense. (Although I confess to missing your point, as it's not clear how an undefined behavior is productively contributing to the observation that a == argc if argc++ doesn't overflow or is evaluated l-r; as opposed to observing that if the evaluation order were fixed l-r and not undefined, then the optimization may be derived?) Now it is me who missed the above point. What is l-r? Oh, well here it is. 1: int a, b, c; 2: int f() { return a ? b++ : b; } 3: int main() 4: { 5: scanf("%d %d", &a, &b); 6: c = b + f(); /* c is undefined if a != 0 */ 7: if (a) c = b; 8: return c; 9: } This example is predictable. I argue that it may be also useful performance-wise to do speculative computations. - I believe it's subject to the same problem? No, here also the result is undefined due to evaluation order. My point in this example is that you may operate on 'c', which is defined for some inputs and undefined for others, and still get observably consistent results. If you know that the result may be undefined only when you do not intend to use it, then you don't care about it (if guaranteed not to get SIGBUS). As others have noted, by constraining the compiler to platform-defined behavior you lose performance. For example:

double a, b;
int f();
int main()
{
    int i;
    for (i = 0; i < 100; ++i)
        b += (double)f() + sin(a);
    return foo(b);
}

Let's assume that on a given platform the FPU can calculate sin(a) in parallel to f(). The compiler cannot prove that f() does not modify a, because f() is not given here or because it is intractable. If calls to f() and sin(a) take 10 ns each, then calculating them in parallel will save 50% of the total run-time. To do that, the compiler will have to change the order of function calls. You may argue that even a x100 speed up is not worth an undefined behavior. I will argue that in this case most programmers will avoid modifying 'a' inside f() anyway (due to maintainability) and will avoid the undefined result.
I don't think that normative code should be penalized for the possibility of bizarre code. This bizarre code should be fixed anyway, to avoid maintenance nightmares.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - How is it necessary or desirable to define that the result is undefined vs. being target defined? What does C say about how a target performs an instruction? And why shouldn't GCC take advantage of this?
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. But this predicate cannot be evaluated at compile time! Now you seem to suggest that the optimizer should simply avoid optimizing in such cases (where it matters). - No, I simply assert that if an expression is unambiguous (assuming my definition above for the sake of discussion), then the compiler may choose to order the evaluation in any way it desires as long as it does not introduce such an ambiguity by doing so. But this predicate cannot be evaluated at compile time! - I fully agree that if a compiler does not maintain records of the program state which a function may alter or be dependent on, as would be required to determine whether any resulting operand/side-effect interdependences may exist upon its subsequent use as an operand within an expression, then the compiler would have no choice but to maintain its relative order of evaluation as hypothetically specified, as it may otherwise introduce an ambiguity. Fine, but then you are designing a different language from C. It is fine to argue this language point in the context of language design, but this is irrelevant to the discussion of the implementation of C, since the language is already defined, and the design decision is contrary to what you want. Any C programmer who programs with the conceptions you suggest is simply not a competent C programmer. Note also the critical word in your above paragraph: may. That's the whole point, the compiler can't tell, and if it has to make worst case assumptions, the impact on code efficiency is significant. So it is no problem for the compiler to maintain records ..., but it is not good enough (please reread my examples in the previous message!) Although I believe I appreciate the relative complexity this introduces to both the compiler, as well as to the requirements imposed on pre-compiled libraries, etc., I don't believe that it justifies a language definition legitimizing the specification of otherwise non-deterministic programs. Fine, as I say, an argument for some other forum. - As you've specified the operations as distinct statements, I would argue that such an optimization would only be legitimate if the result were known to produce the same result as if the statements were evaluated in sequence as specified by the standard (which of course would be target specific). You can argue this all you like, but it is just a plain wrong argument for C, which is a language defined by the ANSI/ISO standard, not by Paul Schlie. d = (a + b) / 8; would be ambiguous if the compiler were able to restructure evaluation of the expression in any way which may alter its effective result for a given target. As a program which has non-deterministic behavior doesn't seem very useful... That's a judgment with which I, many others, and the designers of C disagree, so too bad for you! Now it is legitimate to argue about how much quality is hurt, and whether the resulting non-determinism is worth the efficiency hit. - Or rather, is non-determinism ever legitimately acceptable? (as I can't honestly think of a single instance where it would be, except if it may only result in the loss of a measurable few bits of fp precision, which are imprecise representations to begin with, but not otherwise?)
If you can't appreciate the argument on the other side, you can't very effectively argue your own position. Most language designers are quite ready to accept non-determinate behavior in peculiar cases to ensure that the common cases can be compiled efficiently. The basic conclusion in the design of such languages (which include C, C++, Ada, Fortran, Pascal, PL/1, etc.) is that no reasonable programmer writes the kind of side effects that cause trouble, so why cripple efficiency for all reasonable programmers. Really the only language in common use that follows your line of thinking is Java, and that is a case where a very conscious decision was made to sacrifice efficiency for reproducibility and portability of peculiar cases. But overall I do agree with your earlier statement, that each language has made a choice, for better or worse. Yes, of course, and the job of the compiler writer is to implement the language with the choice it made, not second guess it. Interesting postscript. Even APL as originally designed had undefined order of operations. However, early implementations were naive interpreters which not only associated right to left (as required by the language) but also executed in this order (which was not required). Programmers got used to expecting this order and relying on it (partly because it worked and most of them did not even know it was wrong to rely on it, and partly because people got confused
Re: Ada front-end depends on signed overflow
* Robert Dewar: Definitely not, integer overflow generates async traps on very few architectures. Certainly synchronous traps can be generated (e.g. with into on the x86). Or the JO jump instruction. Modern CPUs choke on the INTO instruction.
Re: Ada front-end depends on signed overflow
Florian Weimer wrote: * Robert Dewar: Definitely not, integer overflow generates async traps on very few architectures. Certainly synchronous traps can be generated (e.g. with into on the x86). Or the JO jump instruction. Modern CPUs choke on the INTO instruction. Well INTO is still the best choice if you worry about code size. Furthermore, it would be interesting to know what the actual pragmatic effect of choke is, by comparing:
1. no checks
2. current software checks
3. into style checks
4. checks using jo
on real applications (not an easy test to do unfortunately)
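To make the comparison concrete, here is a sketch of what a "current software checks" style test can look like at the C level (a minimal example, not the actual Ada run-time code; raise_overflow_error is a hypothetical stand-in for raising Constraint_Error). The into/jo styles instead perform the addition and test the CPU's overflow flag directly:

    #include <limits.h>

    extern void raise_overflow_error(void);

    /* Checked signed addition, software-check style: test the operands
       before adding, so the addition itself can never overflow. */
    int add_checked(int a, int b)
    {
        if (b > 0 ? a > INT_MAX - b : a < INT_MIN - b)
            raise_overflow_error();
        return a + b;
    }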
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: Fine, but then you are designing a different language from C. - I'm not attempting to design a language, but just defend the statement that I made earlier; which was in effect that I contest the assertion that undefined evaluation semantics enable compilers to generate more efficient useful code by enabling them to arbitrarily destructively alter the evaluation order of interdependent sub-expressions, and/or base the optimizations on behaviors which are not representative of their target machines. Because I simply observe that since an undefined behavior may also be non-deterministic even within a single program, it can't be relied upon; therefore enabling a compiler to produce garbage more efficiently seems basically worthless, and actually even dangerous when the compiler can't even warn about resulting potentially non-deterministic ambiguities, because it can't differentiate between garbage and reliably deterministic useful code, as it considers them equivalently legitimate. (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. But this predicate cannot be evaluated at compile time! - Why not? The compiler should be able to statically determine if an expression's operands are interdependent, by determining if any of its operands' sub-expressions are themselves dependent on a variable value potentially modifiable by any of the other operands' sub-expressions. (Which is basically the same constraint imposed when rescheduling instructions, as an assignment can not be moved past a reference to the same variable value without potentially corrupting the effective semantics of the specified program, but one may freely re-schedule assignments and references to distinct variable values past each other safely, without any restrictions.)
Re: Ada front-end depends on signed overflow
* Paul Schlie: - I'm not attempting to design a language, but just defend the statement that I made earlier; which was in effect that I contest the assertion that undefined evaluation semantics enable compilers to generate more efficient useful code by enabling them to arbitrarily destructively alter the evaluation order of interdependent sub-expressions, and/or base the optimizations on behaviors which are not representative of their target machines. But the assertion is trivially true. If you impose fewer constraints on an implementation by leaving some cases undefined, it always has got more choices when generating code, and some choices might yield better code. So code generation never gets worse. Whether an implementation should exercise the liberties granted by the standard in a particular case is a different question, and has to be decided on a case-by-case basis. (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) FP has well-defined semantics, and it's absolutely required for compilers to implement them correctly because otherwise a lot of real-world code will break. Actually, this is a very interesting example. You don't care about proper floating point arithmetic and are willing to sacrifice obvious behavior for a speed or code size gain. Others feel the same about signed integer arithmetic. - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. But this predicate cannot be evaluated at compile time! - Why not? In general, it's undecidable. The compiler should be able to statically determine if an expression's operands are interdependent, by determining if any of its operands' sub-expressions are themselves dependent on a variable value potentially modifiable by any of the other operands' sub-expressions. Phrased this way, you make a lot of code illegal. I doubt this is feasible.
Re: Ada front-end depends on signed overflow
* Paul Schlie: But the assertion is trivially true. If you impose fewer constraints on an implementation by leaving some cases undefined, it always has got more choices when generating code, and some choices might yield better code. So code generation never gets worse. - yes, it certainly enables an implementation to generate more efficient code which has no required behavior; so in effect basically produce more efficient programs which don't reliably do anything in particular; which doesn't seem particularly useful? The quality of an implementation can't be judged only based on its conformance to the standard, but this does not mean that the implementation gets better if you introduce additional constraints which the standard doesn't impose. Some people want faster code, others want better debugging information. A few people only want optimizations which do not change anything which is practically observable but execution time (which is a contradiction), and so on. - Essentially yes; as FP is an approximate not absolute representation of a value, therefore seems reasonable to accept optimizations which may result in some least significant bits of ambiguity. But the same is true for C's integers, they do not behave like the real thing. Actually, without this discrepancy, we wouldn't have to worry about overflow semantics, which once was the topic of this thread! The compiler should be able to statically determine if an expression's operands are interdependent, by determining if any of its operands' sub-expressions are themselves dependent on a variable value potentially modifiable by any of the other operands' sub-expressions. Phrased this way, you make a lot of code illegal. I doubt this is feasible. - No, exactly the opposite, the definition of an order of evaluation eliminates ambiguities, it does not prohibit anything other than the compiler applying optimizations which would otherwise alter the meaning of the specified expression. Ah, so you want to prescribe the evaluation order and allow reordering under the as-if rule. This wasn't clear to me, sorry. It shouldn't be too hard to implement this (especially if your order matches the Java order), so you could create a switch to fit your needs. I don't think it should be enabled by default because it encourages developers to write non-portable code which breaks when compiled with older GCC versions, and it inevitably introduces a performance regression on some targets.
Re: Ada front-end depends on signed overflow
* Paul Schlie: (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) Floating-point arithmetic is not simply some inaccurate representation of real arithmetic. It can be used this way by the programmer, but in fact fpt operations have very well defined semantics, and compiler writers have to be very careful not to interfere with these semantics beyond the level permitted by the language. Certainly the above quoted attitude would be pretty deadly if held by a compiler optimizer writer!
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - yes, it certainly enables an implementation to generate more efficient code which has no required behavior; so in effect basically produce more efficient programs which don't reliably do anything in particular; which doesn't seem particularly useful? You keep saying this over and over, but it does not make it true. Once again, the whole idea of making certain constructs undefined, is to ensure that efficient code can be generated for well defined constructs. - Essentially yes; as FP is an approximate not absolute representation of a value, therefore seems reasonable to accept optimizations which may result in some least significant bits of ambiguity. Rubbish, this shows a real misunderstanding of floating-point. FP values are not approximations, they are well defined values in a system of arithmetic with precisely defined semantics, just as well defined as integer operations. Any compiler that followed your misguided ideas above would be a real menace and completely useless for any serious fpt work. As it is, the actual situation is that most serious fpt programmers find themselves in the same position you are with integer arithmetic. They often don't like the freedom given by the language to e.g. allow extra precision (although they tend to be efficiency hungry, so one doesn't know in practice that this is what they really want, since they want it without paying for it, and they don't know how much they would have to pay :-) Where integer operations are relied upon for state representations, which in general must remain precisely and deterministically calculated, as otherwise catastrophic semantic divergences may result. Right, and please, if you want to write integer operations, you must ensure that you write only defined operations. If you write a+b and it overflows, then you have written a junk C program, and you have no right to expect anything whatsoever from the result. This is just another case of writing a bug in your program, and consequently getting results you don't expect. By the way, I know we are hammering this stuff (and Paul) a bit continuously here, but these kinds of misconceptions are very common among programmers who do not understand as much as they should about language design and compilers. I find I have to spend quite a bit of time in a graduate compiler course to make sure everyone understands what undefined semantics are all about. (i.e. a single lsb divergence in an address calculation is not acceptable, although a similar divergence in an FP value is likely harmless.) Nonsense, losing the last bit in an FP value can be fatal to many algorithms. Indeed, some languages allow what seems to FP programmers to be too much freedom, but not for a moment can a compiler writer contemplate doing an optimization which is not allowed. For instance, in general replacing (a+b)+c by a+(b+c) is an absolute no-no in most languages. - No, exactly the opposite, the definition of an order of evaluation eliminates ambiguities, it does not prohibit anything other than the compiler applying optimizations which would otherwise alter the meaning of the specified expression. No, the optimizations do not alter the meaning of any C expression. If the meaning is undefined, then a) the programmer should not have written this rubbish b) any optimization leaves the semantics undefined, and hence unchanged.
Furthermore, the optimizations are not about undefined expressions at all, they are about generating efficient code for cases where the expression has a well defined value, but the compiler cannot prove an as-if relation true if the notion of undefined expressions is not present.
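A small illustration of the reassociation point (an added sketch; the values assume IEEE double):

    double a = 1.0e20, b = -1.0e20, c = 1.0;
    /* (a + b) + c evaluates to 1.0, since a + b is exactly 0.0.
       a + (b + c) evaluates to 0.0, since c is absorbed when added to b
       (1.0 is far below the spacing between adjacent doubles near 1e20).
       The two groupings differ by far more than a last bit, which is why a
       compiler may not silently reassociate floating-point expressions. */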
Re: Ada front-end depends on signed overflow
On Tue, Jun 07, 2005 at 05:49:54PM -0400, Robert Dewar wrote: * Paul Schlie: (With an exception being FP optimization, as FP is itself based only on the approximate not absolute representation of values.) Floating-point arithmetic is not simply some inaccurate representation of real arithmetic. It can be used this way by the programmer, but in fact fpt operations have very well defined semantics, and compiler writers have to be very careful not to intefere with these semantics beyond the level permitted by the language. Certainly the above quoted attitude would be pretty deadly if held by a compiler optimizer writer! Exactly. I have written fixed-point packages as well as expression manipulation packages that are based on the exact behavior of IEEE floating point. I have had running battles in the past (not recently) with people who think that GCC should warn whenever == is applied to float or double expressions. There are some faults that we just have to live with (like the uncontrolled extra precision on the x86, depending on whether a temporary is spilled to memory or not), but programs can and do depend on the fact that certain values and computations are represented precisely by floating point arithmetic.
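As a small example of the point about exact values (a sketch assuming IEEE double): some values and computations are represented precisely and may safely be compared with ==, while others may not:

    #include <assert.h>

    int main(void)
    {
        assert(0.5 + 0.25 == 0.75);   /* sums of powers of two are exact */
        assert(2.0 + 3.0 == 5.0);     /* modest whole numbers are exact too */
        assert(0.1 + 0.2 != 0.3);     /* none of these decimals is exactly representable */
        return 0;
    }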
Re: Ada front-end depends on signed overflow
There's also a fair amount of code which relies on -1 == (int)0xffffffff. Or is there any truly portable and efficient way to convert a sequence of bytes (in big-endian order) to a signed integer? Of course there is. Assuming no padding bits:

int conv(unsigned char *c)
{
	unsigned int i, u, hibit;

	hibit = ~0U;
	hibit ^= (hibit >> 1);

	u = 0;
	for (i = 0; i < sizeof u; i++)
		u = (u << 8) + c[i];

	if ((u & hibit) == 0U)
		return u;

	u -= hibit;

	if ((unsigned int)-1 == (hibit | 1U))
		return -(int)u;

	return (int)u - (int)(~hibit) - ((unsigned int)-1 & 1U);
}

which generates

_conv:
	li r2,4
	li r0,0
	mtctr r2
L11:
	lbz r2,0(r3)
	slwi r0,r0,8
	addi r3,r3,1
	add r0,r0,r2
	bdnz L11
	mr r3,r0
	blr

with GCC 3.3 on Darwin, and

.conv:
	li 9,4
	li 0,0
	li 11,0
	mtctr 9
	.p2align 4,,15
.L2:
	lbzx 9,3,11
	slwi 0,0,8
	nop
	nop
	addi 11,11,1
	add 0,0,9
	nop
	nop
	rldicl 0,0,0,32
	nop
	nop
	nop
	nop
	bdnz .L2
	extsw 3,0
	nop
	nop
	nop
	nop
	blr

with a GCC-4.1.0 snapshot on powerpc64-unknown-linux-gnu (lots of inefficiencies here, but nothing to do with the conversion itself). Sorry, I couldn't test it on a ones' complement or sign-magnitude machine -- just trust me it works (or embarrass me in public, if a bug sneaked in while converting this to C) ;-) Segher
Re: Ada front-end depends on signed overflow
Once again, have you actually examined how awful the code we generate now is? Yes, I have. Indeed not pretty, but suppose that we managed to cut the overhead in half, would that make -gnato really more attractive? Well of course that's just a plain bug, should be addressed as such. Obviously no one is using -ftrapv, so it will expose lots of bugs I would guess. Yes, that's my impression too. Clearly the status quo is entirely unacceptable, so what's your recommendation of how we generate at least vaguely respectable code for overflow checking? Tough question, but I'm almost certain of something: since we need to recover from overflow checks, we have to expose them to the EH machinery, which nowadays means in the GENERIC tree IL. From that, we have 2 alternatives: synchronous or asynchronous exceptions. The current implementation is synchronous, as we explicitly raise exceptions in the code by calling a routine. I guess you're pushing for asynchronous exceptions since this is probably the only efficient approach, i.e. we rely on the hardware to trap and try to recover from that. If so, currently tree_could_trap_p will return true for all arithmetic operations whose type is TYPE_TRAP_SIGNED, with

#define TYPE_TRAP_SIGNED(NODE) \
  (flag_trapv && ! TYPE_UNSIGNED (NODE))

That seems far too broad. We could instead flag expressions on a case-by-case basis, but I guess that would put too much burden on the tree optimizers. So a compromise could be to use a specific bit for TYPE_TRAP_SIGNED and instruct Gigi to use the TYPE_TRAP_SIGNED-variant of a given type, when overflow checks are requested for an operation. Then we would need to implement the overflow-aware instruction patterns for every architecture. AFAICS only Alpha has (some of) them. And fix the RTL optimizers in the process. -- Eric Botcazou
Re: Ada front-end depends on signed overflow
On 6/6/05, Segher Boessenkool [EMAIL PROTECTED] wrote: There's also a fair amount of code which relies on -1 == (int)0xffffffff. Or is there any truly portable and efficient way to convert a sequence of bytes (in big-endian order) to a signed integer? Of course there is. Assuming no padding bits: [snip complicated stuff] Better use a union for the (final) conversion, i.e.

int conv(unsigned char *c)
{
  unsigned int i;
  union { unsigned int u; int i; } u;

  u.u = 0;
  for (i = 0; i < sizeof u; i++)
    u.u = (u.u << 8) + c[i];

  return u.i;
}

or even (if you can determine native byte-order and size at compile time)

int conv(unsigned char *c)
{
  union { unsigned char c[4]; int i; } x;
  int i;

  for (int i = 0; i < 4; ++i)
    x.c[3-i] = c[i];

  return x.i;
}

which generates only slightly worse code than above. Richard.
Re: Ada front-end depends on signed overflow
Better use a union for the (final) conversion, i.e.

int conv(unsigned char *c)
{
  unsigned int i;
  union { unsigned int u; int i; } u;

  u.u = 0;
  for (i = 0; i < sizeof u; i++)
    u.u = (u.u << 8) + c[i];

  return u.i;
}

This is not portable, though; accessing a union member other than the member last stored into is unspecified behaviour (see J.1 and 6.2.6.1). This is allowed (and defined behaviour) as a GCC extension, though. Segher
Re: Ada front-end depends on signed overflow
From: Andrew Pinski [EMAIL PROTECTED] No, they should be using -ftrapv instead, which traps on overflow, and then make sure they are not trapping when testing. - why? what language or whose code/target ever expects such a behavior? Everyone who writes C/C++ should know that overflow of signed is undefined. Now in Java it is defined, which is the reason why -fwrapv exists in the first place, since GCC has a Java compiler. I think you need to go back in the archives and read the discussions about when -fwrapv was added and see why it is not turned on by default for C. http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00850.html http://gcc.gnu.org/ml/gcc-patches/2003-03/msg02126.html http://gcc.gnu.org/ml/gcc-patches/2003-03/msg01727.html Thanks again; upon fully reviewing the threads I still conclude: - C/C++ defines integer overflow as undefined because it's a target specific behavior, just as dereferencing a NULL is (although a large majority of targets factually do wrap overflow, and don't terminally trap NULL dereferences; so GCC's got it backwards in both cases). - So technically, as such semantics are undefined, attempting to track and identify such ambiguities is helpful; however the compiler should always optimize based on the true semantics of the target, which is what the undefined semantics truly enable (as pretending a target's semantics are different than the optimization assumptions, or forcing post-fact run-time trapping semantics, are both useless and potentially worse, inefficient and/or erroneous otherwise).
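A sketch of the kind of code this affects (assuming a typical 2's complement target): a common after-the-fact overflow test that is reliable only if signed overflow wraps, i.e. under -fwrapv; with overflow left undefined, the compiler is entitled to decide the condition can never be true and drop the test:

    #include <limits.h>

    int add_saturating(int a, int b)
    {
        int s = a + b;            /* may overflow */
        if (b > 0 && s < a)       /* holds only if the overflow wrapped */
            return INT_MAX;
        if (b < 0 && s > a)
            return INT_MIN;
        return s;
    }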
Re: Ada front-end depends on signed overflow
On 6/6/05, Segher Boessenkool [EMAIL PROTECTED] wrote: Better use a union for the (final) conversion, i.e int conv(unsigned char *c) { unsigned int i; union { unsigned int u; int i; } u; u.u = 0; for (i = 0; i sizeof u; i++) u.u = (u.u 8) + c[i]; return u.i; } This is not portable, though; accessing a union member other than the member last stored into is unspecified behaviour (see J.1 and 6.2.6.1). This is allowed (and defined behaviour) as a GCC extension, though. I guess this invokes well-defined behavior on any sane implementation, as otherwise treatment of a pointer to such union would be different to that of a pointer to a member of such union. Aka you would get undefined behavior once you read bytes from a file and access them as different types. Also this technique is cited to circumvent type aliasing issues for f.i. doing bit-twiddling of floats on its integer representation. But I guess following the standard, you are right :( Richard.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: - So technically as such semantics are undefined, attempting to track and identify such ambiguities is helpful; however the compiler should always optimize based on the true semantics of the target, which is what the undefined semantics truly enable (as pretending a target's semantics are different than the optimization assumptions, or forcing post-fact run-time trapping semantics, are both useless and potentially worse, inefficient and/or erroneous otherwise). The first part of this is contentious, but arguable certainly (what is useful behavior). There is certainly no requirement that the semantics should match that of the target, especially since that's ill-defined anyway (for targets that have many different kinds of arithmetic instructions). The second part is wrong, it is clear that there are cases where the quality of code can be improved by really taking advantage of the undefinedness of integer overflow.
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: - So technically as such semantics are undefined, attempting to track and identify such ambiguities is helpful; however the compiler should always optimize based on the true semantics of the target, which is what the undefined semantics truly enable (as pretending a target's semantics are different than the optimization assumptions, or forcing post-fact run-time trapping semantics, are both useless and potentially worse, inefficient and/or erroneous otherwise). The first part of this is contentious, but arguable certainly (what is useful behavior). There is certainly no requirement that the semantics should match that of the target, especially since that's ill-defined anyway (for targets that have many different kinds of arithmetic instructions). - I don't mean to contest the standard which specifies the behavior is undefined (regardless of how useless I perceive that to be), but merely observe that in fact as most targets do implement 2's complement modulo 2^N integer arithmetic, and given that overflow behavior is undefined, it makes no sense to presume otherwise (as such a behavior is both fully compliant and factually typical of most, if not near all, targets). The second part is wrong, it is clear that there are cases where the quality of code can be improved by really taking advantage of the undefinedness of integer overflow. - As above; but to whom is it useful to compute an undefined result more efficiently, especially if the premise of an optimization is not factually consistent with the target's behavior (which will surely result in an incorrectly predicted, therefore likely computationally ambiguous/useless behavior)? Similar arguments have been given in support of an undefined order of evaluation; which is absurd, as the specification of a semantic order of evaluation only constrains the evaluation of expressions which would otherwise be ambiguous, as expressions which are insensitive to their order of evaluation may always be evaluated in any order regardless of a specified semantic order of evaluation and yield the same result; so in effect, defining an order of evaluation only disambiguates expression evaluation, and does not constrain the optimization of otherwise unambiguous expressions.
Re: Ada front-end depends on signed overflow
Paul Schlie wrote: Similar arguments have been given in support of an undefined order of evaluation; which is absurd, as the specification of a semantic order of evaluation only constrains the evaluation of expressions which would otherwise be ambiguous, as expressions which are insensitive to their order of evaluation may always be evaluated in any order regardless of a specified semantic order of evaluation and yield the same result; so in effect, defining an order of evaluation only disambiguates expression evaluation, and does not constrain the optimization of otherwise unambiguous expressions. I think perhaps you have not really looked at the issues that are raised in optimizing compilers. Let me address this misconception first, similar considerations apply to the overflow case. The point is that at compile time you cannot tell what expressions are "ambiguous". I use quotes here since of course there is no ambiguity involved: suppose you write (a * f(b)) * (c * g(d)) where f(b) modifies c and g(d) modifies a. You would call this "ambiguous" but in fact the semantics is undefined, so it is not ambiguous at all, it has quite unambiguous semantics, namely undefined. The notion of non-deterministic behavior is of course quite familiar, and what undefined means here is that the behavior is non-deterministic from among the set of all possible behaviors. Now you seem to suggest that the optimizer should simply avoid optimizing in such cases (where it matters). But the whole point is that of course at compile time in general you can't tell whether f(b) and g(d) have (horrible) side effects of modifying global variables a and c. If we don't allow undefined behavior, the optimizer would have to assume the worst, and refrain from optimizing the above expression just in case some idiot had written side effects. The decision of the standards committee is to make the side effect case undefined PRECISELY so that the optimizer can go ahead and choose the optimal order of evaluation without worrying about the side effect case. Going back to the overflow case, consider: a = (some expression the compiler can tell is positive) b = (some expression the compiler can tell is positive) c = a + b; d = c / 8; Here the compiler can do a right shift for the last statement. This would of course be wrong in the general case (assuming c and d are int), since the right shift does not work right if c is negative. But the compiler is allowed to assume that a+b does not overflow and hence the result is positive, and hence the shift is fine. So you see that guaranteeing twos complement wrap can hurt the quality of the code in a case like this. Now it is legitimate to argue about how much quality is hurt, and whether the resulting non-determinism is worth the efficiency hit. But to conduct this argument, you have to start off with an understanding that there really is a trade-off and it is not clear what the decision should be (C makes one decision, Java another, and Ada a third). The three languages also differ in the handling of a+b. C says that this is undefined if there are side effects that cause what you call ambiguity. Java specifies left to right evaluation (evaluate a before b always). Ada specifies that the result is non-deterministic, you either evaluate a before b or b before a (not some mixture of the two), and there may be (at most) two possible results.
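A concrete C rendering of that example may help (the function and the masking used to make the operands provably non-negative are invented for illustration, not taken from the thread):

    #include <limits.h>

    int scale(int x, int y)
    {
        int a = x & INT_MAX;   /* provably non-negative, but can be as large as INT_MAX */
        int b = y & INT_MAX;   /* likewise */
        int c = a + b;         /* mathematically this can exceed INT_MAX */
        /* If the compiler may assume the addition did not overflow, c is
           known to be non-negative and c / 8 can be emitted as a plain right
           shift.  Under wrapping semantics c could be negative, and then the
           shift is not equivalent: -1 / 8 == 0, while -1 >> 3 is -1 on a
           two's complement machine. */
        return c / 8;
    }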
Re: Ada front-end depends on signed overflow
Eric Botcazou wrote: Once again, have you actually examined how awful the code we generate now is? Yes, I have. Indeed not pretty, but suppose that we managed to cut the overhead in half, would that make -gnato really more attractive? Yes, it would definitely make the difference, given the figures we have seen. From that, we have 2 alternatives: synchronous or asynchronous exceptions. The current implementation is synchronous, as we explicitly raise exceptions in the code by calling a routine. I guess you're pushing for asynchronous exceptions since this is probably the only efficient approach, i.e. we rely on the hardware to trap and try to recover from that. Definitely not, integer overflow generates async traps on very few architectures. Certainly synchronous traps can be generated (e.g. with "into" on the x86).
Re: Ada front-end depends on signed overflow
From: Robert Dewar [EMAIL PROTECTED] Paul Schlie wrote: Similar arguments have been given in support of an undefined order of evaluation; which is absurd, as the specification of a semantic order of evaluation only constrains the evaluation of expressions which would otherwise be ambiguous, as expressions which are insensitive to their order of evaluation may always be evaluated in any order regardless of a specified semantic order of evaluation and yield the same result; so in effect, defining an order of evaluation only disambiguates expression evaluation, and does not constrain the optimization of otherwise unambiguous expressions. I think perhaps you have not really looked at the issues that are raised in optimizing compilers. Let me address this misconception first, similar considerations apply to the overflow case. The point is that at compile time you cannot tell what expressions are "ambiguous". I use quotes here since of course there is no ambiguity involved: suppose you write (a * f(b)) * (c * g(d)) where f(b) modifies c and g(d) modifies a. You would call this "ambiguous" but in fact the semantics is undefined, so it is not ambiguous at all, it has quite unambiguous semantics, namely undefined. The notion of non-deterministic behavior is of course quite familiar, and what undefined means here is that the behavior is non-deterministic from among the set of all possible behaviors. - Agreed, I would classify any expression as being ambiguous if any of its operand values (or side effects) were sensitive to the allowable order of evaluation of its remaining operands, but not otherwise. Now you seem to suggest that the optimizer should simply avoid optimizing in such cases (where it matters). - No, I simply assert that if an expression is unambiguous (assuming my definition above for the sake of discussion), then the compiler may choose to order the evaluation in any way it desires as long as it does not introduce such an ambiguity by doing so. But the whole point is that of course at compile time in general you can't tell whether f(b) and g(d) have (horrible) side effects of modifying global variables a and c. If we don't allow undefined behavior, the optimizer would have to assume the worst, and refrain from optimizing the above expression just in case some idiot had written side effects. The decision of the standards committee is to make the side effect case undefined PRECISELY so that the optimizer can go ahead and choose the optimal order of evaluation without worrying about the side effect case. - I fully agree that if a compiler does not maintain records of the program state which a function may alter or be dependent on, as would be required to determine whether any resulting operand/side-effect interdependences may exist upon its subsequent use as an operand within an expression itself, then the compiler would have no choice but to maintain its relative order of evaluation as hypothetically specified, as it may otherwise introduce an ambiguity. Although I believe I appreciate the relative complexity this introduces to both the compiler, as well as the requirements imposed on pre-compiled libraries, etc., I don't believe that it justifies a language definition legitimizing the specification of otherwise non-deterministic programs. Going back to the overflow case, consider: a = (some expression the compiler can tell is positive) b = (some expression the compiler can tell is positive) c = a + b; d = c / 8; Here the compiler can do a right shift for the last statement.
This would of course be wrong in the general case (assuming c and d are int), since the right shift does not work right if c is negative. But the compiler is allowed to assume that a+b does not overflow and hence the result is positive, and hence the shift is fine. So you see that guaranteeing twos complement wrap can hurt the quality of the code in a case like this. - As you've specified the operations as distinct statements, I would argue that such an optimization would only be legitimate if it were known to produce the same result as if the statements were evaluated in sequence as specified by the standard (which of course would be target specific). Correspondingly I would assert that: d = (a + b) / 8; would be ambiguous if the compiler were able to restructure evaluation of the expression in any way which may alter its effective result for a given target, as a program which has non-deterministic behavior doesn't seem very useful, regardless of whether or not it's allowed by a standard. (Although I concede that some optimizations have more benign worst-case effects than others, and may be reasonable if explicitly enabled (aka unsafe fp math); however as described above, this one seems likely more
Re: Ada front-end depends on signed overflow
Eric Botcazou wrote: -ftrapv is not practically usable because (1) it generates awful code and (2) it badly interacts with the RTL optimizers. please before you say this compare it with the truly awful front end code we generate, which for sure interferes badly with the optimizers. Right, the code generated by the front end is not pretty either, but at least the front-end knows what it is doing Ada-wise. -ftrapv is so dumb at the moment that it emits checks for virtually anything. No, it just emits it for signed operations. If we use -ftrapv, then we have to make sure that operations labeled by the front end as not requiring an overflow check are transformed into unsigned operations by gigi. As for the interaction with the optimizers: int foo(int a, int b) { return a + b; } gcc -S -O -ftrapv on SPARC: foo: jmp %o7+8 add %o0, %o1, %o0 .size foo, .-foo .ident "GCC: (GNU) 4.1.0 20050604 (experimental)" And I have a slightly more contrived example with the same problem at -O0! I don't see that's so terrible, the jmp will be free in practice anyway so I don't think you will find this slows things down. I think we cannot use -ftrapv alone for Ada because it is too low-level. Its general mechanism certainly can help (once it is fixed) but it must be driven by something more Ada-aware. Again, please compare it with the simply awful mechanism used by the front end right now.
Re: Ada front-end depends on signed overflow
Right, the code generated by the front end is not pretty either, but at least the front-end knows what it is doing Ada-wise. -ftrapv is so dumb at the moment that it emits checks for virtually anything. No, it just emits it for signed operations. Of course, it is not so dumb as to blatantly violate its specification. If we use -ftrapv, then we have to make sure that operations labeled by the front end as not requiring an overflow check are transformed into unsigned operations by gigi. I think it would be dangerous to take that path. I don't see that's so terrible, the jmp will be free in practice anyway so I don't think you will find this slows things down. You missed the point; the overflow check has been optimized away by one of the RTL optimization passes. -- Eric Botcazou
Re: Ada front-end depends on signed overflow
Eric Botcazou wrote: Of course, it is not so dumb as to blatantly violate its specification. If we use -ftrapv, then we have to make sure that operations labeled by the front end as not requiring an overflow check are transformed into unsigned operations by gigi. I think it would be dangerous to take that path. It is unhelpful to make such statements without justification. Once again, have you actually examined how awful the code we generate now is? I don't see that's so terrible, the jmp will be free in practice anyway so I don't think you will find this slows things down. You missed the point; the overflow check has been optimized away by one of the RTL optimization passes. Well of course that's just a plain bug, should be addressed as such. Obviously no one is using -ftrapv, so it will expose lots of bugs I would guess. Clearly the status quo is entirely unacceptable, so what's your recommendation of how we generate at least vaguely respectable code for overflow checking?
Re: Ada front-end depends on signed overflow
* Andrew Pinski: The Ada front-end is still being miscompiled by VRP but VRP is doing the correct thing as the type is signed and overflow on signed is undefined (-fwrapv is not turned on by default for Ada). It probably makes sense to turn on -fwrapv for Ada because even without -gnato, the behavior is not really undefined: | The reason that we distinguish overflow checking from other kinds of | range constraint checking is that a failure of an overflow check can | generate an incorrect value, but cannot cause erroneous behavior. http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gnat_ugn_unw/Run_002dTime-Checks.html (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.)
Re: Ada front-end depends on signed overflow
* Paul Schlie: (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.) - Since for all practical purposes most (if not all) targets use 2's complement integer representations which naturally wrap, might it be simply best to presume that all do wrap by default, but allow -fno-wrapv to disable it if ever required by the odd target/language? Enabling -fwrapv disables quite a few optimizations on signed integer types in C code. OTOH, you should compile most real-world C code with -fwrapv anyway. See my security advisory on incorrect overflow checking in C; this is a rather widespread issue, even in new code.
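The kind of code such an advisory concerns looks roughly like the following sketch (the function and parameter names are made up here); the check itself depends on wrapping, so a compiler that assumes signed overflow cannot happen is entitled to delete it:

    int checked_add(int len, int extra)   /* assumes extra >= 0 */
    {
        if (len + extra < len)   /* intended overflow check -- relies on wraparound,
                                    which the C standard does not guarantee */
            return -1;           /* reject the request */
        return len + extra;
    }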
Re: Ada front-end depends on signed overflow
* Paul Schlie: (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.) - Since for all practical purposes most (if not all) targets use 2's complement integer representations which naturally wrap, might it be simply best to presume that all do wrap by default, but allow -fno-wrapv to disable it if ever required by the odd target/language? Enabling -fwrapv disables quite a few optimizations on signed integer types in C code. OTOH, you should compile most real-world C code with -fwrapv anyway. See my security advisory on incorrect overflow checking in C; this is a rather widespread issue, even in new code. No they should be using -ftrapv instead which traps on overflow and then make sure they are not trapping when testing. Thanks, Andrew Pinski
Re: Ada front-end depends on signed overflow
* Paul Schlie: No they should be using -ftrapv instead which traps on overflow and then make sure they are not trapping when testing. - why? what language or whose code/target ever expects such a behavior? I think Andrew wants programmers to fix their code, instead of papering over problems. 8-) All code has seen wide testing essentially with -fwrapv enabled because in previous GCC versions, -fwrapv had only a limited effect, especially across multiple statements. That's why I don't prefer the -ftrapv approach, even though it's technically the correct one. It's a real pity that we have to trust so much C code which has been written and reviewed by developers who aren't aware that signed integer overflow is undefined.
Re: Ada front-end depends on signed overflow
On Fri, Jun 03, 2005 at 11:43:32AM -0400, Andrew Pinski wrote: Everyone who writes C/C++ should know that overflow of signed is undefined. In practice, however, this issue is commonly ignored, because people code in a hurry, then test the behavior of the executable code, and if on all platforms tested there are overflows, but the overflows wrap and as a result the expected thing happens, the problem will not be noticed. I'm sure there are plenty of production codes that assume signed integer overflow wraps, or at least make the weaker assumption that in a = b + c + d; where all variables are integers, if one of the intermediate terms can't be represented in an integer, we still get the correct result if the final result is representable in an integer.
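A made-up numeric instance of that weaker assumption, just to make it concrete:

    #include <limits.h>

    int demo(void)
    {
        int b = INT_MAX, c = 5, d = -10;
        /* Evaluated left to right, b + c is not representable; on a target
           that wraps, adding d brings the value back into range and the
           final result is INT_MAX - 5, the mathematically correct answer.
           As far as the C standard is concerned, behavior is already
           undefined at b + c. */
        return b + c + d;
    }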
Re: Ada front-end depends on signed overflow
Everyone who writes C/C++ should know that overflow of signed is undefined. Now in Java it is defined, which is the reason why -fwrapv exists in the first place since GCC has a Java compiler. Right. I also believe that it is conceptually wrong to enable this for Ada, so until this issue is analyzed by an Ada-knowledgeable person (Richard Kenner is looking at this issue), no such change should be made for Ada, since this would likely be an incorrect fix, masking the real underlying issue. Arno
Re: Ada front-end depends on signed overflow
From: Joe Buck [EMAIL PROTECTED] On Fri, Jun 03, 2005 at 11:43:32AM -0400, Andrew Pinski wrote: Everyone who writes C/C++ should know that overflow of signed is undefined. In practice, however, this issue is commonly ignored, because people code in a hurry, then test the behavior of the executable code, and if on all platforms tested there are overflows, but the overflows wrap and as a result the expected thing happens, the problem will not be noticed. I'm sure there are plenty of production codes that assume signed integer overflow wraps, or at least make the weaker assumption that in a = b + c + d; where all variables are integers, if one of the intermediate terms can't be represented in an integer, we still get the correct result if the final result is representable in an integer. Agreed: - which is why it should be assumed to wrap, especially given that most/all current targets, and likely most/all future targets, do wrap (not trap) signed overflows. (so would expect the compiler to ideally reflect reality when analyzing code for optimization, not assume that undefined means ignore reality)?
Re: Ada front-end depends on signed overflow
On Fri, 2005-06-03 at 13:03 -0400, Paul Schlie wrote: From: Joe Buck [EMAIL PROTECTED] On Fri, Jun 03, 2005 at 11:43:32AM -0400, Andrew Pinski wrote: Everyone who writes C/C++ should know that overflow of signed is undefined. In practice, however, this issue is commonly ignored, because people code in a hurry, then test the behavior of the executable code, and if on all platforms tested there are overflows, but the overflows wrap and as a result the expected thing happens, the problem will not be noticed. I'm sure there are plenty of production codes that assume signed integer overflow wraps, or at least make the weaker assumption that in a = b + c + d; where all variables are integers, if one of the intermediate terms can't be represented in an integer, we still get the correct result if the final result is representable in an integer. Agreed: - which is why it should be assumed to wrap, especially given that most/all current targets, and likely most/all future targets, do wrap (not trap) signed overflows. (so would expect the compiler to ideally reflect reality when analyzing code for optimization, not assume that undefined means ignore reality)? In every case where practical, we build compilers to do what standards say we should do, not what people do to abuse those standards. There are places where the standard is murky, and we make hard decisions based on what people do in the field. This is not one of them.
Re: Ada front-end depends on signed overflow
On Jun 3, 2005, at 09:02, Florian Weimer wrote: It probably makes sense to turn on -fwrapv for Ada because even without -gnato, the behavior is not really undefined: | The reason that we distinguish overflow checking from other kinds of | range constraint checking is that a failure of an overflow check can | generate an incorrect value, but cannot cause erroneous behavior. http://gcc.gnu.org/onlinedocs/gcc-4.0.0/gnat_ugn_unw/Run_002dTime-Checks.html (Without -fwrapv, integer overflow is undefined, and subsequent range checks can be optimized away, so that it might cause erroneous behavior.) This is the strongest argument I have seen so far for defaulting to either -ftrapv or -fwrapv. Both the example Victor Skinner sent, and range checking in Ada, are cases of reasonable code where reasoning from undefined behavior breaks code in unexpected ways. Essentially a class of programs that checks for errors and would not otherwise be able to cause a SIGSEGV or similar with previous versions of GCC, would now be able to cause arbitrary mayhem. Integer expressions such as X + Y - Z will give the correct mathematical result, as long as that result is a representable integer. Without wrap-around semantics however, one would have to prove no intermediate result may overflow. Also, addition and subtraction are no longer associative. So, from a quality-of-implementation point of view, I believe we should always default to -fwrapv. For Ada, I propose we make the following changes: - by default, enable overflow checks using -ftrapv (yay, we should be able to get rid of -gnato finally!) - with checks suppressed, use -fwrapv.
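As an illustration of the associativity point (the values are invented for the example):

    #include <limits.h>

    /* X = INT_MAX, Y = 1, Z = 1.  Grouped as (X + Y) - Z the intermediate
       sum is not representable; grouped as X + (Y - Z) every intermediate
       value fits.  With -fwrapv both groupings yield INT_MAX; without it
       only the second grouping has defined behavior, so + and - are no
       longer associative in practice. */
    int xyz(void)
    {
        int x = INT_MAX, y = 1, z = 1;
        return x + (y - z);   /* the grouping that stays in range */
    }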
Re: Ada front-end depends on signed overflow
Geert Bosch wrote: For Ada, I propose we make the following changes: - by default, enable overflow checks using -ftrapv This won't work, we generate plenty of cases of operations where we definitely don't want a check, and we don't set the flag, but currently the flag is ignored by gcc/gigi, we would have to adjust gigi to switch to unsigned for cases where the overflow check flag was not set (yay, we should be able to get rid of -gnato finally!) - with checks suppressed, use -fwrapv. We really need to do this on an operation by operation basis, since in Ada checks can be suppressed on an operation by operation basis!
Re: Ada front-end depends on signed overflow
For Ada, I propose we make the following changes: - by default, enable overflow checks using -ftrapv -ftrapv is not practically usable because (1) it generates awful code and (2) it badly interacts with the RTL optimizers. -- Eric Botcazou