Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread Peter Bigot
Changes to mspgcc are/were driven by tickets filed on the SF bug tracker.
If you'd like this change made after reading the material below, please
file a ticket there.  mspgcc evolution/maintenance is not funded at this
time and the issue is below the threshold that I consider critical enough
to donate time, so the ticket will remain open until somebody takes over
such maintenance.

For the record, I've reconstructed my reasoning from two years ago, which
makes explicit the point Przemek made.  It's not worth the research to
prove it, but again this was justified by similar practices in other gcc
back-ends, even if not x86.

First, the value of a shift expression should be the same whether it is
computed at compile time or at runtime.  I.e., 1  16 should produce the
same value as 1  v when v has the value 16.

Second, the base MSP430 ISA does not have a multi-position shift operation
like x86 does.  It can shift only one bit position at a time.  Variable
shifts must be translated into loops with the iteration count provided at
runtime.  (Second-prime: use of MSP430X which has a limited version of such
instructions should also not result in a change of the value of the shift
expression.)

Third, it is unreasonable when v has the value 63532 to stall the processor
for 65532 iterations of a loop calculating 1  v.  Instead the runtime
code should limit the iteration count.  Since the maximum number of
iterations before the expression value becomes a constant (i.e., the number
of bits in the expression value) will always be a known power of two,
masking the iteration count to preserve only the low bits from 0 to that
number is the simplest solution.

This is the behavior mspgcc currently uses.  To accommodate your request a
test would have to be added at runtime to see whether the iteration count
is equal to or greater than the number of bits in the expression value, and
if so to substitute the constant (0 or ~0) that would have resulted from
the shift operation.  This introduces code bloat, as in the vast majority
of cases people will not be using a shift count that is large enough to
trigger the conditional.

In my opinion increasing generated code size just to accommodate a practice
that is explicitly undefined behavior is a poor use of resources and
improperly encourages a misunderstanding of how C treats this situation.
Having considered the arguments, no, the behavior will not be changed.

Please note that the new back end under development by Red Hat may have
different behavior.

Peter


On Sun, Apr 21, 2013 at 6:10 PM, Paul Sokolovsky pmis...@gmail.com wrote:

 Hello,

 On Sun, 21 Apr 2013 17:06:51 -0500
 Peter Bigot big...@acm.org wrote:

  This decision was intentional, as documented in
  https://sourceforge.net/p/mspgcc/bugs/118/.  My recollection is that
  the choice of how to make things consistent was informed by similar
  behavior in the contemporaneous gcc for x86 or at least one other
  target architecture.

 Thanks for the reference. So, I tested it with x86 gcc 4.4, 4.5, 4.6,
 4.7 (packages as shipped by Ubuntu), all of them produce mathematically
 expected result. msp430-gcc 4.5.3, 4.7.0 both produce unexpected result.

  As you say, the behavior is undefined.  Anybody who expects any
  particular behavior for this situation is confused about how C works
  and should take the issue up with JTC1/SC22/WG14.

 That's why I wrote so many words in the original mail. Yes, in small
 world of JTC1/SC22/WG14 C standard the behavior is undefined. But in
 much bigger world of: mathematics, well-known gcc targets, user
 expectations, principle of least surprise, etc. - it's all pretty well
 defined. So, when something is undefined in small area, it's still
 a good idea to have affinity towards how it's done/expected in wider
 areas. So, the request is just that - please kindly consider changing
 that behavior ;-).

 
  Peter
 

 --
 Best regards,
  Paul  mailto:pmis...@gmail.com

--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis  visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter___
Mspgcc-users mailing list
Mspgcc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mspgcc-users


Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread Grant Edwards
On 2013-04-21, Paul Sokolovsky pmis...@gmail.com wrote:

 It's all nice and good. But there's difference between undefined,
 any value and weird.

No, there isn't.  Undefined means _exactly_ that: you might get any
value -- no matter how wierd you think it.

 Because too many bits shifts may be undefined in C standard, but
 shifts by arbitrary number of bits are very well defined in
 arithmetic - and by very definition of (unsigned) shift, any value
 shifted by more bits than available in its representation is 0.

That's not what the C standard says.

 That's logical, that's what users know, that's what they expect from
 compiler,

They're wrong to expect that.  If the standard says the result is
undefined, then expecting anything in particular is wrong.

 So, msp430-gcc just masks out higher bits of shift count, and in this
 case leaves original value intact. Which turns term ((1  BITS) -
 1), which is common to do BITS-modular arithmetic, and would be
 expected to just optimize out in case of a full type, into an
 expression killer with infinite loops, etc. ensuing.

If you write incorrect code, you oughtn't be surprised when it behaves
incorrectly.

-- 
Grant Edwards   grant.b.edwardsYow! Will this never-ending
  at   series of PLEASURABLE
  gmail.comEVENTS never cease?


--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis  visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
___
Mspgcc-users mailing list
Mspgcc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mspgcc-users


Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread Grant Edwards
On 2013-04-21, Paul Sokolovsky pmis...@gmail.com wrote:
 On Sun, 21 Apr 2013 17:06:51 -0500 Peter Bigot big...@acm.org wrote:

 This decision was intentional, as documented in
 https://sourceforge.net/p/mspgcc/bugs/118/.  My recollection is that
 the choice of how to make things consistent was informed by similar
 behavior in the contemporaneous gcc for x86 or at least one other
 target architecture.

 Thanks for the reference. So, I tested it with x86 gcc 4.4, 4.5, 4.6,
 4.7 (packages as shipped by Ubuntu), all of them produce
 mathematically expected result.

That's irrelevent.

 msp430-gcc 4.5.3, 4.7.0 both produce unexpected result.

That's per the standard.  If you write code that depends on some
particular behavior for somethign that's undefined, then that code
is wrong.

-- 
Grant Edwards   grant.b.edwardsYow! I'm wearing PAMPERS!!
  at   
  gmail.com


--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis  visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
___
Mspgcc-users mailing list
Mspgcc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mspgcc-users


Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread Paul Sokolovsky
Hello,

On Sun, 21 Apr 2013 20:15:45 -0400
Przemek Klosowski przemek.klosow...@gmail.com wrote:

 As you say, the reason for the undefined behaviour in  the standard is
 because different ISA (instruction set architectures) behave
 differently. The underlying assumption here is that C is a 'bare
 metal' language, and implements high level operations that map well
 to the machine language facilities.
 
 As a result, you should not expect any commonality in behaviour
 between different ISA if there isn't commonality between their
 respective natural binary operations. If the overall fastest
 generated code for a standard-defined case is to mask out the shift
 count, it would be against the spirit of C to do anything else.

I'm sorry if my original mail didn't make it too clear, but my whole
report was regarding evaluation rules used during compile-time constant
subexpression elimination. It's obvious that runtime behavior should be
the most optimal, leaving corner cases to programmer. But what would be
the most obvious behavior for compile-time? Feel free to argue that it's
not the mathematical definition, but some obscure platform-dependent
factors.
 
 This reminds me of the section of GCC manual that said something like
 'because the behaviour in this case is undefined and left to the
 implementer, our choice is to launch a game of life'.

Yep, then people started to rely on it, then someone compiled stuff for
MSP430, saw no game of life launched, and reported the compiler as
broken. Solution?
http://en.wikipedia.org/wiki/Principle_of_least_astonishment

[]

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com

--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis  visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
___
Mspgcc-users mailing list
Mspgcc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mspgcc-users


Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread David Brown
Hi,

When I compile the code you gave below (using gcc 4.6.3 from 20120406),
I get:

warning: left shift count = width of type [enabled by default]

That is even without any sort of warning flags - and you should normally
enable lots of warnings.

So here you have undefined code, and the compiler tells you of the
problem.  You really cannot get better than that by trying to invent
your own ideas about how you want to define this undefined behaviour.

David


On 21/04/13 21:35, Paul Sokolovsky wrote:
 Hello,
 
 It's a known fact that shifting by more or equal bits as an integer
 type contains is undefined behavior per C standard,
 http://stackoverflow.com/questions/7214263/unexpected-behavior-of-bitwise-shifting-using-gcc
 has relevant quotes and references.
 
 It's less known the rules of dealing with undefined values for
 compiler writers, in this respect I find LLVM's description insightful:
 http://llvm.org/docs/LangRef.html#undefined-values . Summing up, it
 says that result of most of the expressions involving undefined value
 is also undefined.
 
 Finally, it's known why C standard has it like that: because rules
 various CPUs use for such shifts at *runtime* vary, so *runtime* code
 should not rely on them. C also doesn't define separation between
 compile-time and run-time evaluation strictly, so for it any shift by
 too many bits is undefined.
 
 It's all nice and good. But there's difference between undefined,
 any value and weird. Because too many bits shifts may be
 undefined in C standard, but shifts by arbitrary number of bits are
 very well defined in arithmetic - and by very definition of (unsigned)
 shift, any value shifted by more bits than available in its
 representation is 0. That's logical, that's what users know, that's
 what they expect from compiler, and well, that's how x86-gcc behaves.
 
 Now mspgcc:
 
 int v = 5;
 int main()
 {
 printf(%d\n, v + (1  16));
 }
 
 main:
 mov r1, r4
 add #2, r4
 mov v, r15
 add #1, r15
 pushr15
 push#.LC0
 call#printf
 add #4, r1
 
 
 So, msp430-gcc just masks out higher bits of shift count, and in this
 case leaves original value intact. Which turns term ((1  BITS) - 1),
 which is common to do BITS-modular arithmetic, and would be expected to
 just optimize out in case of a full type, into an expression killer with
 infinite loops, etc. ensuing.
 
 


--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis  visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
___
Mspgcc-users mailing list
Mspgcc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mspgcc-users


Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread Paul Sokolovsky
Hello,

On Mon, 22 Apr 2013 08:12:50 -0500
Peter Bigot big...@acm.org wrote:

 Changes to mspgcc are/were driven by tickets filed on the SF bug
 tracker. If you'd like this change made after reading the material
 below, please file a ticket there.  mspgcc evolution/maintenance is
 not funded at this time and the issue is below the threshold that I
 consider critical enough to donate time, so the ticket will remain
 open until somebody takes over such maintenance.

Submitted: https://sourceforge.net/p/mspgcc/bugs/351/ . I actually
didn't want to spend too much time on this, or call for immediate
actions, just point out that there're better choices for
implementation-defined behavior as implemented mspgcc. I personally
don't expect any immediate changes, the whole idea was to save novices
or folks who may port big software from unneeded headache.

 For the record, I've reconstructed my reasoning from two years ago,
 which makes explicit the point Przemek made.  It's not worth the
 research to prove it, but again this was justified by similar
 practices in other gcc back-ends, even if not x86.
 
 First, the value of a shift expression should be the same whether it
 is computed at compile time or at runtime.  

Well, it's the same thing - standard doesn't require this ;-). I
understand benefit of consistency, but given the choices (both
sanctioned by standard, as it simply underfines the case completely):

1. Make runtime  compile-time be consistent.
2. Make compile-time be consistent with mathematical definition, and
leave runtime to be optimal for particular h/w.

- It hopefully easier to make a choice. If not, then good exercise may
be to stop thinking about particular embedded CPU and all quirks which
can be done in its realm, but think about a compiler in general, and
that metaprogramming is important thing for producing optimal code
(especially for embedded targets!), and metapgramming requires robust
compile-time evaluation.

 I.e., 1  16 should
 produce the same value as 1  v when v has the value 16.
 
 Second, the base MSP430 ISA does not have a multi-position shift
 operation like x86 does.  It can shift only one bit position at a
 time.

Exactly, so masking out bits in shift counter wasn't even
hardware-bound (I don't know much about MSP430X), but more or less
arbitrary choice.

 Variable shifts must be translated into loops with the
 iteration count provided at runtime.  (Second-prime: use of MSP430X
 which has a limited version of such instructions should also not
 result in a change of the value of the shift expression.)

Oops, here you appear to want even more than I: I want just
math-guided rules for builtin GCC calculator, you want to generate
same-behaving code for different ISAs, at the potential expense of
runtime-efficiency. Hmm...

 Third, it is unreasonable when v has the value 63532 to stall the
 processor for 65532 iterations of a loop calculating 1  v.  Instead
 the runtime code should limit the iteration count.  Since the maximum
 number of iterations before the expression value becomes a constant
 (i.e., the number of bits in the expression value) will always be a
 known power of two, masking the iteration count to preserve only the
 low bits from 0 to that number is the simplest solution.

Whoa, so you generate non-optimal code for the slight possibility that
someone shifts stuff 63532 times? That clearly goes against what
Przemek wrote (and with what I wholly agree, as my suggestions are
solely for compile-time behavior). If a programmer wants to shoot
oneself in the feet, don't preclude one from doing that, just give a
good gun which will shoot in the feet, not suddenly in the opposite
direction ;-).

 This is the behavior mspgcc currently uses.  To accommodate your
 request a test would have to be added at runtime to see whether the
 iteration count is equal to or greater than the number of bits in the
 expression value, and if so to substitute the constant (0 or ~0) that
 would have resulted from the shift operation.  This introduces code
 bloat, as in the vast majority of cases people will not be using a
 shift count that is large enough to trigger the conditional.
 
 In my opinion increasing generated code size just to accommodate a
 practice that is explicitly undefined behavior is a poor use of
 resources and improperly encourages a misunderstanding of how C
 treats this situation. Having considered the arguments, no, the
 behavior will not be changed.

So once again, I was talking about changing *only* compile-time time
behavior, runtime doesn't need to change (except that, as a separate
issue, I'd vote for removing any extra masking during runtime shifts -
programmer either took care of that by masking count oneself, knows
that it's invariantly in range, or well, wants to shift 60K times).

 Please note that the new back end under development by Red Hat may
 have different behavior.
 
 Peter
 
 
 On Sun, Apr 21, 2013 at 6:10 PM, Paul Sokolovsky pmis...@gmail.com
 wrote:
 

Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread Paul Sokolovsky
Hello,

On Mon, 22 Apr 2013 17:39:27 +0200
David Brown da...@westcontrol.com wrote:

 Hi,
 
 When I compile the code you gave below (using gcc 4.6.3 from
 20120406), I get:
 
 warning: left shift count = width of type [enabled by default]

 That is even without any sort of warning flags - and you should
 normally enable lots of warnings.
 
 So here you have undefined code, and the compiler tells you of the
 problem.  You really cannot get better than that by trying to invent
 your own ideas about how you want to define this undefined behaviour.

Yeah, but can get better by leveraging generally accepted ideas (math
def of shift) and ideas put by more than just you into similar cases
(other well known gcc ports, like x86) ;-).

Thanks for all the flame, guys! ;-).


-- 
Best regards,
 Paul  mailto:pmis...@gmail.com

--
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis  visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
___
Mspgcc-users mailing list
Mspgcc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mspgcc-users


Re: [Mspgcc-users] mspgcc undefines too-many-bits shifts in weird way

2013-04-22 Thread David Brown
On 22/04/13 19:08, Paul Sokolovsky wrote:
 Hello,

 On Mon, 22 Apr 2013 08:12:50 -0500
 Peter Bigot big...@acm.org wrote:

 Changes to mspgcc are/were driven by tickets filed on the SF bug
 tracker. If you'd like this change made after reading the material
 below, please file a ticket there.  mspgcc evolution/maintenance is
 not funded at this time and the issue is below the threshold that I
 consider critical enough to donate time, so the ticket will remain
 open until somebody takes over such maintenance.

 Submitted: https://sourceforge.net/p/mspgcc/bugs/351/ .I actually
 didn't want to spend too much time on this, or call for immediate
 actions, just point out that there're better choices for
 implementation-defined behavior as implemented mspgcc. I personally
 don't expect any immediate changes, the whole idea was to save novices
 or folks who may port big software from unneeded headache.

There is only one good way to save novices or porters from accidents 
here - that is to give a warning that (1  16) is undefined on a 16-bit 
architecture.  And that is what gcc does here.

Other than that, /any/ reasonable choice will do - as there is no 
defined correct choice.  The principle of least surprise is of 
course important - so the warning here is excellent to let the user 
avoid a surprise.


 For the record, I've reconstructed my reasoning from two years ago,
 which makes explicit the point Przemek made.  It's not worth the
 research to prove it, but again this was justified by similar
 practices in other gcc back-ends, even if not x86.

 First, the value of a shift expression should be the same whether it
 is computed at compile time or at runtime.

 Well, it's the same thing - standard doesn't require this ;-). I
 understand benefit of consistency, but given the choices (both
 sanctioned by standard, as it simply underfines the case completely):

 1. Make runtime  compile-time be consistent.
 2. Make compile-time be consistent with mathematical definition, and
 leave runtime to be optimal for particular h/w.

Numbers in a programming language are not the same as simple 
mathematical integers.  There are too many differences for there to be 
much point in trying to be slightly more consistent in one area.

Note that this example shows one of the differences between real 
integers and C integers - in real mathematics, left-shifting an unsigned 
integer is the same as multiplying it by 2.  In C, that is not the case. 
  With 16-bit ints, (1  16) can be anything, but happens to evaluate 
to 1 in this case.  But (1 * 65536) will evaluate to 65536 (as 65536 
automatically parses as a long), which will be truncated to 0 on 
conversion to 16-bit.


On the other hand, it is /very/ important that compile-time calculations 
match run-time calculations - it means that calculations can be written 
in run-time form but calculated at compile time with the same results. 
Thus gcc does compile-time calculations with the same integer models as 
the run-time model.


 - It hopefully easier to make a choice. If not, then good exercise may
 be to stop thinking about particular embedded CPU and all quirks which
 can be done in its realm, but think about a compiler in general, and
 that metaprogramming is important thing for producing optimal code
 (especially for embedded targets!), and metapgramming requires robust
 compile-time evaluation.

Metaprogramming requires that compile-time evaluation is consistent with 
run-time evaluation.  And run-time evaluation requires that code be 
correct according to the laws of C, and as small and fast as practically 
possible within that.


You seem to be under the belief that the behaviour of msp430 gcc is 
different from other compilers, such as gcc for the x86, and that this 
is a quirk of the msp430 architecture.  It is not.  Different 
processors and different compilers can have different rules, but the 
mask out the extra bits method is the more common.  If you left-shift 
an int on x86 gcc by 32, you will get the same effect.  This also 
applies on the PPC.  But with 16-bit ints on an 8086, (1  16) 
evaluates to 0.  In each case, the choice for the undefined behaviour is 
the one that best matches the underlying cpu.

mvh.,

David




 I.e., 1  16 should
 produce the same value as 1  v when v has the value 16.

 Second, the base MSP430 ISA does not have a multi-position shift
 operation like x86 does.  It can shift only one bit position at a
 time.

 Exactly, so masking out bits in shift counter wasn't even
 hardware-bound (I don't know much about MSP430X), but more or less
 arbitrary choice.

 Variable shifts must be translated into loops with the
 iteration count provided at runtime.  (Second-prime: use of MSP430X
 which has a limited version of such instructions should also not
 result in a change of the value of the shift expression.)

 Oops, here you appear to want even more than I: I want just
 math-guided rules for builtin GCC calculator, you want to generate