[c-prog] Re: integer promotions

peternilsson42 Sun, 23 Nov 2008 22:07:51 -0800

"Pedro Izecksohn" <[EMAIL PROTECTED]> wrote:
>
> For peternilsson42:
> 
> I was not clear and you did not read message 68798.
> 
> http://tech.groups.yahoo.com/group/c-prog/message/68798


That post contained detail missing from your original post.
However, it did not contain detail which explained why you
were confused.

> To clarify:
> 
> That code I compiled on two independent compilers. On
> both compilers:
> 
> USHRT_MAX is 0xffff
> UINT_MAX is 0xffffffff
> ULLONG_MAX is 0xffffffffffffffff
> 
> > For integers with a 'rank' less than int, if the
> > range of the integer type fits into the range of
> > int, then that type will promote to int, if it
> > is promoted. If the range won't fit into an int,
> > it will be promoted to an unsigned int.
> 
> It shows that you read the standard.

It also explains the results you received.

> > Consider the test: (-1 < (unsigned short) 1)
> > 
> > Whilst intuitively it should always be true, there
> > are many implementations where it is false!
> 
> The unsigned short value 1 fits into an int and -1
> is an int, so, according to the standard this test
> must be true.

On your machine perhaps, but there are any number of
machines where USHRT_MAX > INT_MAX. [DSPs seem to be
more common than desktop PCs, but when I started C,
most high-end systems did not bother with 16-bit types.
Everything was either 8-bit, or at least 32-bit.]
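
As a rough sketch of that test (the function names here are
made up for illustration), the following compares what the
implementation actually does with what the promotion rule
predicts from <limits.h>:

```c
#include <limits.h>

/* Returns 1 if (-1 < (unsigned short) 1) holds on this implementation. */
int minus_one_is_less(void)
{
    /* If USHRT_MAX <= INT_MAX, the right operand promotes to int and
       the comparison is -1 < 1, which is true. Otherwise it promotes
       to unsigned int, -1 is converted to UINT_MAX, and the comparison
       is UINT_MAX < 1, which is false. */
    return -1 < (unsigned short) 1;
}

/* What the promotion rule predicts for this implementation. */
int predicted_result(void)
{
    return USHRT_MAX <= INT_MAX;
}
```

The two functions should agree on any conforming
implementation; which value they agree on depends on the
machine.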

> > > > a = -1;
> 
> I changed this line in message 68798 to:
> 
> a = USHRT_MAX;

So you made absolutely _no_ change to the semantics of
that assignment!

> > Here's a table of what a will promote to, and the
> > type of a * a, dependent on the maximum values of
> > unsigned short, int and unsigned int:
> 
> I'm sorry I was not sufficiently clear.

Your post didn't need to be. I was trying to explain
the language, using three examples as illustration.
You seem to be only interested in one class of machine.
No matter... ;-)

> > Unsigned integers do not get sign-extended when
> > promoted to wider unsigned integer types. Why?
> > Because there is no sign to extend!
> 
> This is exactly how I thought before I saw two
> independent compilers producing the binaries that
> result that result.

I don't understand that sentence. No conforming
implementation can exhibit sign extension for an
unsigned integer type since no unsigned integer
type has a sign bit.
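
A small sketch of the difference (hypothetical function
names): widening an unsigned value zero-fills, whereas
converting a negative signed value to a wider unsigned type
reduces it modulo ULLONG_MAX plus 1, which merely looks like
sign extension when printed in hex.

```c
#include <limits.h>

/* Unsigned to wider unsigned: the value is preserved, so the
   new high-order bits are all zero. */
unsigned long long widen_unsigned(unsigned short a)
{
    return a;
}

/* Negative signed to unsigned: the value is reduced modulo
   ULLONG_MAX + 1, so -1 becomes ULLONG_MAX. */
unsigned long long widen_signed(short s)
{
    return s;
}
```

Here widen_unsigned(USHRT_MAX) stays USHRT_MAX, while
widen_signed(-1) yields ULLONG_MAX.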

> > First, you believe promotion works from the
> > outside in. In other words, the type of the
> > variable you're assigning to will influence
> > the types and promotion of the expression on
> > the right hand side. They won't. It works from
> > the inside out.
> 
> No. I used unsigned short int and unsigned long
> long int because the original calculation needs
> them, because the result of the original
> calculation may not fit inside an unsigned int.

That's fine, but it doesn't explain why you found
the results confusing. Misunderstanding promotion
explains it perfectly. :-)
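
One way to see the "inside out" rule is to ask the compiler
for the type of a * a directly. This is only a sketch: it
assumes a C11 compiler (for _Generic), and the enum, macro
and function names are invented for the example.

```c
#include <limits.h>

enum expr_type { TYPE_INT, TYPE_UNSIGNED_INT, TYPE_OTHER };

/* Classify the type of an expression without evaluating it (C11). */
#define TYPE_OF(expr) _Generic((expr), \
    int: TYPE_INT,                     \
    unsigned int: TYPE_UNSIGNED_INT,   \
    default: TYPE_OTHER)

/* The type of a * a, even though the result goes into a wider object. */
enum expr_type type_of_product(void)
{
    unsigned short a = 3;
    unsigned long long b = a * a;  /* the type of b does not affect a * a */
    (void) b;
    return TYPE_OF(a * a);
}

/* What the promotion rule predicts for this implementation. */
enum expr_type predicted_type(void)
{
    return USHRT_MAX <= INT_MAX ? TYPE_INT : TYPE_UNSIGNED_INT;
}
```

On a machine with the limits quoted in this thread,
type_of_product() reports int, not unsigned long long.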

> The original calculation is more complex than a
> multiplication of two integers.

Well, if you're ready, you might like to post the
more complicated example.

> > Secondly, you've probably been confused by
> > 'hex' output
> 
> No. I used hexadecimal representation because
> these extreme values are best represented in
> hexadecimal.

I didn't mean you couldn't read hex. I meant that when
many people see hex, they have a conditioned response
to believe that the hex value _is_ the representation
as well as the value.

> I hope that you answer my question, that is explicit
> in my other message.

My previous message explains precisely why you received
the output you received. But instead of explaining it
in terms of one specific machine, I tried to explain
the language itself, showing three possible implementations,
one of which appears to match the machine you're using.

> My personal opinion is that the two independent compilers
> I used are buggy,

I've actually demonstrated why they're not.

> but I want another opinion before reporting this bug
> to two independent developers.

You would look very foolish if you reported this as a
bug.

But, for the sake of trying to make my post a little
clearer, let's suppose we have an implementation that
you describe above...

  USHRT_MAX is 0xffff
  UINT_MAX is 0xffffffff
  ULLONG_MAX is 0xffffffffffffffff

This isn't enough to fully explain what's going on so
let me add the reasonable assumption:

  INT_MAX is 0x7FFFFFFF

Now, let's start...

  unsigned short int a;
  unsigned long long int b, c;

All good. [BTW, many people tend not to bother with 'int'
when using unsigned types.]

  a = USHRT_MAX;

This will assign a the value 0xFFFF. So will a = -1 on
your machine as (USHRT_MAX plus 1) plus (minus 1) is
USHRT_MAX.
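
A minimal sketch of that equivalence (the function names are
hypothetical). Conversion to an unsigned type is defined by
the language as reduction modulo the maximum plus 1, so the
two assignments should store the same value on any
conforming implementation:

```c
#include <limits.h>

unsigned short assign_minus_one(void)
{
    unsigned short a = -1;         /* -1 + (USHRT_MAX + 1) == USHRT_MAX */
    return a;
}

unsigned short assign_ushrt_max(void)
{
    unsigned short a = USHRT_MAX;  /* stored unchanged */
    return a;
}
```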

  b = (a*a);

Forget about "b =" and the parentheses. The first detail
is a * a. Now since a has lower rank than int, it is
subject to integral promotion in that sub-expression.
Since USHRT_MAX < INT_MAX, it promotes to int. Thus,
the statement (on this particular machine) is the same
as:

  b = (65535 * 65535);

Now 65535 * 65535 has the _mathematical_ value of
4294836225. That value is too large to be represented
as an int on your machine. The behaviour of your program
is undefined, so if this is what your real code actually
does, it should be rewritten. But let's continue.

On many machines, overflow of a two's complement value
will simply yield the result as if the truncated binary
representation of the result were placed directly into
an int object. So it's likely that the notional value
4294836225 will actually be -131071.[*]
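
The arithmetic behind -131071 can be checked without
invoking undefined behaviour. The sketch below assumes the
32-bit two's complement int discussed here and uses made-up
function names; it computes the exact product in unsigned
long long and then works out which value the low 32 bits
would represent.

```c
/* Interpret the low 32 bits of v as a 32-bit two's complement value. */
long long as_wrapped_int32(unsigned long long v)
{
    unsigned long long low = v & 0xFFFFFFFFULL;

    if (low >= 0x80000000ULL)              /* sign bit would be set */
        return (long long) low - 0x100000000LL;
    return (long long) low;
}

/* 65535 * 65535 as it would typically appear in a wrapped 32-bit int. */
long long wrapped_square_of_65535(void)
{
    unsigned long long exact = 65535ULL * 65535ULL;  /* 4294836225 */
    return as_wrapped_int32(exact);
}
```

That is, 4294836225 minus 4294967296 gives -131071.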

Thus your statement becomes...

  b = (-131071);

Or more simply...

  b = -131071;

Now, when you assign a value to an unsigned integer, it
will be reduced modulo one more than the maximum value
of that type. Hence, -131071 will be reduced modulo (in
hex) 0x10000000000000000 (i.e. ULLONG_MAX plus 1). The
result is 18446744073709420545 (or 0xFFFFFFFFFFFE0001).
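
A sketch of that conversion, with hypothetical function
names. The hex value quoted above assumes a 64-bit unsigned
long long, as on the machine under discussion; the by-hand
version avoids that assumption.

```c
#include <limits.h>

/* Let the language do the modulo reduction. */
unsigned long long to_unsigned_long_long(long long value)
{
    return (unsigned long long) value;
}

/* The same result written out by hand for a negative value -n,
   where 1 <= n <= ULLONG_MAX. */
unsigned long long wrap_by_hand(unsigned long long n)
{
    return ULLONG_MAX - n + 1;  /* (ULLONG_MAX + 1) - n, without overflow */
}
```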

That explains what you see when you print b.

So we move on to...

  c = ((unsigned int)a*(unsigned int)a);

What you have is...

  c = (A * A); /* where A is ((unsigned) a) */

You've directly converted the value of a to an unsigned
int. Since this preserves value, A also has the value
0xFFFF, only _now_ it is an unsigned int.

When you multiply A by itself, again you get the
mathematical result 4294836225. This result _is_ in the
range of unsigned int on your system, so the result
stays as 0xFFFE0001.

Now this value is assigned to c. But unsigned long long
is (indeed must be) wide enough to hold any value of an
unsigned int. So it's as if you had written...

  c = 0xFFFE0001;

It should now come as no surprise that this is the value
you see when you print c.
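
If the aim is simply to get the full product without
overflow, one possible rewrite (a sketch only, with invented
function names) is to convert an operand to the wide type
before multiplying, so the multiplication itself is done in
unsigned long long. The second function is the cast to
unsigned int used above, which is well defined but wraps
modulo UINT_MAX plus 1.

```c
/* Multiply in unsigned long long: the cast applies to the first
   operand, and the second is then converted to match it. */
unsigned long long square_wide(unsigned short a)
{
    return (unsigned long long) a * a;
}

/* Multiply in unsigned int: no undefined behaviour, but the result
   wraps if it exceeds UINT_MAX. */
unsigned long long square_as_unsigned(unsigned short a)
{
    return (unsigned int) a * (unsigned int) a;
}
```

With a equal to 65535, square_wide() returns 4294836225
whatever the width of int.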

There is no reason to think that b's value should be
truncated to 0xFFFE0001, nor that c's value should be
extended to 0xFFFFFFFFFFFE0001.

So which do you think, and why do you think that? Or,
do you think different values should be displayed?

--
* It's as if you had written...
  unsigned tmp = 4294836225;
  (* (int *) &tmp) 

-- 
Peter
