On 06/08/2016 12:36 AM, Alexander Cherepanov wrote:
Hi!

If a variable of type _Bool contains something different from 0 and 1,
its use amounts to UB in gcc and clang. There are a couple of examples in
[1] ([2] is also interesting).

[1] https://github.com/TrustInSoft/tis-interpreter/issues/39
[2] https://github.com/TrustInSoft/tis-interpreter/issues/100

But my question is about the following example:

----------------------------------------------------------------------
#include <stdio.h>

int main()
{
   _Bool b;
   *(char *)&b = 123;
   printf("%d\n", *(char *)&b);
}
----------------------------------------------------------------------

Results:

----------------------------------------------------------------------
$ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out
123

$ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out
1
----------------------------------------------------------------------

gcc version: gcc (GCC) 7.0.0 20160604 (experimental)

It seems that padding in _Bool is treated as permanently unspecified. Is
this behavior intentional? What's the theory behind it?

One possible explanation is C11, 6.2.6.2p1, which reads: "The values of
any padding bits are unspecified." But it's somewhat of a stretch to
conclude from it that the values of padding bits cannot be specified
even with explicit assignment.

Another possible approach is to refer to the Committee Response to Question
1 in DR 260, which reads: "Values may have any bit-pattern that validly
represents them and the implementation is free to move between alternate
representations (for example, it may normalize pointers, floating-point
representations etc.). [...] the actual bit-pattern may change without
direct action of the program."

There has been quite a bit of discussion among the committee on
this subject lately (the last part is the subject of DR #451,
though it's discussed in the context of uninitialized objects
with indeterminate values).  I would hesitate to call it
consensus but I think it would be fair to say that the opinion
of the vocal majority is that implementations aren't intended
to spontaneously change valid (i.e., determinate) representations
of objects in the absence of an access to the value of the object.
There are also two special cases that apply to the code example
above: accesses via an lvalue of a character type (which has no
padding bits and so no trap representation), and objects that
could not have been declared to have register storage because
their address is taken (DR #338).  Those should be expected
to have a stable representation/bit pattern from one read
to the next.

Martin
