[Bug c/50521] -fstrict-volatile-bitfields is not strict

2011-10-29 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50521

--- Comment #14 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-29 10:45:53 UTC ---
(In reply to comment #13)

> > See 7.1.7.5 second and third paragraph and the note just after.
>
> Does that mean a statement
>   a = b;
> should always be treated as if it were
>   tmp = b;
>   a = tmp;
> two individual statements?

That's my understanding of the text.

Further, given

struct {
   volatile int bits:32;
} a;
int result;

my understanding is that the note means that

  result = ++a.bits;

should be translated into

  int tmp = a.bits;
  a.bits = tmp + 1;
  result = a.bits;

and result = a.bits++ into

  result = a.bits;
  tmp = a.bits;
  a.bits = tmp + 1;

suitably expanded into aligned container loads and stores on each access to
a.bits, with each access to the int container of a handled as a volatile access.
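The two expansions above can be sketched in plain C with counted accesses. This is only an illustration of my reading of the note, not what GCC emits: a_bits, read_a_bits() and write_a_bits() are hypothetical stand-ins for the volatile bit-field and its accesses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a.bits. Every access goes through the two
 * helpers below so that the number of accesses can be counted. */
static uint32_t a_bits;
static unsigned n_reads, n_writes;

static uint32_t read_a_bits(void)        { n_reads++;  return a_bits; }
static void     write_a_bits(uint32_t v) { n_writes++; a_bits = v; }

/* result = ++a.bits;  read, write, then read again for the result. */
static uint32_t pre_increment(void)
{
    uint32_t tmp = read_a_bits();
    write_a_bits(tmp + 1);
    return read_a_bits();
}

/* result = a.bits++;  read for the result, read again, then write. */
static uint32_t post_increment(void)
{
    uint32_t result = read_a_bits();
    uint32_t tmp = read_a_bits();
    write_a_bits(tmp + 1);
    return result;
}

/* Both forms touch a.bits three times: two reads and one write. */
static void check_increment_expansions(void)
{
    a_bits = 5; n_reads = 0; n_writes = 0;
    assert(pre_increment() == 6);
    assert(n_reads == 2 && n_writes == 1);

    n_reads = 0; n_writes = 0;
    assert(post_increment() == 6);
    assert(a_bits == 7);
    assert(n_reads == 2 && n_writes == 1);
}
```

In both cases no access is merged with another, which is the point of treating each one as volatile.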

Also, per the second sentence of the note, the load of the container on
write may not be optimized away even if it's entirely masked by the write
operation. I.e.

a.bits = x;

translates into

  int tmp = *(int *)(int aligned address of a.bits);
  tmp &= ~0xFFFFFFFF;
  tmp |= x;
  *(int *)(int aligned address of a.bits) = tmp;

which is the same load and store memory access sequence as used when a.bits
does not fill the entire container:

  int tmp = *(int *)(int aligned address of a.bits);
  tmp &= ~a_bits_mask;
  tmp |= (x << shift) & a_bits_mask;
  *(int *)(int aligned address of a.bits) = tmp;

where it's not allowed to optimize away the initial load even if the result of
that load is entirely masked away by the bit-field assignment (32 bit
~0xFFFFFFFF == 0). Operations on tmp between the load and store of the a.bits
container may be optimized freely, as tmp itself is not volatile.
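As a compilable version of the sequence above, here is a minimal sketch that assumes a 32-bit container and a field described by a shift and a width. The helper name bitfield_store and its parameters are mine, not taken from the AAPCS or from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Store x into a field of `width` bits starting at bit `shift` of a
 * 32-bit container. The container is accessed through a volatile
 * pointer, so exactly one load and one store are performed, even when
 * the field covers the whole container and the loaded value is fully
 * masked away. */
static void bitfield_store(volatile uint32_t *container,
                           unsigned shift, unsigned width, uint32_t x)
{
    assert(width >= 1 && width <= 32 && shift <= 32 - width);

    /* Avoid the undefined 1u << 32 for a full-width field. */
    uint32_t low_mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    uint32_t mask = low_mask << shift;

    uint32_t tmp = *container;      /* volatile load, must not be elided */
    tmp &= ~mask;                   /* clear the field (plain arithmetic) */
    tmp |= (x << shift) & mask;     /* insert the new value */
    *container = tmp;               /* volatile store */
}
```

Only the two dereferences of container are volatile accesses; the mask arithmetic on tmp in between is ordinary code that the compiler may simplify.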

> I think STRICT_ALIGNMENT is not only for ARM, but also MIPS, SH and others.
> I'll create new ticket later about STRICT_ALIGNMENT.

Agreed.

None of this is specific to ARM, I think. But the ARM AAPCS specification is
suitable as a reference for the implementation, as it's very detailed on how to
operate on volatile bit-fields and also on alignment requirements for
bit-field accesses in general. I'm not sure the other ABIs are as detailed, and
it's very likely the rules from the ARM specification can be applied there as
well.


[Bug c/50521] -fstrict-volatile-bitfields is not strict

2011-10-28 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50521

--- Comment #9 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-28 07:32:48 UTC ---
The C standard does not define any of this. It's all implementation and
platform ABI dependent.


The C standard does not define the storage size of a bit-field other than that
it's sufficiently large, nor bit-fields of types other than _Bool and int
(+qualifiers), nor whether bits outside the specific bit-field are accessed as
a side effect when operating on a bit-field.


For example, the ARM ABI defines volatile bit-field memory access in full
detail as an access the size of the base type of the bit-field, and I see now
that it actually requires a double load in the mentioned test case. See
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042d/IHI0042D_aapcs.pdf
section 7.1.7.5.


[Bug c/50521] -fstrict-volatile-bitfields is not strict

2011-10-28 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50521

--- Comment #12 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-28 17:46:15 UTC ---
Regarding the double load: in a statement like a = b, both a and b should be
individually accessed even if they refer to the same storage. So
bitfield.bits.a = bitfield.bits.c should load the bitfield variable twice, once
for reading the rvalue and once for masking the lvalue assignment.

See 7.1.7.5 second and third paragraph and the note just after.
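A small sketch of that double load, with the container modelled as a counted variable so the accesses can be observed. The field layout (a in bits 0..7, c in bits 16..23) and all names are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container holding both bit-fields, with counted accesses. */
static uint32_t reg;
static unsigned reg_loads, reg_stores;

static uint32_t load_reg(void)        { reg_loads++;  return reg; }
static void     store_reg(uint32_t v) { reg_stores++; reg = v; }

/* Assumed layout: a occupies bits 0..7, c occupies bits 16..23. */
enum { A_SHIFT = 0, C_SHIFT = 16 };
#define FIELD_MASK 0xFFu

/* bitfield.bits.a = bitfield.bits.c; */
static void assign_a_from_c(void)
{
    /* First load: read the rvalue c. */
    uint32_t c = (load_reg() >> C_SHIFT) & FIELD_MASK;

    /* Second load: fetch the container again to mask in the lvalue a. */
    uint32_t tmp = load_reg();
    tmp &= ~(FIELD_MASK << A_SHIFT);
    tmp |= c << A_SHIFT;
    store_reg(tmp);
}

static void check_double_load(void)
{
    reg = 0x00AB0000u; reg_loads = 0; reg_stores = 0;
    assign_a_from_c();
    assert(reg == 0x00AB00ABu);
    assert(reg_loads == 2 && reg_stores == 1);
}
```

A non-strict implementation would reuse the first loaded value for the masking step, giving one load instead of two.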


Regarding STRICT_ALIGNMENT: it is not strictly needed on ARM, I think. Accesses
smaller than the base type are acceptable as long as they are aligned to the
matching access size (8/16/32/64 bit), and on ARMv7 unaligned access is
allowed, but at a performance penalty. And this change is technically unrelated
to strict-volatile-bitfields, even if there is overlap.
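To make the alignment condition concrete, here is a rough sketch that finds the smallest naturally aligned access able to cover a given bit-field. The function name and the bit-offset convention (counted from the start of the struct) are assumptions for illustration only, and this is not a description of what GCC actually does.

```c
#include <assert.h>

/* Return the smallest access size in bits (8, 16, 32 or 64) such that a
 * naturally aligned unit of that size contains the whole field, or 0 if
 * even an aligned 64-bit unit cannot cover it. */
static unsigned smallest_aligned_access(unsigned bit_offset, unsigned bit_width)
{
    assert(bit_width >= 1);

    unsigned last_bit = bit_offset + bit_width - 1;
    for (unsigned size = 8; size <= 64; size *= 2) {
        /* First and last bit fall in the same aligned unit of `size`. */
        if (bit_offset / size == last_bit / size)
            return size;
    }
    return 0;
}
```

For example, a 4-bit field at bit offset 6 straddles a byte boundary, so the smallest suitable access is an aligned 16-bit one.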


[Bug c/48190] [regression?] Huge memory use while compiling qemu-0.4.0

2011-10-27 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48190

Henrik Nordström henrik at henriknordstrom dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |henrik at henriknordstrom
                   |                            |dot net

--- Comment #2 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-27 23:10:20 UTC ---
Problem still exists in 4.6.2.

A patch for this was committed to trunk on 29 Mar 2011 / revision 171655.


[Bug c/50521] -fstrict-volatile-bitfields is not strict

2011-10-27 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50521

Henrik Nordström henrik at henriknordstrom dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |henrik at henriknordstrom
                   |                            |dot net

--- Comment #5 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-27 23:23:13 UTC ---
Is this related to the strict volatile bitfields change in trunk revision
171347?
http://gcc.gnu.org/viewcvs/trunk/gcc/expr.c?view=log&pathrev=171347

The changes are quite similar, but at slightly different locations.


[Bug c/48190] [regression?] Huge memory use while compiling qemu-0.4.0

2011-10-27 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48190

--- Comment #3 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-27 23:26:05 UTC ---
Created attachment 25640
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25640
trunk change 171655 backported to 4.6.


[Bug c/50521] -fstrict-volatile-bitfields is not strict

2011-10-27 Thread henrik at henriknordstrom dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50521

--- Comment #7 from Henrik Nordström henrik at henriknordstrom dot net 
2011-10-28 01:59:34 UTC ---
Right.  r171347 seems to be about fetches from bitfields while this change is
about stores?

An interesting test would be 

  bitfield.bits.a = bitfield.bits.c

which should load the int to a register, load the int again to another
register, copy c to a between them and store the result.  I guess the double
load may be optimized away as it's a side effect of the assignment.