[Bug c++/32016] sizeof(class) always a multiple of 4 on 32 bit machine

bliss1940-bbs at yahoo dot com Mon, 21 May 2007 14:19:09 -0700


------- Comment #2 from bliss1940-bbs at yahoo dot com  2007-05-21 22:18 -------
(In reply to comment #1)


I don't think I said GCC was in error, but just different.

Maybe we can come to an agreement here, or maybe not.  Let's see.

I certainly would expect the ARM7 would prefer that 4 byte operands (ints, in
this case) would be on addresses that are a multiple of 4, 2 byte operands
(shorts) would want to be at addresses that are a multiple of two, and 1 byte
operands (chars) have no alignment requirement so could go anywhere.
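
For anyone who wants to check this on their own target, here is a minimal
sketch that asks the compiler directly.  The function name
print_primitive_alignment is only an illustration, and the numbers it prints
depend on the target and its ABI.

```cpp
#include <cstdio>

// Print the size and alignment the compiler uses for the three primitive
// types discussed above.  Only alignof(char) == 1 is the same everywhere;
// the values for short and int depend on the target.
void print_primitive_alignment()
{
    std::printf("char : size %zu, align %zu\n", sizeof(char),  alignof(char));
    std::printf("short: size %zu, align %zu\n", sizeof(short), alignof(short));
    std::printf("int  : size %zu, align %zu\n", sizeof(int),   alignof(int));
}
```

Calling print_primitive_alignment() from main on each compiler gives the
per-type requirements to compare against the struct sizes further down.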

I would expect to see that somewhere in the ABI.  I would also expect to see
that in the ABI for the Intel Pentium.  I would not expect to see anything in
the ABI concerning structs as the hardware knows nothing about them.  They are
the figment of the imagination of the compiler and programmer.

So I would expect the Microsoft compiler targeting the PC and the GNU compiler
targeting the ARM7 would align shorts and ints accordingly.  Indeed, that's
what I see when I examine the code.

By the way, there is a good explanation of this on Wikipedia.  It mentions that
compilers usually arrange structs so the primitive objects they contain are at
appropriate addresses.  It explains why it's more efficient for the hardware,
and it also explains how the GNU and Microsoft compilers do it when targeting
the PC.  So far, so good.
http://en.wikipedia.org/wiki/Data_structure_alignment

But the GNU compiler does something else.  It makes the size of all structs a
multiple of 4.  I can think of one reason to do this.  GNU compiled code can
copy any struct by accessing memory 4 bytes at a time.  That is, it can
consider every struct to be composed of ints, and they will all be aligned
correctly.  If this is a good enough reason to make all structs a multiple of
4, then so be it.  But it's confusing to mention the ABI, because the ABI has
nothing to do with such things.  And it's confusing to mention data alignment
because it's an artificial requirement due to the way GNU copies structs.  It
has nothing to do with the struct data members as seen by the programmer.
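
To make the guess concrete, here is a sketch of the kind of word-at-a-time copy
I have in mind.  It is not taken from GCC and I am not claiming GCC generates
anything like it; copy_by_words and check_copy_by_words are hypothetical names,
and a 4 byte word is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy `size` bytes one 32-bit word at a time.
// Returns false and copies nothing when size is not a multiple of 4,
// which is exactly the case that padding every struct would rule out.
bool copy_by_words(void* dst, const void* src, std::size_t size)
{
    if (size % sizeof(std::uint32_t) != 0)
        return false;

    auto* d = static_cast<unsigned char*>(dst);
    auto* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < size; i += sizeof(std::uint32_t)) {
        std::uint32_t word;
        std::memcpy(&word, s + i, sizeof word);  // load one word
        std::memcpy(d + i, &word, sizeof word);  // store one word
    }
    return true;
}

// Small self-check: an 8 byte buffer copies, a 5 byte one is refused.
void check_copy_by_words()
{
    unsigned char a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    unsigned char b[8] = {};
    assert(copy_by_words(b, a, sizeof a));
    assert(std::memcmp(a, b, sizeof a) == 0);
    assert(!copy_by_words(b, a, 5));
}
```

A copy like this never needs a byte-sized tail loop, but only if every object
it is handed has a size that is a multiple of 4.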

Here are some examples of structs compiled by Microsoft for the PC, and also
compiled by GNU for the ARM7 (Analog Devices ADuC7024).  These are the only
compilers I currently have.

                                    Microsoft                  GNU
      struct Char1   {             sizeof() == 1        sizeof() == 4
         char char1;
         };

      struct Char2   {             sizeof() == 2        sizeof() == 4
         char char1;
         char char2;
         };

      struct Char3   {             sizeof() == 3        sizeof() == 4
         char char1;
         char char2;
         char char3;
         };

      struct Char4   {             sizeof() == 4        sizeof() == 4
         char char1;
         char char2;
         char char3;
         char char4;
         };

      struct Char5   {             sizeof() == 5        sizeof() == 8
         char char1;
         char char2;
         char char3;
         char char4;
         char char5;
         };

Note that the above structs only have chars so by definition they are aligned
correctly.  But Microsoft's structs are only as large as necessary; GNU's are
padded to a multiple of 4 bytes.  This has nothing to do with the alignment of
the data members as they have no particular alignment requirement.  But as
mentioned before, GNU can copy any struct accessing memory 4 bytes at a time. 
There is no other reason I can think of to pad these structs to a multiple of 4
bytes.
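
One thing the language does guarantee is that sizeof(T) is a multiple of
alignof(T), because array elements are laid out with no gaps between them.  So
printing both values shows whether a compiler pads these structs because it
gives the struct itself an alignment larger than 1.  A minimal sketch follows;
print_char_struct_sizes is a made-up name and the output varies by compiler
and target.

```cpp
#include <cstdio>

// Three of the char-only structs from the table above.
struct Char1 { char char1; };
struct Char3 { char char1; char char2; char char3; };
struct Char5 { char char1; char char2; char char3; char char4; char char5; };

// Print size and alignment side by side.  If alignof reports 4, the
// compiler has given the struct a 4 byte alignment and sizeof must then
// be rounded up to a multiple of 4.
void print_char_struct_sizes()
{
    std::printf("Char1: sizeof %zu, alignof %zu\n", sizeof(Char1), alignof(Char1));
    std::printf("Char3: sizeof %zu, alignof %zu\n", sizeof(Char3), alignof(Char3));
    std::printf("Char5: sizeof %zu, alignof %zu\n", sizeof(Char5), alignof(Char5));
}
```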



Here are some structs that contain shorts (2 byte) to show what data alignment
means to me, to the wiki, to the hardware, and I think also to most people.

                                    Microsoft                  GNU
   class Short1   {                sizeof() == 4        sizeof() == 4
      char char1;
      short short1;
      };

   class Short2   {                sizeof() == 6        sizeof() == 8
      char char1;
      short short1;
      short short2;
      };

Here a hole is inserted after the char.  This is required to align the 2 byte
short on the appropriate address.  Both Microsoft and GNU agree, as does almost
everyone.  This is an ABI issue.  But notice the second class, Short2.  I added
a second short which will automatically have the correct alignment because it
immediately follows another short.  It doesn't require padding to be aligned
correctly.  Everyone agrees on this.  But the Microsoft compiler gives the size
as 6.  As always, the GNU compiler adds two pad bytes to make the struct have a
size which is a multiple of 4.  But the struct doesn't require this.  Neither
would a following struct in an array of structs, as long as the compiler didn't
try accessing these structs as if they contained ints.  The only alignment
requirement obvious to the programmer is that the shorts be on addresses that
are a multiple of two bytes.
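
The member offsets and the array stride can be printed directly.  This is only
a sketch: the two types are redeclared as structs so the members are public and
offsetof can name them, print_short_layout is a hypothetical name, and the
numbers depend on the compiler and target.

```cpp
#include <cstddef>
#include <cstdio>

// Same members as the Short1 and Short2 classes above, declared as structs
// so that offsetof can be applied to the (now public) members.
struct Short1 { char char1; short short1; };
struct Short2 { char char1; short short1; short short2; };

// Show where each short lands, how big each struct is, and how far apart
// two consecutive Short2 objects sit in an array.
void print_short_layout()
{
    std::printf("Short1: short1 at %zu, sizeof %zu, alignof %zu\n",
                offsetof(Short1, short1), sizeof(Short1), alignof(Short1));
    std::printf("Short2: short1 at %zu, short2 at %zu, sizeof %zu, alignof %zu\n",
                offsetof(Short2, short1), offsetof(Short2, short2),
                sizeof(Short2), alignof(Short2));
    std::printf("Short2[2] occupies %zu bytes\n", sizeof(Short2[2]));
}
```

If the shorts sit at even offsets and sizeof(Short2) is even, every short in an
array of Short2 is on an even address; any padding beyond that comes from the
alignment the compiler assigns to the struct as a whole.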

To beat a dead horse again, the data members are suitably aligned in
Microsoft's struct.  For the GNU folk to say the extra padding is for data
alignment or is an ABI issue is misleading at best.  It's only an alignment/ABI
issue if the compiler makes it so by generating code that accesses the structs
as if they contained ints, probably for quick copying.  But if that's the case,
I wish they would say that.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32016
