On Tue, Jan 14, 2014 at 10:37 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Tue, Jan 14, 2014 at 6:09 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>
>>> On a second thought, the crossing of 16-byte boundaries is mentioned
>>> for the data *access* (the instruction itself) if it is not naturally
>>> aligned (please see example 3-40 and fig 3-2), which is *NOT* in our
>>> case.
>>>
>>> So, we don't have to align 32 byte structures in any way for newer
>>> processors, since this optimization applies to 64+ byte (larger or
>>> equal to cache line size) structures only. Older processors are
>>> handled correctly, modulo nocona, where its cache line size value has
>>> to be corrected.
>>>
>>> Following that, my original patch implements this optimization in the
>>> correct way.
>>
>> Sorry for catching this late, but on the 4.8 and earlier branches
>> there is no opt argument and thus any ix86_data_alignment change is
>> unfortunately an ABI change. So I'd think we should revert
>> r206433 and r206436. And for the trunk we need to ensure even for
>
> OK, let's play safe. I'll revert these two changes (modulo size of
> nocona prefetch block).
>
>> opt we never return a smaller number from ix86_data_alignment than
>> we did in 4.8 and earlier, because otherwise if you have 4.8 compiled
>> code that assumes the alignment 4.8 would use for something that is defined
>> in a compilation unit built by gcc 4.9+, if we don't align it at least
>> as much as we did in the past, the linked mix of 4.8 user and 4.9 definition
>> could misbehave.
>
> From 4.9 onwards, we would like to align >= 64byte structures on
> 64byte boundary. Should we add a compatibility rule to align >= 32byte
> structures to 32 bytes?

That is why we issue a warning when alignment was changed with AVX support:

[hjl@gnu-6 tmp]$ cat a1.i
typedef long long __m256i __attribute__ ((__vector_size__ (32), __may_alias__));
extern __m256i y;
void
f1(__m256i x)
{
  y = x;
}
[hjl@gnu-6 tmp]$ gcc -S a1.i
a1.i: In function ‘f1’:
a1.i:4:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
 f1(__m256i x)
 ^
a1.i:4:1: warning: AVX vector argument without AVX enabled changes the ABI [enabled by default]
[hjl@gnu-6 tmp]$

> Please also note that in 4.7 and 4.8, we have
>
> int max_align = optimize_size ? BITS_PER_WORD : MIN (256,
> MAX_OFILE_ALIGNMENT);
>
> so, in effect -Os code will be incompatible with other optimization levels.
>
> I guess that for 4.7 and 4.8, we should revert to this anyway, but
> what to do with 4.9?
>
> Uros.

-- 
H.J.
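
For readers following the thread, the alignment rules under discussion can be modelled in a few lines of C. This is only a toy sketch, not GCC's ix86_data_alignment. Every toy_ name is hypothetical. Sizes and alignments are in bits. TOY_BITS_PER_WORD assumes a 64-bit target, and TOY_MAX_OFILE_ALIGNMENT is a placeholder value. The "raise to the cap once the object is at least that large" rule is an assumption inferred from the discussion, and the 4.9 variant is just one possible reading of the proposed compatibility rule.

```c
#include <assert.h>

/* Hypothetical stand-ins (in bits); these are NOT GCC's real macros. */
#define TOY_BITS_PER_WORD       64            /* assumes a 64-bit target */
#define TOY_MAX_OFILE_ALIGNMENT (1 << 15)     /* placeholder value */
#define TOY_MIN(a, b)           ((a) < (b) ? (a) : (b))

/* Cap as quoted in the thread for 4.7 and 4.8. */
static int toy_max_align_48(int optimize_size)
{
    return optimize_size ? TOY_BITS_PER_WORD
                         : TOY_MIN(256, TOY_MAX_OFILE_ALIGNMENT);
}

/* Assumed rule: an object at least `cap` bits large is aligned to `cap`. */
static int toy_raise(int size_bits, int align, int cap)
{
    return (size_bits >= cap && align < cap) ? cap : align;
}

/* Toy model of the 4.8 behaviour. */
static int toy_align_48(int size_bits, int align, int optimize_size)
{
    return toy_raise(size_bits, align, toy_max_align_48(optimize_size));
}

/* Toy model of one possible 4.9 rule: keep the 32-byte compatibility
   floor, then apply the 64-byte (cache line) rule on top of it. */
static int toy_align_49(int size_bits, int align)
{
    align = toy_raise(size_bits, align, 256);   /* compatibility floor */
    align = toy_raise(size_bits, align, 512);   /* cache line rule */
    return align;
}

/* Check that the 4.9 model never returns less than the non -Os 4.8 model. */
static void toy_check_compat(void)
{
    for (int size = 8; size <= 1024; size += 8)
        for (int align = 8; align <= 128; align *= 2)
            assert(toy_align_49(size, align) >= toy_align_48(size, align, 0));
}
```

In this model a 32-byte object with 64-bit natural alignment gets 256 bits without -Os but only 64 bits with -Os, which is the incompatibility Uros points out. Calling toy_check_compat() from a small main() exercises the "never return a smaller number than 4.8" requirement for the non -Os case only.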