[Bug tree-optimization/39075] alignment for unsigned short a[10000] vs extern unsigned short a[10000]
--- Comment #1 from dann at godzilla dot ics dot uci dot edu 2009-02-02 14:50 ---
This code:

    unsigned short a[10000];
    void test()
    {
      int i;
      for (i = 0; i < 10000; ++i)
        a[i] = 5;
    }

will be vectorized with -O3 -march=core2 to this:

    .L2:
            movdqa  %xmm0, a(%eax)
            addl    $16, %eax
            cmpl    $20000, %eax
            jne     .L2

but this one:

    extern unsigned short a[10000];
    void test()
    {
      int i;
      for (i = 0; i < 10000; ++i)
        a[i] = 5;
    }

will get a lot of extra code before the loop because the vectorizer thinks it
needs to do peeling for alignment:

    test.c:7: note: Alignment of access forced using peeling.

Intel's compiler does not generate the extra peeling code.

--

dann at godzilla dot ics dot uci dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|alignment for unsigned      |alignment for unsigned
                   |short a[10000               |short a[10000] vs extern
                   |                            |unsigned short a[10000]

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39075
--- Comment #2 from rguenth at gcc dot gnu dot org 2009-02-02 14:53 ---
The ABI does not guarantee alignment bigger than 2 for the external array.
The vectorizer adjusts the alignment for the internal one.

--

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39075
--- Comment #3 from rguenth at gcc dot gnu dot org 2009-02-02 14:55 ---
Err, it seems at least the x86_64 ABI guarantees alignment of 16 bytes for
arrays bigger than 16 bytes (including variable length arrays).

--

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39075
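
The thread does not propose a workaround for the extern case. One possible approach is sketched here, assuming GCC's `aligned` variable attribute is applied consistently to both the declaration and the definition; whether the vectorizer then skips the peeling has not been checked against this bug. The array size `N` and the helpers `is_aligned` and `check` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 10000

/* Declaration as it might appear in a shared header. The aligned
   attribute promises 16-byte alignment to every translation unit
   that sees it (an assumption of this sketch, not part of the bug
   report). */
extern unsigned short a[N] __attribute__((aligned(16)));

/* The definition must carry the same alignment, otherwise the
   promise made by the declaration would be false. */
unsigned short a[N] __attribute__((aligned(16)));

/* Same loop as in comment #1. */
void test(void)
{
    int i;
    for (i = 0; i < N; ++i)
        a[i] = 5;
}

/* Hypothetical helper: returns nonzero if p is a multiple of align. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical helper: verifies at run time that the array really is
   16-byte aligned and that the loop filled it. */
void check(void)
{
    assert(is_aligned(a, 16));
    test();
    assert(a[0] == 5 && a[N - 1] == 5);
}
```

Compiling this with -O3 and the vectorizer's dump output enabled would be one way to see whether the "Alignment of access forced using peeling" note disappears; that result is not confirmed here.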