Hi!
This patch started with noticing while working on PR50596 that
#define N 1024
long long a[N];
char b[N];
void
foo (void)
{
  int i;
  for (i = 0; i < N; i++)
    b[i] = a[i];
}
is even with -O3 -mavx2 vectorized just with 16-byte vectors
instead of 32-byte vectors and has various fixes I've
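
For anyone who wants to reproduce the observation, here is a minimal sketch of the test case with a correctness check attached. The check_foo helper and its fill pattern are hypothetical additions, not part of the original report. Compiling with something like gcc -O3 -mavx2 -S and looking at whether foo uses xmm or ymm registers should show which vector size was chosen.

```c
#include <stddef.h>

#define N 1024
long long a[N];
char b[N];

/* The loop from the report: each 64-bit element is narrowed to 8 bits.  */
void
foo (void)
{
  int i;
  for (i = 0; i < N; i++)
    b[i] = a[i];
}

/* Hypothetical checker (not in the original message): fill a[] with a
   known pattern, run foo, and count elements of b[] that differ from
   the plainly converted value.  Returns 0 when everything matches.  */
int
check_foo (void)
{
  int i, bad = 0;

  for (i = 0; i < N; i++)
    a[i] = (long long) i * 0x0101010101LL + 7;

  foo ();

  for (i = 0; i < N; i++)
    if (b[i] != (char) a[i])
      bad++;

  return bad;
}
```

A main that simply returns check_foo () != 0 is enough to turn this into a runtime test; the check only confirms the result is correct, not which vector width the compiler picked.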
On 10/12/2011 09:09 AM, Jakub Jelinek wrote:
/* Multiply the shuffle indicies by two. */
- emit_insn (gen_avx2_lshlv8si3 (t1, t1, const1_rtx));
+ if (maskmode == V8SImode)
+ emit_insn (gen_avx2_lshlv8si3 (t1, t1, const1_rtx));
+ else
+ emit_insn
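
As an aside on why the shuffle indices get multiplied by two: when a permutation of 64-bit elements is carried out by a permute that works on 32-bit elements, 64-bit element m occupies 32-bit slots 2*m and 2*m+1. The scalar model below is only a sketch of that index arithmetic; the function names are hypothetical and it is not the code from the patch. It also masks the index explicitly, which the real expander may not need to do.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of a variable permute over eight 32-bit elements:
   dst[i] = src[idx[i] & 7].  */
static void
permute8x32 (uint32_t dst[8], const uint32_t src[8], const uint32_t idx[8])
{
  int i;
  for (i = 0; i < 8; i++)
    dst[i] = src[idx[i] & 7];
}

/* Turn four 64-bit element indices into eight 32-bit element indices:
   replicate each index, multiply by two, add one to the odd slots.  */
static void
widen_indices (uint32_t out[8], const uint64_t mask[4])
{
  int i;
  for (i = 0; i < 4; i++)
    {
      uint32_t m = (uint32_t) (mask[i] & 3);
      out[2 * i] = m * 2;
      out[2 * i + 1] = m * 2 + 1;
    }
}

/* Return 1 if permuting four 64-bit elements directly gives the same
   result as permuting the same bytes as eight 32-bit elements with the
   widened indices, 0 otherwise.  */
int
permute4x64_matches (const uint64_t src[4], const uint64_t mask[4])
{
  uint64_t want[4], got[4];
  uint32_t s32[8], d32[8], idx[8];
  int i;

  for (i = 0; i < 4; i++)
    want[i] = src[mask[i] & 3];

  memcpy (s32, src, sizeof s32);
  widen_indices (idx, mask);
  permute8x32 (d32, s32, idx);
  memcpy (got, d32, sizeof got);

  return memcmp (want, got, sizeof want) == 0;
}
```

The same doubling applies whatever the mask vector's element count; only the number of elements being shifted changes.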
On Wed, Oct 12, 2011 at 10:49:33AM -0700, Richard Henderson wrote:
I believe I've commented on everything else in the previous messages.
Here is an updated patch which should incorporate your comments from both
mails (thanks for them). Bootstrapped/regtested on x86_64-linux and
i686-linux, ok
On 10/12/2011 02:23 PM, Jakub Jelinek wrote:
2011-10-12  Jakub Jelinek  <ja...@redhat.com>

	* config/i386/i386.md (UNSPEC_VPERMDI): Remove.
	* config/i386/i386.c (ix86_expand_vec_perm): Handle
	V16QImode and V32QImode for TARGET_AVX2.
	(MAX_VECT_LEN): Increase to 32.