On Sun, Mar 27, 2011 at 3:44 PM, H.J. Lu <hjl.to...@gmail.com> wrote:

> Here is a patch to split AVX 32-byte unaligned load/store:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00743.html
>
> It speeds up some SPEC CPU 2006 benchmarks by up to 6%.
> OK for trunk?

> 2011-02-11  H.J. Lu  <hongjiu...@intel.com>
>
>       * config/i386/i386.c (flag_opts): Add -mavx256-split-unaligned-load
>       and -mavx256-split-unaligned-store.
>       (ix86_option_override_internal): Split 32-byte AVX unaligned
>       load/store by default.
>       (ix86_avx256_split_vector_move_misalign): New.
>       (ix86_expand_vector_move_misalign): Use it.
>
>       * config/i386/i386.opt: Add -mavx256-split-unaligned-load and
>       -mavx256-split-unaligned-store.
>
>       * config/i386/sse.md (*avx_mov<mode>_internal): Verify unaligned
>       256bit load/store.  Generate unaligned store on misaligned memory
>       operand.
>       (*avx_movu<ssemodesuffix><avxmodesuffix>): Verify unaligned
>       256bit load/store.
>       (*avx_movdqu<avxmodesuffix>): Likewise.
>
>       * doc/invoke.texi: Document -mavx256-split-unaligned-load and
>       -mavx256-split-unaligned-store.
>
> gcc/testsuite/
>
> 2011-02-11  H.J. Lu  <hongjiu...@intel.com>
>
>       * gcc.target/i386/avx256-unaligned-load-1.c: New.
>       * gcc.target/i386/avx256-unaligned-load-2.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-load-3.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-load-4.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-load-5.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-load-6.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-load-7.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-2.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-5.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-6.c: Likewise.
>       * gcc.target/i386/avx256-unaligned-store-7.c: Likewise.
>



> @@ -203,19 +203,37 @@
>        return standard_sse_constant_opcode (insn, operands[1]);
>      case 1:
>      case 2:
> +      if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
> +       && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
> +            && MEM_P (operands[0])
> +            && MEM_ALIGN (operands[0]) < 256)
> +           || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
> +               && MEM_P (operands[1])
> +               && MEM_ALIGN (operands[1]) < 256)))
> +     gcc_unreachable ();

Please use "misaligned_operand (operands[...], <MODE>mode)" instead of
MEM_P && MEM_ALIGN combo in a couple of places.
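
For illustration, the check quoted above might read roughly as follows with
that substitution. This is only a sketch of the suggestion, not the
committed patch. It assumes the i386 misaligned_operand predicate matches a
MEM whose MEM_ALIGN is below the alignment of the given mode, so that one
call stands in for each MEM_P/MEM_ALIGN pair:

```c
      /* Sketch only: same condition as the quoted hunk, with each
         MEM_P && MEM_ALIGN pair folded into misaligned_operand.  */
      if (GET_MODE_ALIGNMENT (<MODE>mode) == 256
          && ((TARGET_AVX256_SPLIT_UNALIGNED_STORE
               && misaligned_operand (operands[0], <MODE>mode))
              || (TARGET_AVX256_SPLIT_UNALIGNED_LOAD
                  && misaligned_operand (operands[1], <MODE>mode))))
        gcc_unreachable ();
```

Since the outer test already restricts this to modes with 256-bit
alignment, comparing against the mode's alignment is the same as the
explicit "< 256" in the original hunk.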

OK with that change.

Thanks,
Uros.
