Re: [PATCH, aarch64] Fix target/70120

2016-03-25 Thread Jeff Law

On 03/21/2016 11:44 AM, Richard Henderson wrote:

On 03/21/2016 06:40 AM, Jiong Wang wrote:

On 17/03/16 19:17, Richard Henderson wrote:

PR target/70120
* varasm.c (for_each_section): New.
* varasm.h (for_each_section): Declare.
* config/aarch64/aarch64.c (aarch64_align_code_section): New.
(aarch64_asm_file_end): New.
(TARGET_ASM_FILE_END): Redefine.


Will ASM_OUTPUT_POOL_EPILOGUE be a better place to fix this issue? which can
avoid the for_each_section traversal.


It's a good point.  I hadn't noticed this old (and currently unused) hook.
This alternate patch does in fact work.


r~



* config/aarch64/aarch64.c (aarch64_asm_output_pool_epilogue): New.
* config/aarch64/aarch64-protos.h: Declare it.
* config/aarch64/aarch64.h (ASM_OUTPUT_POOL_EPILOGUE): New.
Approved and installed (along with the 3 testcases from the original
submission).


Thanks,
Jeff



Re: [PATCH, aarch64] Fix target/70120

2016-03-21 Thread Jiong Wang

Richard Henderson writes:

> On 03/21/2016 06:40 AM, Jiong Wang wrote:
>> On 17/03/16 19:17, Richard Henderson wrote:
>>> PR target/70120
>>> * varasm.c (for_each_section): New.
>>> * varasm.h (for_each_section): Declare.
>>> * config/aarch64/aarch64.c (aarch64_align_code_section): New.
>>> (aarch64_asm_file_end): New.
>>> (TARGET_ASM_FILE_END): Redefine.
>>
>> Will ASM_OUTPUT_POOL_EPILOGUE be a better place to fix this issue? which can
>> avoid the for_each_section traversal.
>
> It's a good point.  I hadn't noticed this old (and currently unused) hook.
> This alternate patch does in fact work.

Thanks, looks good to me, defer to maintainers.

-- 
Regards,
Jiong


Re: [PATCH, aarch64] Fix target/70120

2016-03-21 Thread Richard Henderson

On 03/21/2016 06:40 AM, Jiong Wang wrote:

On 17/03/16 19:17, Richard Henderson wrote:

PR target/70120
* varasm.c (for_each_section): New.
* varasm.h (for_each_section): Declare.
* config/aarch64/aarch64.c (aarch64_align_code_section): New.
(aarch64_asm_file_end): New.
(TARGET_ASM_FILE_END): Redefine.


Will ASM_OUTPUT_POOL_EPILOGUE be a better place to fix this issue? which can
avoid the for_each_section traversal.


It's a good point.  I hadn't noticed this old (and currently unused) hook.
This alternate patch does in fact work.


r~
* config/aarch64/aarch64.c (aarch64_asm_output_pool_epilogue): New.
* config/aarch64/aarch64-protos.h: Declare it.
* config/aarch64/aarch64.h (ASM_OUTPUT_POOL_EPILOGUE): New.


diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index dced209..58c9d0d 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -429,4 +429,8 @@ bool extract_base_offset_in_addr (rtx mem, rtx *base, rtx *offset);
 bool aarch64_operands_ok_for_ldpstp (rtx *, bool, enum machine_mode);
 bool aarch64_operands_adjust_ok_for_ldpstp (rtx *, bool, enum machine_mode);
 extern bool aarch64_nopcrelative_literal_loads;
+
+extern void aarch64_asm_output_pool_epilogue (FILE *, const char *,
+                                              tree, HOST_WIDE_INT);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cf1239d..732ed70 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5579,6 +5579,18 @@ aarch64_select_rtx_section (machine_mode mode,
   return default_elf_select_rtx_section (mode, x, align);
 }
 
+/* Implement ASM_OUTPUT_POOL_EPILOGUE.  */
+void
+aarch64_asm_output_pool_epilogue (FILE *f, const char *, tree,
+                                  HOST_WIDE_INT offset)
+{
+  /* When using per-function literal pools, we must ensure that any code
+     section is aligned to the minimal instruction length, lest we get
+     errors from the assembler re "unaligned instructions".  */
+  if ((offset & 3) && aarch64_can_use_per_function_literal_pools_p ())
+    ASM_OUTPUT_ALIGN (f, 2);
+}
+
 /* Costs.  */
 
 /* Helper function for rtx cost calculation.  Strip a shift expression
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index ec96ce3..7750d1c 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -928,4 +928,6 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 #define EXTRA_SPECS\
   { "asm_cpu_spec",ASM_CPU_SPEC }
 
+#define ASM_OUTPUT_POOL_EPILOGUE  aarch64_asm_output_pool_epilogue
+
 #endif /* GCC_AARCH64_H */
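
The (offset & 3) check in aarch64_asm_output_pool_epilogue above can be
illustrated outside GCC.  What follows is a minimal sketch, not GCC code:
pool_padding_bytes is a hypothetical helper, and it assumes that
ASM_OUTPUT_ALIGN (f, 2) requests 2^2 = 4-byte alignment, which matches the
"minimal instruction length" mentioned in the patch comment.  It computes how
many padding bytes such an alignment would add after a literal pool that ends
at a given offset.

```c
#include <assert.h>

/* Hypothetical helper (not part of GCC): number of padding bytes that a
   4-byte alignment directive would add after a pool ending at OFFSET.
   Assumes OFFSET is non-negative.  */
static unsigned
pool_padding_bytes (long long offset)
{
  assert (offset >= 0);

  /* (offset & 3) is nonzero exactly when OFFSET is not a multiple of 4,
     which is the condition under which the patch emits the alignment.  */
  unsigned misalign = (unsigned) (offset & 3);
  if (misalign == 0)
    return 0;

  /* Pad up to the next multiple of 4.  */
  return 4 - misalign;
}
```

Under these assumptions, a pool that ends on a multiple of 4 needs no padding,
so the alignment directive is only emitted when it would actually change the
section size.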


Re: [PATCH, aarch64] Fix target/70120

2016-03-21 Thread Jiong Wang

On 17/03/16 19:17, Richard Henderson wrote:

PR target/70120
* varasm.c (for_each_section): New.
* varasm.h (for_each_section): Declare.
* config/aarch64/aarch64.c (aarch64_align_code_section): New.
(aarch64_asm_file_end): New.
(TARGET_ASM_FILE_END): Redefine.


Will ASM_OUTPUT_POOL_EPILOGUE be a better place to fix this issue? which can
avoid the for_each_section traversal.




r~




Re: [PATCH, aarch64] Fix target/70120

2016-03-21 Thread Bernd Schmidt

On 03/17/2016 08:17 PM, Richard Henderson wrote:

With -g, and a code section that ends unaligned, the assembler complains of
"unaligned opcodes detected".  Except there are no such unaligned opcodes, nor
dwarf2 code ranges covering the end of the section, which arguably makes this
an assembler bug.  However, it's reasonably easy to work around in the
compiler, which saves having to bump the required binutils version.

Tested on aarch64-linux.


Ok for the varasm bits.


Bernd


[PATCH, aarch64] Fix target/70120

2016-03-19 Thread Richard Henderson
With -g, and a code section that ends unaligned, the assembler complains of
"unaligned opcodes detected".  Except there are no such unaligned opcodes, nor
dwarf2 code ranges covering the end of the section, which arguably makes this
an assembler bug.  However, it's reasonably easy to work around in the
compiler, which saves having to bump the required binutils version.

Tested on aarch64-linux.
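
The ChangeLog below takes the approach of walking every section at end of
file and aligning those that contain code.  As a rough illustration only, the
following sketch models that callback pattern with toy types.  The names
toy_section, TOY_SECTION_CODE, toy_for_each_section and
toy_align_code_section are all hypothetical; they do not reflect GCC's real
section type or the varasm.c internals, and "aligning" here just rounds a
byte count up to a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of an output section.  */
struct toy_section
{
  const char *name;
  unsigned flags;
  unsigned long size;		/* Bytes emitted so far.  */
};

#define TOY_SECTION_CODE 1u

typedef void (*toy_section_callback) (struct toy_section *);

/* Call CB once for each of the N sections in SECS.  */
static void
toy_for_each_section (struct toy_section *secs, size_t n,
		      toy_section_callback cb)
{
  assert (secs != NULL && cb != NULL);
  for (size_t i = 0; i < n; i++)
    cb (&secs[i]);
}

/* Callback: round the size of a code section up to a multiple of 4.
   Non-code sections are left untouched.  */
static void
toy_align_code_section (struct toy_section *s)
{
  if (s->flags & TOY_SECTION_CODE)
    s->size = (s->size + 3) & ~3ul;
}
```

In this toy model the traversal visits every section, including ones that
contain no code, purely to find the few that need padding.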


r~
PR target/70120
* varasm.c (for_each_section): New.
* varasm.h (for_each_section): Declare.
* config/aarch64/aarch64.c (aarch64_align_code_section): New.
(aarch64_asm_file_end): New.
(TARGET_ASM_FILE_END): Redefine.

testsuite/
* gcc.target/aarch64/pr70120-1.c: New.
* gcc.target/aarch64/pr70120-2.c: New.
* gcc.target/aarch64/pr70120-3.c: New.


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cf1239d..cca9bd9 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -13989,6 +13989,39 @@ aarch64_optab_supported_p (int op, machine_mode, machine_mode,
 }
 }
 
+/* A subroutine of aarch64_asm_file_end.  Callback to align the
+   given section if it contains code.  */
+
+static void
+aarch64_align_code_section (section *s)
+{
+  if (s->common.flags & SECTION_CODE)
+    {
+      switch_to_section (s);
+      ASM_OUTPUT_ALIGN (asm_out_file, 2);
+    }
+}
+
+/* Implement the TARGET_ASM_FILE_END hook.  */
+
+static void
+aarch64_asm_file_end (void)
+{
+  /* When using per-function literal pools, we must ensure that any code
+     section is aligned to the minimal instruction length, lest we get
+     errors from the assembler re "unaligned instructions".  */
+  if (aarch64_can_use_per_function_literal_pools_p ())
+    for_each_section (aarch64_align_code_section);
+
+  /* If a subtarget has already defined this hook, call it.  */
+#ifdef TARGET_ASM_FILE_END
+  TARGET_ASM_FILE_END ();
+#endif
+}
+
+#undef TARGET_ASM_FILE_END
+#define TARGET_ASM_FILE_END aarch64_asm_file_end
+
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST aarch64_address_cost
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr70120-1.c b/gcc/testsuite/gcc.target/aarch64/pr70120-1.c
new file mode 100644
index 000..31a5e94
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr70120-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-Og -fschedule-insns -mno-pc-relative-literal-loads -g" } */
+
+typedef short v32u16 __attribute__ ((vector_size (32)));
+typedef int v32u32 __attribute__ ((vector_size (32)));
+typedef long v32u64 __attribute__ ((vector_size (32)));
+typedef __int128 u128;
+typedef __int128 v32u128 __attribute__ ((vector_size (32)));
+
+int
+foo(int u16_0, int u32_0, int u64_0, u128 u128_0, int u16_1, int u32_1, int u64_1, u128 u128_1, v32u16 v32u16_0, v32u32 v32u32_0, v32u64 v32u64_0, v32u128 v32u128_0, v32u16 v32u16_1, v32u32 v32u32_1, v32u64 v32u64_1, v32u128 v32u128_1)
+{
+  v32u32_1 ^= (v32u32) ~ v32u64_0;
+  v32u32_1 %= (v32u32) - v32u16_1 | 1;
+  v32u16_1 -= (v32u16) v32u16_1;
+  v32u64_0 *= (v32u64){~ u128_0, v32u16_1[5], v32u16_0[15], v32u32_1[4]};
+  v32u16_0 /= (v32u16){0x574c, ~u128_1, v32u128_1[0], u64_1, v32u64_0[1], v32u64_1[2], 0, 0x8ce6, u128_1, 0x5e69} |1;
+  return v32u16_0[0] + v32u16_0[6] + v32u16_0[8] + v32u16_0[9] + v32u32_0[0] + v32u32_0[1] + v32u32_0[2] + v32u32_0[3] + v32u32_0[4] + v32u32_0[6] + v32u64_0[0] + v32u64_0[2] + v32u64_0[3] + v32u128_0[0] + v32u128_0[1] + v32u32_1[0] + v32u32_1[2] + v32u64_1[2] + v32u64_1[3] + v32u128_1[1];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/pr70120-2.c b/gcc/testsuite/gcc.target/aarch64/pr70120-2.c
new file mode 100644
index 000..0110224
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr70120-2.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-Og -freorder-functions -g3 -mcmodel=large" } */
+
+typedef short v32u16 __attribute__ ((vector_size (32)));
+typedef int v32u32 __attribute__ ((vector_size (32)));
+typedef long v32u64 __attribute__ ((vector_size (32)));
+typedef __int128 u128;
+typedef __int128 v32u128 __attribute__ ((vector_size (32)));
+
+int
+foo (int u16_0, int u32_0, int u64_0, u128 u128_0, int u16_1, int u32_1, v32u16 v32u16_0, v32u32 v32u32_0, v32u64 v32u64_0, v32u128 v32u128_0, v32u16 v32u16_1, v32u32 v32u32_1, v32u64 v32u64_1, v32u128 v32u128_1)
+{
+  u128_0 <<= 0x6c;
+  v32u16_1 %= (v32u16) { 1, 64, 0xf294, 0, u32_1, v32u32_1[6], ~u128_0, 0x2912, v32u32_0[2]} | 1;
+  v32u16_0 ^= (v32u16){-v32u16_1[11], -u32_1, 64, ~u128_0, 0, 1, 64, ~u64_0, 0};
+  return u16_0 + u32_0 + u16_1 + v32u16_0[0] + v32u32_0[1] + v32u32_0[2] + v32u32_0[4] + v32u32_0[6] + v32u64_0[0] + v32u64_0[1] + v32u64_0[2] + v32u64_0[3] + v32u128_0[0] + v32u128_0[1] + v32u16_1[0] + v32u32_1[7] + v32u64_1[0] + v32u64_1[1] + v32u64_1[2] + v32u64_1[3] + v32u128_1[0] + v32u128_1[1];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/pr70120-3.c b/gcc/testsuite/gcc.target/aarch64/pr70120-3.c
new file mode 100644