Re: ping**3 [PATCH, ARM] Cortex-A9 MPCore volatile load workaround

2015-05-27 Thread Kyrill Tkachov

Hi Sandra, Chung-Lin,

A couple of comments from me,

On 26/05/15 20:10, Sandra Loosemore wrote:

Chung-Lin posted this patch last year but it seems never to have been
reviewed:

https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html

I've just re-applied and re-tested it and it still seems to be good.
Can somebody please take a look at it?

-Sandra



+mfix-cortex-a9-volatile-hazards
+Target Report Var(fix_a9_volatile_hazards) Init(0)
+Avoid errata causing read-after-read hazards for concurrent volatile
+accesses on Cortex-A9 MPCore processors.

s/errata/erratum/


+;; Thumb-2 version allows conditional execution
+(define_insn *memory_barrier_t2
+  [(set (match_operand:BLK 0  )
+   (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+  TARGET_HAVE_MEMORY_BARRIER  TARGET_THUMB2
+  {
+if (TARGET_HAVE_DMB)
+  {
+   /* Note we issue a system level barrier. We should consider issuing
+  a inner shareabilty zone barrier here instead, ie. DMB ISH.  */
+   /* ??? Differentiate based on SEQ_CST vs less strict?  */
+   return dmb%?\tsy;
+  }
+
+if (TARGET_HAVE_DMB_MCR)
+  return mcr%?\tp15, 0, r0, c7, c10, 5;
+
+gcc_unreachable ();
+  }
+  [(set_attr length 4)
+   (set_attr conds nocond)
+   (set_attr predicable yes)])
+

This should also set the 'predicable_short_it' attribute to no
since we don't want it to be predicated when compiling for ARMv8-A Thumb2.
Consequently:

Index: testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c
===
--- testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c   (revision 0)
+++ testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c   (revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile { target arm_dmb } } */
+/* { dg-options -O2 -mthumb -mfix-cortex-a9-volatile-hazards } */

Please add a -mno-restrict-it to the options here so that when armv8-a is the 
default architecture
we are still allowed to conditionalise dmb.

+static bool
+any_volatile_loads_p (const_rtx body)
+{
+  int i, j;
+  rtx lhs, rhs;
+  enum rtx_code code;
+  const char *fmt;
+
+  if (body == NULL_RTX)
+return false;
+
+  code = GET_CODE (body);
+
+  if (code == SET)
+{
+  lhs = SET_DEST (body);
+  rhs = SET_SRC (body);
+
+  if (!REG_P (lhs)  GET_CODE (lhs) != SUBREG)
+return false;
+
+  if ((MEM_P (rhs) || GET_CODE (rhs) == SYMBOL_REF)
+  MEM_VOLATILE_P (rhs))
+return true;
+}
+  else
+{
+  fmt = GET_RTX_FORMAT (code);
+
+  for (i = GET_RTX_LENGTH (code) - 1; i = 0; i--)
+{
+ if (fmt[i] == 'e')
+   {
+ if (any_volatile_loads_p (XEXP (body, i)))
+   return true;
+   }
+ else if (fmt[i] == 'E')
+   for (j = 0; j  XVECLEN (body, i); j++)
+ if (any_volatile_loads_p (XVECEXP (body, i, j)))
+   return true;
+   }
+}
+
+  return false;
+}


Would it be simpler to write this using the FOR_EACH_SUBRTX infrastructure? I 
think it would make this function much shorter.

@@ -17248,6 +17334,9 @@ arm_reorg (void)
{
  rtx table;
 
+	  if (fix_a9_volatile_hazards)

+   arm_cortex_a9_errata_reorg (insn);
+
  note_invalid_constants (insn, address, true);
  address += get_attr_length (insn);

Does the logic for adding the insn length to address need to be updated in any 
way since we're inserting a new instruction
in the stream? The calculations here always confuse me...



Thanks,
Kyrill





ping**3 [PATCH, ARM] Cortex-A9 MPCore volatile load workaround

2015-05-26 Thread Sandra Loosemore
Chung-Lin posted this patch last year but it seems never to have been 
reviewed:


https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html

I've just re-applied and re-tested it and it still seems to be good. 
Can somebody please take a look at it?


-Sandra



Re: [PATCH, ARM] Cortex-A9 MPCore volatile load workaround

2014-06-26 Thread Chung-Lin Tang
Ping x2.

On 14/6/20 2:24 PM, Chung-Lin Tang wrote:
 Ping.
 
 On 2014/6/9 10:03 PM, Chung-Lin Tang wrote:
 Hi Richard,
 As we talked about earlier, here's a patch to add a compiler option to
 work around Cortex-A9 MPCore errata 761319:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

 What the option does basically, is to scan for volatile loads during
 reorg, and add a dmb barrier after it. It also strives to make dmb
 conditionally executed under TARGET_THUMB2, which means a new Thumb-2
 specific *memory_barrier_t2 pattern in sync.md, with adjusted
 conds/predicable attributes and %? in output strings.

 Patch originally written by Julian, with additions by Meador, and
 finally a few trivial adjustments by me. Again, we've been carrying this
 fix for a release or two. Okay for trunk?

 Thanks,
 Chung-Lin

 2014-06-09  Julian Brown  jul...@codesourcery.com
 Meador Inge  mead...@codesourcery.com
 Chung-Lin Tang  clt...@codesourcery.com

  * config/arm/arm.c (arm_option_override): Emit warning if
  -mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
  (any_volatile_loads_p): New.
  (arm_cortex_a9_errata_reorg): New.
  (arm_reorg): Call arm_cortex_a9_errata_reorg.
  * config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add
  option.
  * config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
  (*memory_barrier_t2): New, allow conditional execution on
  Thumb-2.
  * doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
  documentation.

 testsuite/
 * lib/target-supports.exp (check_effective_target_arm_dmb): New.
 * gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.

 



Re: [PATCH, ARM] Cortex-A9 MPCore volatile load workaround

2014-06-20 Thread Chung-Lin Tang
Ping.

On 2014/6/9 10:03 PM, Chung-Lin Tang wrote:
 Hi Richard,
 As we talked about earlier, here's a patch to add a compiler option to
 work around Cortex-A9 MPCore errata 761319:
 http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
 
 What the option does basically, is to scan for volatile loads during
 reorg, and add a dmb barrier after it. It also strives to make dmb
 conditionally executed under TARGET_THUMB2, which means a new Thumb-2
 specific *memory_barrier_t2 pattern in sync.md, with adjusted
 conds/predicable attributes and %? in output strings.
 
 Patch originally written by Julian, with additions by Meador, and
 finally a few trivial adjustments by me. Again, we've been carrying this
 fix for a release or two. Okay for trunk?
 
 Thanks,
 Chung-Lin
 
 2014-06-09  Julian Brown  jul...@codesourcery.com
 Meador Inge  mead...@codesourcery.com
 Chung-Lin Tang  clt...@codesourcery.com
 
   * config/arm/arm.c (arm_option_override): Emit warning if
   -mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
   (any_volatile_loads_p): New.
   (arm_cortex_a9_errata_reorg): New.
   (arm_reorg): Call arm_cortex_a9_errata_reorg.
   * config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add
   option.
   * config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
   (*memory_barrier_t2): New, allow conditional execution on
   Thumb-2.
   * doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
   documentation.
 
 testsuite/
 * lib/target-supports.exp (check_effective_target_arm_dmb): New.
 * gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
 * gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.
 



[PATCH, ARM] Cortex-A9 MPCore volatile load workaround

2014-06-09 Thread Chung-Lin Tang
Hi Richard,
As we talked about earlier, here's a patch to add a compiler option to
work around Cortex-A9 MPCore errata 761319:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

What the option does basically, is to scan for volatile loads during
reorg, and add a dmb barrier after it. It also strives to make dmb
conditionally executed under TARGET_THUMB2, which means a new Thumb-2
specific *memory_barrier_t2 pattern in sync.md, with adjusted
conds/predicable attributes and %? in output strings.

Patch originally written by Julian, with additions by Meador, and
finally a few trivial adjustments by me. Again, we've been carrying this
fix for a release or two. Okay for trunk?

Thanks,
Chung-Lin

2014-06-09  Julian Brown  jul...@codesourcery.com
Meador Inge  mead...@codesourcery.com
Chung-Lin Tang  clt...@codesourcery.com

* config/arm/arm.c (arm_option_override): Emit warning if
-mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
(any_volatile_loads_p): New.
(arm_cortex_a9_errata_reorg): New.
(arm_reorg): Call arm_cortex_a9_errata_reorg.
* config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add
option.
* config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
(*memory_barrier_t2): New, allow conditional execution on
Thumb-2.
* doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
documentation.

testsuite/
* lib/target-supports.exp (check_effective_target_arm_dmb): New.
* gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
* gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
* gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
* gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 211364)
+++ doc/invoke.texi	(working copy)
@@ -535,6 +535,7 @@ Objective-C and Objective-C++ Dialects}.
 -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
 -mword-relocations @gol
 -mfix-cortex-m3-ldrd @gol
+-mfix-cortex-a9-volatile-hazards @gol
 -munaligned-access @gol
 -mneon-for-64bits @gol
 -mslow-flash-data @gol
@@ -12677,6 +12678,16 @@ with overlapping destination and base registers ar
 generating these instructions.  This option is enabled by default when
 @option{-mcpu=cortex-m3} is specified.
 
+@item -mfix-cortex-a9-volatile-hazards
+@opindex mfix-cortex-a9-volatile-hazards
+Cortex-A9 MPCore processors have an erratum that in rare cases cause
+successive memory loads to appear out of program order if another processor
+is simultaneously writing to the same location.  This causes problems if
+volatile variables are used for communication between processors.
+This option enables the ARM recommended workaround, to insert a @code{dmb}
+instruction after each volatile load.  Because of the potentially high
+overhead, this workaround is not enabled by default.
+
 @item -munaligned-access
 @itemx -mno-unaligned-access
 @opindex munaligned-access
Index: config/arm/arm.opt
===
--- config/arm/arm.opt	(revision 211364)
+++ config/arm/arm.opt	(working copy)
@@ -264,6 +264,11 @@ Target Report Var(fix_cm3_ldrd) Init(2)
 Avoid overlapping destination and address registers on LDRD instructions
 that may trigger Cortex-M3 errata.
 
+mfix-cortex-a9-volatile-hazards
+Target Report Var(fix_a9_volatile_hazards) Init(0)
+Avoid errata causing read-after-read hazards for concurrent volatile
+accesses on Cortex-A9 MPCore processors.
+
 munaligned-access
 Target Report Var(unaligned_access) Init(2)
 Enable unaligned word and halfword accesses to packed data.
Index: config/arm/sync.md
===
--- config/arm/sync.md	(revision 211364)
+++ config/arm/sync.md	(working copy)
@@ -46,7 +46,7 @@
 (define_insn *memory_barrier
   [(set (match_operand:BLK 0  )
 	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
-  TARGET_HAVE_MEMORY_BARRIER
+  TARGET_HAVE_MEMORY_BARRIER  !TARGET_THUMB2
   {
 if (TARGET_HAVE_DMB)
   {
@@ -65,6 +65,29 @@
(set_attr conds unconditional)
(set_attr predicable no)])
 
+;; Thumb-2 version allows conditional execution
+(define_insn *memory_barrier_t2
+  [(set (match_operand:BLK 0  )
+	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+  TARGET_HAVE_MEMORY_BARRIER  TARGET_THUMB2
+  {
+if (TARGET_HAVE_DMB)
+  {
+	/* Note we issue a system level barrier. We should consider issuing
+	   a inner shareabilty zone barrier here instead, ie. DMB ISH.  */
+	/* ??? Differentiate based on SEQ_CST vs less strict?  */
+	return dmb%?\tsy;
+  }
+
+if (TARGET_HAVE_DMB_MCR)
+  return mcr%?\tp15, 0, r0, c7, c10, 5;
+
+gcc_unreachable ();
+  }
+  [(set_attr length 4)
+   (set_attr conds nocond)
+   (set_attr predicable