Re: ping**3 [PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Hi Sandra, Chung-Lin,

A couple of comments from me.

On 26/05/15 20:10, Sandra Loosemore wrote:
> Chung-Lin posted this patch last year but it seems never to have been
> reviewed: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html
> I've just re-applied and re-tested it and it still seems to be good.
> Can somebody please take a look at it?
>
> -Sandra

> +mfix-cortex-a9-volatile-hazards
> +Target Report Var(fix_a9_volatile_hazards) Init(0)
> +Avoid errata causing read-after-read hazards for concurrent volatile
> +accesses on Cortex-A9 MPCore processors.

s/errata/erratum/

> +;; Thumb-2 version allows conditional execution
> +(define_insn "*memory_barrier_t2"
> +  [(set (match_operand:BLK 0 "")
> +	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
> +  "TARGET_HAVE_MEMORY_BARRIER && TARGET_THUMB2"
> +  {
> +    if (TARGET_HAVE_DMB)
> +      {
> +	/* Note we issue a system level barrier.  We should consider issuing
> +	   an inner shareability zone barrier here instead, ie. DMB ISH.  */
> +	/* ??? Differentiate based on SEQ_CST vs less strict?  */
> +	return "dmb%?\tsy";
> +      }
> +
> +    if (TARGET_HAVE_DMB_MCR)
> +      return "mcr%?\tp15, 0, r0, c7, c10, 5";
> +
> +    gcc_unreachable ();
> +  }
> +  [(set_attr "length" "4")
> +   (set_attr "conds" "nocond")
> +   (set_attr "predicable" "yes")])

This should also set the 'predicable_short_it' attribute to "no", since we
don't want this pattern to be predicated when compiling for ARMv8-A Thumb-2.
Consequently:

> Index: testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c
> ===================================================================
> --- testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c	(revision 0)
> +++ testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c	(revision 0)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile { target arm_dmb } } */
> +/* { dg-options "-O2 -mthumb -mfix-cortex-a9-volatile-hazards" } */

Please add -mno-restrict-it to the options here, so that when ARMv8-A is the
default architecture we are still allowed to conditionalise the dmb.
> +static bool
> +any_volatile_loads_p (const_rtx body)
> +{
> +  int i, j;
> +  rtx lhs, rhs;
> +  enum rtx_code code;
> +  const char *fmt;
> +
> +  if (body == NULL_RTX)
> +    return false;
> +
> +  code = GET_CODE (body);
> +
> +  if (code == SET)
> +    {
> +      lhs = SET_DEST (body);
> +      rhs = SET_SRC (body);
> +
> +      if (!REG_P (lhs) && GET_CODE (lhs) != SUBREG)
> +	return false;
> +
> +      if ((MEM_P (rhs) || GET_CODE (rhs) == SYMBOL_REF)
> +	  && MEM_VOLATILE_P (rhs))
> +	return true;
> +    }
> +  else
> +    {
> +      fmt = GET_RTX_FORMAT (code);
> +
> +      for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
> +	{
> +	  if (fmt[i] == 'e')
> +	    {
> +	      if (any_volatile_loads_p (XEXP (body, i)))
> +		return true;
> +	    }
> +	  else if (fmt[i] == 'E')
> +	    for (j = 0; j < XVECLEN (body, i); j++)
> +	      if (any_volatile_loads_p (XVECEXP (body, i, j)))
> +		return true;
> +	}
> +    }
> +
> +  return false;
> +}

Would it be simpler to write this using the FOR_EACH_SUBRTX infrastructure?
I think it would make this function much shorter.

> @@ -17248,6 +17334,9 @@ arm_reorg (void)
>      {
>        rtx table;
>
> +      if (fix_a9_volatile_hazards)
> +	arm_cortex_a9_errata_reorg (insn);
> +
>        note_invalid_constants (insn, address, true);
>        address += get_attr_length (insn);

Does the logic for adding the insn length to address need to be updated in
any way, since we're inserting a new instruction in the stream? The
calculations here always confuse me...

Thanks,
Kyrill
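[Editorial note: a rough sketch of what a FOR_EACH_SUBRTX-based version might
look like. This is untested GCC-internal code relying on rtl-iter.h, so it
only compiles inside a GCC build; note it is also broader than the original
function, since it flags a volatile MEM anywhere in the body (including store
destinations), so a faithful replacement would still need the SET/SET_DEST
filtering from the version quoted above.]

```c
/* Untested sketch using GCC's rtl-iter.h subrtx walker.  NONCONST
   skips constant subexpressions, which cannot contain a MEM.  */
static bool
any_volatile_loads_p (const_rtx body)
{
  subrtx_iterator::array_type array;
  FOR_EACH_SUBRTX (iter, array, body, NONCONST)
    if (MEM_P (*iter) && MEM_VOLATILE_P (*iter))
      return true;
  return false;
}
```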
ping**3 [PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Chung-Lin posted this patch last year but it seems never to have been
reviewed: https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html

I've just re-applied and re-tested it and it still seems to be good. Can
somebody please take a look at it?

-Sandra
Re: [PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Ping x2.

On 14/6/20 2:24 PM, Chung-Lin Tang wrote:
> Ping.
>
> On 2014/6/9 10:03 PM, Chung-Lin Tang wrote:
>> Hi Richard,
>>
>> As we talked about earlier, here's a patch to add a compiler option to
>> work around Cortex-A9 MPCore erratum 761319:
>> http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
>>
>> What the option basically does is scan for volatile loads during reorg
>> and add a dmb barrier after each one. It also strives to make the dmb
>> conditionally executed under TARGET_THUMB2, which means a new Thumb-2
>> specific *memory_barrier_t2 pattern in sync.md, with adjusted
>> conds/predicable attributes and %? in the output strings.
>>
>> The patch was originally written by Julian, with additions by Meador,
>> and finally a few trivial adjustments by me. Again, we've been carrying
>> this fix for a release or two. Okay for trunk?
>>
>> Thanks,
>> Chung-Lin
>>
>> 2014-06-09  Julian Brown  <jul...@codesourcery.com>
>> 	    Meador Inge  <mead...@codesourcery.com>
>> 	    Chung-Lin Tang  <clt...@codesourcery.com>
>>
>> 	* config/arm/arm.c (arm_option_override): Emit warning if
>> 	-mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
>> 	(any_volatile_loads_p): New.
>> 	(arm_cortex_a9_errata_reorg): New.
>> 	(arm_reorg): Call arm_cortex_a9_errata_reorg.
>> 	* config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add option.
>> 	* config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
>> 	(*memory_barrier_t2): New, allow conditional execution on Thumb-2.
>> 	* doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
>> 	documentation.
>>
>> 	testsuite/
>> 	* lib/target-supports.exp (check_effective_target_arm_dmb): New.
>> 	* gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
>> 	* gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
>> 	* gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
>> 	* gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.
Re: [PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Ping.

On 2014/6/9 10:03 PM, Chung-Lin Tang wrote:
> Hi Richard,
>
> As we talked about earlier, here's a patch to add a compiler option to
> work around Cortex-A9 MPCore erratum 761319:
> http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
>
> What the option basically does is scan for volatile loads during reorg
> and add a dmb barrier after each one. It also strives to make the dmb
> conditionally executed under TARGET_THUMB2, which means a new Thumb-2
> specific *memory_barrier_t2 pattern in sync.md, with adjusted
> conds/predicable attributes and %? in the output strings.
>
> The patch was originally written by Julian, with additions by Meador,
> and finally a few trivial adjustments by me. Again, we've been carrying
> this fix for a release or two. Okay for trunk?
>
> Thanks,
> Chung-Lin
>
> 2014-06-09  Julian Brown  <jul...@codesourcery.com>
> 	    Meador Inge  <mead...@codesourcery.com>
> 	    Chung-Lin Tang  <clt...@codesourcery.com>
>
> 	* config/arm/arm.c (arm_option_override): Emit warning if
> 	-mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
> 	(any_volatile_loads_p): New.
> 	(arm_cortex_a9_errata_reorg): New.
> 	(arm_reorg): Call arm_cortex_a9_errata_reorg.
> 	* config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add option.
> 	* config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
> 	(*memory_barrier_t2): New, allow conditional execution on Thumb-2.
> 	* doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
> 	documentation.
>
> 	testsuite/
> 	* lib/target-supports.exp (check_effective_target_arm_dmb): New.
> 	* gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
> 	* gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
> 	* gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
> 	* gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.
[PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Hi Richard,

As we talked about earlier, here's a patch to add a compiler option to work
around Cortex-A9 MPCore erratum 761319:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

What the option basically does is scan for volatile loads during reorg and
add a dmb barrier after each one. It also strives to make the dmb
conditionally executed under TARGET_THUMB2, which means a new Thumb-2
specific *memory_barrier_t2 pattern in sync.md, with adjusted
conds/predicable attributes and %? in the output strings.

The patch was originally written by Julian, with additions by Meador, and
finally a few trivial adjustments by me. Again, we've been carrying this fix
for a release or two. Okay for trunk?

Thanks,
Chung-Lin

2014-06-09  Julian Brown  <jul...@codesourcery.com>
	    Meador Inge  <mead...@codesourcery.com>
	    Chung-Lin Tang  <clt...@codesourcery.com>

	* config/arm/arm.c (arm_option_override): Emit warning if
	-mfix-cortex-a9-volatile-hazards is used on an incompatible CPU.
	(any_volatile_loads_p): New.
	(arm_cortex_a9_errata_reorg): New.
	(arm_reorg): Call arm_cortex_a9_errata_reorg.
	* config/arm/arm.opt (mfix-cortex-a9-volatile-hazards): Add option.
	* config/arm/sync.md (*memory_barrier): Don't use on Thumb-2.
	(*memory_barrier_t2): New, allow conditional execution on Thumb-2.
	* doc/invoke.texi (-mfix-cortex-a9-volatile-hazards): Add
	documentation.

	testsuite/
	* lib/target-supports.exp (check_effective_target_arm_dmb): New.
	* gcc.target/arm/a9-volatile-ordering-erratum-1.c: New test.
	* gcc.target/arm/a9-volatile-ordering-erratum-2.c: New test.
	* gcc.target/arm/a9-volatile-ordering-erratum-3.c: New test.
	* gcc.target/arm/a9-volatile-ordering-erratum-4.c: New test.

Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 211364)
+++ doc/invoke.texi	(working copy)
@@ -535,6 +535,7 @@ Objective-C and Objective-C++ Dialects}.
 -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
 -mword-relocations @gol
 -mfix-cortex-m3-ldrd @gol
+-mfix-cortex-a9-volatile-hazards @gol
 -munaligned-access @gol
 -mneon-for-64bits @gol
 -mslow-flash-data @gol
@@ -12677,6 +12678,16 @@ with overlapping destination and base registers ar
 generating these instructions.  This option is enabled by default when
 @option{-mcpu=cortex-m3} is specified.
 
+@item -mfix-cortex-a9-volatile-hazards
+@opindex mfix-cortex-a9-volatile-hazards
+Cortex-A9 MPCore processors have an erratum that in rare cases causes
+successive memory loads to appear out of program order if another processor
+is simultaneously writing to the same location.  This causes problems if
+volatile variables are used for communication between processors.
+This option enables the ARM recommended workaround, to insert a @code{dmb}
+instruction after each volatile load.  Because of the potentially high
+overhead, this workaround is not enabled by default.
+
 @item -munaligned-access
 @itemx -mno-unaligned-access
 @opindex munaligned-access

Index: config/arm/arm.opt
===================================================================
--- config/arm/arm.opt	(revision 211364)
+++ config/arm/arm.opt	(working copy)
@@ -264,6 +264,11 @@ Target Report Var(fix_cm3_ldrd) Init(2)
 Avoid overlapping destination and address registers on LDRD instructions
 that may trigger Cortex-M3 errata.
 
+mfix-cortex-a9-volatile-hazards
+Target Report Var(fix_a9_volatile_hazards) Init(0)
+Avoid errata causing read-after-read hazards for concurrent volatile
+accesses on Cortex-A9 MPCore processors.
+
 munaligned-access
 Target Report Var(unaligned_access) Init(2)
 Enable unaligned word and halfword accesses to packed data.
Index: config/arm/sync.md
===================================================================
--- config/arm/sync.md	(revision 211364)
+++ config/arm/sync.md	(working copy)
@@ -46,7 +46,7 @@
 (define_insn "*memory_barrier"
   [(set (match_operand:BLK 0 "")
	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
-  "TARGET_HAVE_MEMORY_BARRIER"
+  "TARGET_HAVE_MEMORY_BARRIER && !TARGET_THUMB2"
 {
   if (TARGET_HAVE_DMB)
     {
@@ -65,6 +65,29 @@
   (set_attr "conds" "unconditional")
   (set_attr "predicable" "no")])
 
+;; Thumb-2 version allows conditional execution
+(define_insn "*memory_barrier_t2"
+  [(set (match_operand:BLK 0 "")
+	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+  "TARGET_HAVE_MEMORY_BARRIER && TARGET_THUMB2"
+  {
+    if (TARGET_HAVE_DMB)
+      {
+	/* Note we issue a system level barrier.  We should consider issuing
+	   an inner shareability zone barrier here instead, ie. DMB ISH.  */
+	/* ??? Differentiate based on SEQ_CST vs less strict?  */
+	return "dmb%?\tsy";
+      }
+
+    if (TARGET_HAVE_DMB_MCR)
+      return "mcr%?\tp15, 0, r0, c7, c10, 5";
+
+    gcc_unreachable ();
+  }
+  [(set_attr "length" "4")
+   (set_attr "conds" "nocond")
+   (set_attr "predicable" "yes")])