On 13.02.19 07:39, Peng Fan wrote:


-----Original Message-----
From: Jan Kiszka [mailto:[email protected]]
Sent: 2019年2月13日 14:32
To: Peng Fan <[email protected]>; [email protected]
Cc: [email protected]; [email protected]
Subject: Re: [PATCH] arm64: bitops: fix test_and_set_bit

On 13.02.19 05:48, Peng Fan wrote:
When compiling code using poky gcc 8.2, met the warning:
"
    CC      hypervisor/printk.o
/tmp/cclKpFV2.s: Assembler messages:
/tmp/cclKpFV2.s:1306: Warning: unpredictable: identical transfer and status
registers --`stxr w4,x5,[x4]'
"

According to DDI0487D_a_armv8_arm, section "C6.2.285 STXR", "
if s == n && n != 31 then
      Constraint c = ConstrainUnpredictable();
      assert c IN {Constraint_UNKNOWN, Constraint_NONE,
Constraint_UNDEF, Constraint_NOP};
      case c of
          when Constraint_UNKNOWN rn_unknown = TRUE; // address is
UNKNOWN
          when Constraint_NONE rn_unknown = FALSE; // address is
original base
          when Constraint_UNDEF UNDEFINED;
          when Constraint_NOP EndOfInstruction(); "
And ConstrainUnpredictable means:
"
shared/functions/unpredictable/ConstrainUnpredictable
// Return the appropriate Constraint result to control the caller's
behavior. The return value // is IMPLEMENTATION DEFINED within a
permitted list for each UNPREDICTABLE case.
// (The permitted list is determined by an assert or case statement at
the call site.) "

So we need to avoid the situation that s == n returns true.

Without this patch, the asm code as following in panic_prink

      ffffc0207ac0:       91006024        add     x4, x1, #0x18
      ffffc0207ac4:       c85f7c85        ldxr    x5, [x4]
      ffffc0207ac8:       ea0200a3        ands    x3, x5, x2
      ffffc0207acc:       54000041        b.ne    ffffc0207ad4
<panic_printk+0x40>  // b.any
      ffffc0207ad0:       aa0200a5        orr     x5, x5, x2
      ffffc0207ad4:       c8047c85        stxr    w4, x5, [x4]
      ffffc0207ad8:       d5033bbf        dmb     ish
      ffffc0207adc:       35ffff24        cbnz    w4, ffffc0207ac0
<panic_printk+0x2c>

With this patch, the asm code as following:
      ffffc0207a10:       c85f7c44        ldxr    x4, [x2]
      ffffc0207a14:       ea010083        ands    x3, x4, x1
      ffffc0207a18:       54000041        b.ne    ffffc0207a20
<panic_printk+0x40>  // b.any
      ffffc0207a1c:       aa010084        orr     x4, x4, x1
      ffffc0207a20:       c8017c44        stxr    w1, x4, [x2]
      ffffc0207a24:       d5033bbf        dmb     ish
      ffffc0207a28:       35ffff41        cbnz    w1, ffffc0207a10
<panic_printk+0x30>

Note: not sure this is gcc bug or not, but arm64 poky gcc also
generated same code, but no warning.

Probably worth to ask gcc folks at least.


Signed-off-by: Peng Fan <[email protected]>
---
   hypervisor/arch/arm64/include/asm/bitops.h | 28
++++++++++++++--------------
   1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/hypervisor/arch/arm64/include/asm/bitops.h
b/hypervisor/arch/arm64/include/asm/bitops.h
index c4b1ea46..aa53a6f9 100644
--- a/hypervisor/arch/arm64/include/asm/bitops.h
+++ b/hypervisor/arch/arm64/include/asm/bitops.h
@@ -83,21 +83,21 @@ static inline int test_and_set_bit(int nr,
volatile unsigned long *addr)

        /* AARCH64_TODO: using Inner Shareable DMB at the moment,
         * revisit when we will deal with shareability domains */
+       asm volatile (
+               "1:\n\t"
+               "ldxr      %3, %2\n\t"
+               "ands      %1, %3, %4\n\t"
+               "b.ne      2f\n\t"
+               "orr       %3, %3, %4\n\t"
+               "2:\n\t"
+               "stxr      %w0, %3, %2\n\t"
+               "dmb    ish\n\t"
+               "cbnz      %w0, 1b\n\t"
+               : "=r" (ret), "=&r" (test),
+                 "+Q" (*(volatile unsigned long *)addr),
+                 "=r" (tmp)
+               : "r" (1ul << nr));

I supposes this folding of the loop into assembly simple shakes the register
allocator a bit to come up with a different assignment that happens to not
trigger the issue. Or is there some other trick in this?

I think it will block gcc to optimize c code and asm code together after folder
the loop using asm code. I'll post question to gcc community see if there any
insights.

That would be good. We need some confidence that what we are doing is reliable and, e.g., is not causing the issue to appear with older gcc.

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

--
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to