Some progress:

I fixed IA64.asm but still saw failures.
I realized I'd not checked the ia64/atomic.h file.

Lo and behold the origin of the bogus "sxt4" is a pair of improper casts,
removed by the following:
--- opal/include/opal/sys/ia64/atomic.h~        2014-01-23
13:04:03.000000000 -0800
+++ opal/include/opal/sys/ia64/atomic.h 2014-01-23 13:04:42.000000000 -0800
@@ -119,7 +119,7 @@
     __asm__ __volatile__ ("cmpxchg8.acq %0=[%1],%2,ar.ccv":
                   "=r"(ret) : "r"(addr), "r"(newval) : "memory");

-    return ((int32_t)ret == oldval);
+    return (ret == oldval);
 }


@@ -132,7 +132,7 @@
     __asm__ __volatile__ ("cmpxchg8.rel %0=[%1],%2,ar.ccv":
                   "=r"(ret) : "r"(addr), "r"(newval) : "memory");

-    return ((int32_t)ret == oldval);
+    return (ret == oldval);
 }

 #endif /* OMPI_GCC_INLINE_ASSEMBLY */


I will retest ASAP and report with, I hope, an attachment to fix both
IA64.asm and ia64/atomic.h

-Paul


On Wed, Jan 22, 2014 at 4:24 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

>
> On Wed, Jan 22, 2014 at 2:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> My ia64 asm is a bit rusty, but I'll give a quick look if/when I can.
>
>
> I had a look (in v1.7) and this is what I see:
>
> $cat -n IA64.asm | grep -A14 opal_atomic_cmpset_acq_64:
>     70  opal_atomic_cmpset_acq_64:
>     71          .prologue
>     72          .body
>     73          mov ar.ccv=r33;;
>     74          cmpxchg8.acq r32=[r32],r34,ar.ccv
>     75          ;;
>     76          sxt4 r32 = r32
>     77          ;;
>     78          cmp.eq p6, p7 = r33, r32
>     79          ;;
>     80          (p6) addl r8 = 1, r0
>     81          (p7) mov r8 = r0
>     82          br.ret.sptk.many b0
>     83          ;;
>     84          .endp opal_atomic_cmpset_acq_64#
>
> The (approximate and non-atomic) C equivalent is:
>
> // r32 = address
> // r33 = oldvalue
> // r34 = newvalue
> int opal_atomic_cmpset_acq_64(int64_t r32, int64_t r33, int64 r34) {
>    int64_t ccv = r33; // L73
>    if (*(int64_t *)r32 == ccv) *(int64_t *)r32 = r34; // L74
>
>    r32 = (int64_t)(int32_t)r32; // L76 = sign-extend 32->64
>
>    bool p6, p7;
>    p7 = !(p6 = (r33 == r32)); // L78
>
>    const int r0 = 0;
>    int r8;
>    if (p6) r8 = 1 + r0; // L80
>    if (p7) r8 = r0; // L81
>    return r8; // L82
> }
>
> Which is fine except that line 76 is totally wrong!!
> The "sxt4" instruction is "sign-extend from 4 bytes to 8 bytes".
> Thus the upper 32-bits of the value read from memory are lost!
> Unless the upper 33 bits off r33 (oldvalue) are all 0s or all 1s, the
> comparison on line 78 MUST fail.
> This explains the hang, as the lifo push will loop indefinitely waiting
> for the success of this cmpset.
>
> Note the same erroneous instruction is also present in the _rel variant
> (at line 94).
> The trunk has the same issue.
> This code has not changed at all since IA64.asm was added way back in
> r4471.
>
> I won't have access to the IA64 platform again until tomorrow AM.
> So, testing my hypothesis will need to wait.
>
> BTW:
> IFF I am right about the source of this problem, then it would be
> beneficial to have (and I may contribute) a stronger test (for "make
> check") that would detect this sort of bug in the atomics (specifically
> look for both false-positive and false-negative return value from 64-bit
> cmpset operations with values satisfying a range of "corner cases").  I
> think I have single-bit and double-bit "marching tests" for cmpset in my
> own arsenal of tests for GASNet's atomics.  If I don't have time to
> contribute a complete test, I can at least contribute that logic for
> somebody else to port to the OPAL atomics.
>
> -Paul
>
> P.S.:
> The cmpxchgN for N in 1,2,4 are documented as ZERO-extending their loads
> to 64-bits.
> So, there is a slim chance that the sxt4 actually was intended for the
> 32-bit cmpset code.
> However, since the comparison used there is a "cmp4.eq" the "sxt4" would
> still not be needed.
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900

Reply via email to