On 13 March 2018 at 16:35, Maxim Uvarov <maxim.uva...@linaro.org> wrote:
> docker run -i -t 5b1f9964e594 /bin/bash
> apt-get update
> apt-get install git
> git clone https://github.com/Linaro/odp.git
> ./bootstrap
> ./configure --disable-test-perf --disable-test-perf-proc
> make -j 8
> export ODP_SCHEDULER=scalable
> ./helper/test/cuckootable
> works!
>
> ./configure --disable-test-perf --disable-test-perf-proc CFLAGS="-O0 -g"
> ./helper/test/cuckootable
> hungs!

Also I do not see, why will this test pass under Shippable (and it does).

>
> export ODP_SCHEDULER=basic
> ./helper/test/cuckootable
> works again!
>
>
> On 13 March 2018 at 10:34, Maxim Uvarov <maxim.uva...@linaro.org> wrote:
>>
>> CC odp ML for this issue.
>>
>> Maxim.
>>
>> On 13 March 2018 at 03:33, Bill Fischofer <bill.fischo...@linaro.org>
>> wrote:
>>>
>>> Additional details.
>>>
>>> __atomic_load_n() is a GCC intrinsic, however __lockfree_load_16() is
>>> defined in platform/linux-generic/arch/aarch64/odp_atomic.h:
>>>
>>> static inline __int128 __lockfree_load_16(__int128 *var, int mo)
>>> {
>>>         __int128 old = *var; /* Possibly torn read */
>>>
>>>         /* Do CAS to ensure atomicity
>>>          * Either CAS succeeds (writing back the same value)
>>>          * Or CAS fails and returns the old value (atomic read)
>>>          */
>>>         (void)__lockfree_compare_exchange_16(var, &old, old, false, mo,
>>> mo);
>>>         return old;
>>> }
>>>
>>> As is __lockfree_compare_exchange_16():
>>>
>>> static inline bool
>>> __lockfree_compare_exchange_16(register __int128 *var, __int128 *exp,
>>>        register __int128 neu, bool weak, int mo_success,
>>>        int mo_failure)
>>> {
>>> (void)weak; /* Always do strong CAS or we can't perform atomic read */
>>> /* Ignore memory ordering for failure, memory order for
>>> * success must be stronger or equal. */
>>> (void)mo_failure;
>>> register __int128 old;
>>> register __int128 expected;
>>> int ll_mo = LL_MO(mo_success);
>>> int sc_mo = SC_MO(mo_success);
>>>
>>> expected = *exp;
>>> __asm__ volatile("" ::: "memory");
>>> do {
>>> /* Atomicity of LLD is not guaranteed */
>>> old = lld(var, ll_mo);
>>> /* Must write back neu or old to verify atomicity of LLD */
>>> } while (odp_unlikely(scd(var, old == expected ? neu : old, sc_mo)));
>>> *exp = old; /* Always update, atomically read value */
>>> return old == expected;
>>> }
>>>
>>> In turn lld() and scd() are defined in
>>> platform/linux-generic/arch/aarch64/odp_llsc.h:
>>>
>>> static inline __int128 lld(__int128 *var, int mm)
>>> {
>>> union i128 old;
>>>
>>> if (mm == __ATOMIC_ACQUIRE)
>>> __asm__ volatile("ldaxp %0, %1, [%2]"
>>> : "=&r" (old.i64[0]), "=&r" (old.i64[1])
>>> : "r" (var)
>>> : "memory");
>>> else if (mm == __ATOMIC_RELAXED)
>>> __asm__ volatile("ldxp %0, %1, [%2]"
>>> : "=&r" (old.i64[0]), "=&r" (old.i64[1])
>>> : "r" (var)
>>> : );
>>> else
>>> ODP_ABORT();
>>> return old.i128;
>>> }
>>>
>>> /* Return 0 on success, 1 on failure */
>>> static inline uint32_t scd(__int128 *var, __int128 neu, int mm)
>>> {
>>> uint32_t ret;
>>>
>>> if (mm == __ATOMIC_RELEASE)
>>> __asm__ volatile("stlxp %w0, %1, %2, [%3]"
>>> : "=&r" (ret)
>>> : "r" (((union i128)neu).i64[0]),
>>>    "r" (((union i128)neu).i64[1]),
>>>    "r" (var)
>>> : "memory");
>>> else if (mm == __ATOMIC_RELAXED)
>>> __asm__ volatile("stxp %w0, %1, %2, [%3]"
>>> : "=&r" (ret)
>>> : "r" (((union i128)neu).i64[0]),
>>>    "r" (((union i128)neu).i64[1]),
>>>    "r" (var)
>>> : );
>>> else
>>> ODP_ABORT();
>>> return ret;
>>> }
>>>
>>> So these boil down to a sequence of __asm__() instructions. If these are
>>> hanging it suggests a compiler issue. Does this occur with a newer GCC
>>> level?
>>>
>>> On Mon, Mar 12, 2018 at 5:21 PM, Maxim Uvarov <maxim.uva...@linaro.org>
>>> wrote:
>>>>
>>>> gcc -v
>>>> Using built-in specs.
>>>> COLLECT_GCC=gcc
>>>> COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/4.8/lto-wrapper
>>>> Target: aarch64-linux-gnu
>>>> Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
>>>> 4.8.5-4ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs
>>>> --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
>>>> --program-suffix=-4.8 --enable-shared --enable-linker-build-id
>>>> --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
>>>> --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls
>>>> --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug
>>>> --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libmudflap
>>>> --disable-libsanitizer --disable-libquadmath --enable-plugin
>>>> --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk
>>>> --enable-gtk-cairo
>>>> --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-arm64/jre
>>>> --enable-java-home 
>>>> --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-arm64
>>>> --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-arm64
>>>> --with-arch-directory=arm64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
>>>> --enable-multiarch --disable-werror --enable-checking=release
>>>> --build=aarch64-linux-gnu --host=aarch64-linux-gnu
>>>> --target=aarch64-linux-gnu
>>>> Thread model: posix
>>>> gcc version 4.8.5 (Ubuntu/Linaro 4.8.5-4ubuntu2)
>>>>
>>>> On 13 March 2018 at 00:20, Maxim Uvarov <maxim.uva...@linaro.org> wrote:
>>>>>
>>>>> this fixes a problem. But it's too late today to do clean patch. (fun
>>>>> debug if gdb does not work under docker).
>>>>> So it might be something thunder-x specific.
>>>>>
>>>>>
>>>>> --- a/platform/linux-generic/include/odp_bitset.h
>>>>> +++ b/platform/linux-generic/include/odp_bitset.h
>>>>> @@ -27,7 +27,7 @@
>>>>>  /* Find a suitable data type that supports lock-free atomic operations
>>>>> */
>>>>>  #if defined(__aarch64__) && defined(__SIZEOF_INT128__) && \
>>>>>         __SIZEOF_INT128__ == 16
>>>>> -#define LOCKFREE16
>>>>> +// #define LOCKFREE16
>>>>>  typedef __int128 bitset_t;
>>>>>  #define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT128__)
>>>>>
>>>>>
>>>>> On 13 March 2018 at 00:14, Maxim Uvarov <maxim.uva...@linaro.org>
>>>>> wrote:
>>>>>>
>>>>>> platform/linux-generic/odp_schedule_scalable.c
>>>>>>
>>>>>> static odp_schedule_group_t schedule_group_create(const char *name,
>>>>>>                                                   const odp_thrmask_t
>>>>>> *mask)
>>>>>> {
>>>>>>
>>>>>> ......
>>>>>>
>>>>>>         printf("%s()%d\n", __func__, __LINE__); <-- prints
>>>>>>         /* Validate inputs */
>>>>>>         if (mask == NULL)
>>>>>>                 ODP_ABORT("mask is NULL\n");
>>>>>>
>>>>>>         printf("%s()%d\n", __func__, __LINE__); <- prints
>>>>>>         odp_spinlock_lock(&sched_grp_lock);
>>>>>>
>>>>>>         printf("%s()%d\n", __func__, __LINE__);
>>>>>>         /* Allocate a scheduler group */
>>>>>>         free = atom_bitset_load(&sg_free, __ATOMIC_RELAXED);
>>>>>>         printf("%s()%d\n", __func__, __LINE__); <- not printed, hung
>>>>>> forever before this
>>>>>>
>>>>>> Maxim.
>>>>>>
>>>>>> On 13 March 2018 at 00:08, Bill Fischofer <bill.fischo...@linaro.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> That's interesting since it was developed by Arm and presumably
>>>>>>> tested
>>>>>>> by them on Arm systems.
>>>>>>>
>>>>>>> On Mon, Mar 12, 2018 at 4:58 PM, Maxim Uvarov
>>>>>>> <maxim.uva...@linaro.org> wrote:
>>>>>>> > I see that odp_init_global() fails on thunder-x with salable
>>>>>>> > scheduler.
>>>>>>> >
>>>>>>> > On 12 March 2018 at 23:57, Bill Fischofer
>>>>>>> > <bill.fischo...@linaro.org> wrote:
>>>>>>> >>
>>>>>>> >> Sure. Dmitry says it's a clang related failure. Is that what
>>>>>>> >> you're
>>>>>>> >> seeing? If it's related to a specific level of clang we may be
>>>>>>> >> able to
>>>>>>> >> simply document it as such.
>>>>>>> >>
>>>>>>> >> On Mon, Mar 12, 2018 at 4:25 PM, Maxim Uvarov
>>>>>>> >> <maxim.uva...@linaro.org>
>>>>>>> >> wrote:
>>>>>>> >> > Bill,
>>>>>>> >> >
>>>>>>> >> > I reproduced fail on thunder-x. So I would like to take a look
>>>>>>> >> > at it one
>>>>>>> >> > more day before doing rc2.
>>>>>>> >> >
>>>>>>> >> > Maxim.
>>>>>>> >
>>>>>>> >
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



-- 
With best wishes
Dmitry

Reply via email to