Re: [lng-odp] [PATCH] linux-gen: barrier: Use correct memory ordering
Merged. Sorry for delay.

Maxim.

On 09/06/17 19:37, Brian Brooks wrote:
> ping
>
> On Sat, Aug 26, 2017 at 9:52 AM, Bill Fischofer wrote:
>> On Sat, Aug 26, 2017 at 12:40 AM, Brian Brooks wrote:
>>
>>> Memory accesses that happen-before, in program order, a call to
>>> odp_barrier_wait() cannot be reordered to after the call. Similarly,
>>> memory accesses that happen-after, in program order, a call to
>>> odp_barrier_wait() cannot be reordered to before the call.
>>>
>>> The current implementation of barriers uses sequentially consistent
>>> fences on either side of odp_barrier_wait().
>>>
>>> The correct memory ordering for barriers is release upon entering
>>> odp_barrier_wait(), to prevent reordering to after the barrier, and
>>> acquire upon exiting odp_barrier_wait(), to prevent reordering to
>>> before the barrier.
>>>
>>> The measurable performance difference is negligible on weakly ordered
>>> architectures such as ARM, so the highlight of this change is correctness.
>>>
>>> Signed-off-by: Brian Brooks
>>>
>>
>> Reviewed-by: Bill Fischofer
>>
>>
>>> ---
>>>  platform/linux-generic/odp_barrier.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/platform/linux-generic/odp_barrier.c
>>> b/platform/linux-generic/odp_barrier.c
>>> index 5eb354de..f70bdbf8 100644
>>> --- a/platform/linux-generic/odp_barrier.c
>>> +++ b/platform/linux-generic/odp_barrier.c
>>> @@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>>  	uint32_t count;
>>>  	int wasless;
>>>
>>> -	odp_mb_full();
>>> +	odp_mb_release();
>>>
>>>  	count = odp_atomic_fetch_inc_u32(&barrier->bar);
>>>  	wasless = count < barrier->count;
>>> @@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>>  			odp_cpu_pause();
>>>  	}
>>>
>>> -	odp_mb_full();
>>> +	odp_mb_acquire();
>>>  }
>>> --
>>> 2.14.1
>>>
>>>
Re: [lng-odp] [PATCH] linux-gen: barrier: Use correct memory ordering
ping

On Sat, Aug 26, 2017 at 9:52 AM, Bill Fischofer wrote:
> On Sat, Aug 26, 2017 at 12:40 AM, Brian Brooks wrote:
>
>> Memory accesses that happen-before, in program order, a call to
>> odp_barrier_wait() cannot be reordered to after the call. Similarly,
>> memory accesses that happen-after, in program order, a call to
>> odp_barrier_wait() cannot be reordered to before the call.
>>
>> The current implementation of barriers uses sequentially consistent
>> fences on either side of odp_barrier_wait().
>>
>> The correct memory ordering for barriers is release upon entering
>> odp_barrier_wait(), to prevent reordering to after the barrier, and
>> acquire upon exiting odp_barrier_wait(), to prevent reordering to
>> before the barrier.
>>
>> The measurable performance difference is negligible on weakly ordered
>> architectures such as ARM, so the highlight of this change is correctness.
>>
>> Signed-off-by: Brian Brooks
>>
>
> Reviewed-by: Bill Fischofer
>
>
>> ---
>>  platform/linux-generic/odp_barrier.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/platform/linux-generic/odp_barrier.c
>> b/platform/linux-generic/odp_barrier.c
>> index 5eb354de..f70bdbf8 100644
>> --- a/platform/linux-generic/odp_barrier.c
>> +++ b/platform/linux-generic/odp_barrier.c
>> @@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>  	uint32_t count;
>>  	int wasless;
>>
>> -	odp_mb_full();
>> +	odp_mb_release();
>>
>>  	count = odp_atomic_fetch_inc_u32(&barrier->bar);
>>  	wasless = count < barrier->count;
>> @@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>>  			odp_cpu_pause();
>>  	}
>>
>> -	odp_mb_full();
>> +	odp_mb_acquire();
>>  }
>> --
>> 2.14.1
>>
>>
Re: [lng-odp] [PATCH] linux-gen: barrier: Use correct memory ordering
On Sat, Aug 26, 2017 at 12:40 AM, Brian Brooks wrote:

> Memory accesses that happen-before, in program order, a call to
> odp_barrier_wait() cannot be reordered to after the call. Similarly,
> memory accesses that happen-after, in program order, a call to
> odp_barrier_wait() cannot be reordered to before the call.
>
> The current implementation of barriers uses sequentially consistent
> fences on either side of odp_barrier_wait().
>
> The correct memory ordering for barriers is release upon entering
> odp_barrier_wait(), to prevent reordering to after the barrier, and
> acquire upon exiting odp_barrier_wait(), to prevent reordering to
> before the barrier.
>
> The measurable performance difference is negligible on weakly ordered
> architectures such as ARM, so the highlight of this change is correctness.
>
> Signed-off-by: Brian Brooks
>

Reviewed-by: Bill Fischofer

> ---
>  platform/linux-generic/odp_barrier.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/platform/linux-generic/odp_barrier.c
> b/platform/linux-generic/odp_barrier.c
> index 5eb354de..f70bdbf8 100644
> --- a/platform/linux-generic/odp_barrier.c
> +++ b/platform/linux-generic/odp_barrier.c
> @@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>  	uint32_t count;
>  	int wasless;
>
> -	odp_mb_full();
> +	odp_mb_release();
>
>  	count = odp_atomic_fetch_inc_u32(&barrier->bar);
>  	wasless = count < barrier->count;
> @@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>  			odp_cpu_pause();
>  	}
>
> -	odp_mb_full();
> +	odp_mb_acquire();
>  }
> --
> 2.14.1
>
>
[lng-odp] [PATCH] linux-gen: barrier: Use correct memory ordering
Memory accesses that happen-before, in program order, a call to
odp_barrier_wait() cannot be reordered to after the call. Similarly,
memory accesses that happen-after, in program order, a call to
odp_barrier_wait() cannot be reordered to before the call.

The current implementation of barriers uses sequentially consistent
fences on either side of odp_barrier_wait().

The correct memory ordering for barriers is release upon entering
odp_barrier_wait(), to prevent reordering to after the barrier, and
acquire upon exiting odp_barrier_wait(), to prevent reordering to
before the barrier.

The measurable performance difference is negligible on weakly ordered
architectures such as ARM, so the highlight of this change is correctness.

Signed-off-by: Brian Brooks
---
 platform/linux-generic/odp_barrier.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/platform/linux-generic/odp_barrier.c b/platform/linux-generic/odp_barrier.c
index 5eb354de..f70bdbf8 100644
--- a/platform/linux-generic/odp_barrier.c
+++ b/platform/linux-generic/odp_barrier.c
@@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
 	uint32_t count;
 	int wasless;

-	odp_mb_full();
+	odp_mb_release();

 	count = odp_atomic_fetch_inc_u32(&barrier->bar);
 	wasless = count < barrier->count;
@@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
 			odp_cpu_pause();
 	}

-	odp_mb_full();
+	odp_mb_acquire();
 }
--
2.14.1
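
For readers without the ODP tree at hand, the ordering described in the commit message can be sketched in portable C11. This is only an illustration, not the ODP source: sketch_barrier_t, sketch_barrier_init() and sketch_barrier_wait() are hypothetical names, and the sketch assumes that odp_mb_release() and odp_mb_acquire() behave like C11 release and acquire fences and that the counter operations themselves are relaxed. The parts of the function that the diff does not show (the wrap-around branch and the spin condition) are filled in as a plausible guess.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for odp_barrier_t; not the ODP definition. */
typedef struct {
	uint32_t count;       /* number of threads that must arrive */
	_Atomic uint32_t bar; /* arrival counter, runs from 0 to 2 * count - 1 */
} sketch_barrier_t;

static void sketch_barrier_init(sketch_barrier_t *b, uint32_t count)
{
	assert(count > 0);
	b->count = count;
	atomic_init(&b->bar, 0);
}

static void sketch_barrier_wait(sketch_barrier_t *b)
{
	uint32_t count;
	int wasless;

	/* Release on entry: accesses made before the call cannot be
	 * reordered to after the counter increment below. */
	atomic_thread_fence(memory_order_release);

	count = atomic_fetch_add_explicit(&b->bar, 1, memory_order_relaxed);
	wasless = count < b->count;

	if (count == 2 * b->count - 1) {
		/* Last arrival of the second phase: wrap the counter to 0
		 * (assumed shape of the code the diff does not show). */
		atomic_fetch_sub_explicit(&b->bar, 2 * b->count,
					  memory_order_relaxed);
	} else {
		/* Spin until the counter crosses the phase boundary. */
		while ((atomic_load_explicit(&b->bar, memory_order_relaxed)
			< b->count) == wasless)
			; /* the patched function calls odp_cpu_pause() here */
	}

	/* Acquire on exit: accesses made after the call cannot be
	 * reordered to before the counter reads above. */
	atomic_thread_fence(memory_order_acquire);
}
```

In a real program each of the `count` participating threads would call sketch_barrier_wait() on the same object. Under the assumptions above, writes made by any thread before its call are visible to every thread after its own call returns, which is the guarantee the commit message asks for without requiring sequentially consistent fences.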