Re: [dpdk-dev] [PATCH v3 11/12] service: optimize with c11 one-way barrier

Phil Yang Wed, 08 Apr 2020 03:16:45 -0700

> -----Original Message-----
> From: Van Haaren, Harry <[email protected]>
> Sent: Friday, April 3, 2020 7:58 PM
> To: Phil Yang <[email protected]>; [email protected]; Ananyev,
> Konstantin <[email protected]>;
> [email protected]; [email protected];
> [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; Honnappa Nagarahalli
> <[email protected]>; Gavin Hu <[email protected]>;
> Ruifeng Wang <[email protected]>; Joyce Kong
> <[email protected]>; nd <[email protected]>
> Subject: RE: [PATCH v3 11/12] service: optimize with c11 one-way barrier
> 
> > -----Original Message-----
> > From: Phil Yang <[email protected]>
> > Sent: Tuesday, March 17, 2020 1:18 AM
> > To: [email protected]; Van Haaren, Harry
> <[email protected]>;
> > Ananyev, Konstantin <[email protected]>;
> > [email protected]; [email protected];
> [email protected]
> > Cc: [email protected]; [email protected];
> [email protected];
> > [email protected]; [email protected];
> [email protected];
> > [email protected]; [email protected]
> > Subject: [PATCH v3 11/12] service: optimize with c11 one-way barrier
> >
> > The num_mapped_cores and execute_lock are synchronized with
> rte_atomic_XX
> > APIs which is a full barrier, DMB, on aarch64. This patch optimized it with
> > c11 atomic one-way barrier.
> >
> > Signed-off-by: Phil Yang <[email protected]>
> > Reviewed-by: Ruifeng Wang <[email protected]>
> > Reviewed-by: Gavin Hu <[email protected]>
> > Reviewed-by: Honnappa Nagarahalli <[email protected]>
> 
> Based on discussion on-list, it seems the consensus is to not use
> GCC builtins, but instead use C11 APIs "proper"? If my conclusion is
> correct, the v+1 of this patchset would require updates to that style API.
> 
> Inline comments for context below, -Harry
> 
> 
> > ---
> >  lib/librte_eal/common/rte_service.c | 50
> ++++++++++++++++++++++++++----------
> > -
> >  1 file changed, 35 insertions(+), 15 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/rte_service.c
> > b/lib/librte_eal/common/rte_service.c
> > index 0843c3c..c033224 100644
> > --- a/lib/librte_eal/common/rte_service.c
> > +++ b/lib/librte_eal/common/rte_service.c
> > @@ -42,7 +42,7 @@ struct rte_service_spec_impl {
> >      * running this service callback. When not set, a core may take the
> >      * lock and then run the service callback.
> >      */
> > -   rte_atomic32_t execute_lock;
> > +   uint32_t execute_lock;
> >
> >     /* API set/get-able variables */
> >     int8_t app_runstate;
> > @@ -54,7 +54,7 @@ struct rte_service_spec_impl {
> >      * It does not indicate the number of cores the service is running
> >      * on currently.
> >      */
> > -   rte_atomic32_t num_mapped_cores;
> > +   int32_t num_mapped_cores;
> 
> Any reason why "int32_t" or "uint32_t" is used over another?
> execute_lock is a uint32_t above, num_mapped_cores is an int32_t?


It should be uint32_t for num_mapped_cores.
The value cannot go negative after the __atomic_sub_fetch operation, because
the order of the writer and reader accesses is guaranteed by the memory
ordering.
I will update it in the next version.
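
For illustration only, a minimal sketch of how the two fields could look as
plain uint32_t values accessed through the GCC __atomic builtins. The struct
and function names (svc_sketch, svc_map_core, ...) are hypothetical, and the
memory orderings shown are assumptions, not necessarily the ones the patch
uses. The counter stays non-negative only as long as every unmap is paired
with an earlier map.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields discussed above;
 * not the real struct rte_service_spec_impl. */
struct svc_sketch {
	uint32_t execute_lock;     /* 0 = free, 1 = held */
	uint32_t num_mapped_cores; /* lcores mapped to this service */
};

/* Writer side: publish a new mapping (ordering is an assumption). */
static inline void
svc_map_core(struct svc_sketch *s)
{
	__atomic_add_fetch(&s->num_mapped_cores, 1, __ATOMIC_RELEASE);
}

/* Writer side: drop a mapping; must follow a matching svc_map_core(),
 * otherwise the unsigned counter would wrap around. */
static inline void
svc_unmap_core(struct svc_sketch *s)
{
	__atomic_sub_fetch(&s->num_mapped_cores, 1, __ATOMIC_RELEASE);
}

/* Reader side: observe the current count. */
static inline uint32_t
svc_mapped_cores(struct svc_sketch *s)
{
	return __atomic_load_n(&s->num_mapped_cores, __ATOMIC_ACQUIRE);
}

/* Try to take the execute lock; returns 1 on success, 0 if already held. */
static inline int
svc_try_lock(struct svc_sketch *s)
{
	uint32_t expected = 0;

	return __atomic_compare_exchange_n(&s->execute_lock, &expected, 1,
			0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Release the execute lock with a one-way (release) barrier. */
static inline void
svc_unlock(struct svc_sketch *s)
{
	__atomic_store_n(&s->execute_lock, 0, __ATOMIC_RELEASE);
}
```

The acquire on lock and release on unlock are one-way barriers, in contrast
to the full barrier implied by the rte_atomic_XX APIs on aarch64.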

Thanks,
Phil

<snip>
