> -----Original Message----- > From: Van Haaren, Harry <[email protected]> > Sent: Friday, April 3, 2020 7:58 PM > To: Phil Yang <[email protected]>; [email protected]; Ananyev, > Konstantin <[email protected]>; > [email protected]; [email protected]; > [email protected] > Cc: [email protected]; [email protected]; > [email protected]; Honnappa Nagarahalli > <[email protected]>; Gavin Hu <[email protected]>; > Ruifeng Wang <[email protected]>; Joyce Kong > <[email protected]>; nd <[email protected]> > Subject: RE: [PATCH v3 11/12] service: optimize with c11 one-way barrier > > > -----Original Message----- > > From: Phil Yang <[email protected]> > > Sent: Tuesday, March 17, 2020 1:18 AM > > To: [email protected]; Van Haaren, Harry > <[email protected]>; > > Ananyev, Konstantin <[email protected]>; > > [email protected]; [email protected]; > [email protected] > > Cc: [email protected]; [email protected]; > [email protected]; > > [email protected]; [email protected]; > [email protected]; > > [email protected]; [email protected] > > Subject: [PATCH v3 11/12] service: optimize with c11 one-way barrier > > > > The num_mapped_cores and execute_lock are synchronized with > rte_atomic_XX > > APIs which is a full barrier, DMB, on aarch64. This patch optimized it with > > c11 atomic one-way barrier. > > > > Signed-off-by: Phil Yang <[email protected]> > > Reviewed-by: Ruifeng Wang <[email protected]> > > Reviewed-by: Gavin Hu <[email protected]> > > Reviewed-by: Honnappa Nagarahalli <[email protected]> > > Based on discussion on-list, it seems the consensus is to not use > GCC builtins, but instead use C11 APIs "proper"? If my conclusion is > correct, the v+1 of this patchset would require updates to that style API. > > Inline comments for context below, -Harry > > > > --- > > lib/librte_eal/common/rte_service.c | 50 > ++++++++++++++++++++++++++---------- > > - > > 1 file changed, 35 insertions(+), 15 deletions(-) > > > > diff --git a/lib/librte_eal/common/rte_service.c > > b/lib/librte_eal/common/rte_service.c > > index 0843c3c..c033224 100644 > > --- a/lib/librte_eal/common/rte_service.c > > +++ b/lib/librte_eal/common/rte_service.c > > @@ -42,7 +42,7 @@ struct rte_service_spec_impl { > > * running this service callback. When not set, a core may take the > > * lock and then run the service callback. > > */ > > - rte_atomic32_t execute_lock; > > + uint32_t execute_lock; > > > > /* API set/get-able variables */ > > int8_t app_runstate; > > @@ -54,7 +54,7 @@ struct rte_service_spec_impl { > > * It does not indicate the number of cores the service is running > > * on currently. > > */ > > - rte_atomic32_t num_mapped_cores; > > + int32_t num_mapped_cores; > > Any reason why "int32_t" or "uint32_t" is used over another? > execute_lock is a uint32_t above, num_mapped_cores is an int32_t?
It should be uint32_t for num_mapped_cores. This value will not be negative after __atomic_sub_fetch operation, because of the sequence of writer and reader accesses are guaranteed by the memory ordering. I will update it in the next version. Thanks, Phil <snip>

