<snip>

> > Subject: [PATCH v3 10/12] service: identify service running on another
> > core correctly
> >
> > From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> >
> > The logic to identify if the MT unsafe service is running on another
> > core can return -EBUSY spuriously. In such cases, running the service
> > becomes costlier than using atomic operations. Assume that the
> > application passes the right parameters and reduces the number of
> > instructions for all cases.
> >
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > Reviewed-by: Phil Yang <phil.y...@arm.com>
> > Reviewed-by: Gavin Hu <gavin...@arm.com>
> 
> Is this fixing broken functionality, or does it aim to only "optimize"?
> Lack of "fixes" tag suggests optimization.
Good point, it is missing the 'Fixes' tag. Looking at the code again, the 
following are a few problems that I observe; please correct me if I am wrong:
1) Looking at the API rte_service_map_lcore_set, 'num_mapped_cores' keeps track 
of the number of cores the service is mapped to. However, in the function 
'rte_service_run_iter_on_app_lcore', 'num_mapped_cores' is incremented only 
if 'serialize_mt_unsafe' is set. This makes the 'rte_service_runstate_get' API 
return an incorrect result when 'serialize_mt_unsafe' is not set, as the sketch 
below shows.
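
For reference, this is roughly what 'rte_service_runstate_get' looks like today 
(paraphrased from memory, so the details may be slightly off); note how it 
keys off 'num_mapped_cores':

int32_t
rte_service_runstate_get(uint32_t id)
{
	struct rte_service_spec_impl *s;
	SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);

	int check_disabled = !(s->internal_flags & SERVICE_F_START_CHECK);
	/* An app lcore running the service with serialize_mt_unsafe == 0
	 * never increments num_mapped_cores, so this can read 0 and the
	 * service is wrongly reported as not runnable.
	 */
	int lcore_mapped = (rte_atomic32_read(&s->num_mapped_cores) > 0);

	return (s->app_runstate == RUNSTATE_RUNNING) &&
		(s->comp_runstate == RUNSTATE_RUNNING) &&
		(check_disabled | lcore_mapped);
}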

2) Since incrementing 'num_mapped_cores' and checking its value are not a 
single atomic operation, race conditions are possible. Consider the current 
code snippet from the function 'rte_service_run_iter_on_app_lcore':

1) if (serialize_mt_unsafe)
2)         rte_atomic32_inc(&s->num_mapped_cores);
3) if (service_mt_safe(s) == 0 && rte_atomic32_read(&s->num_mapped_cores) > 1) {
4)         if (serialize_mt_unsafe)
5)                 rte_atomic32_dec(&s->num_mapped_cores);
6)         return -EBUSY;
7) }

It is possible that more than one thread executes line 3) concurrently, in 
which case all of them will hit line 6). Because of this, the service might 
never run, or it might take multiple attempts to run, wasting cycles. The 
standalone sketch below shows the interleaving.
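
To make the interleaving concrete, here is a minimal standalone sketch of the 
same check-after-increment pattern, written with C11 atomics and pthreads 
instead of the rte_atomic API (all names here are illustrative, not taken from 
rte_service.c):

#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static atomic_int num_mapped_cores;

/* Mirrors lines 1)-7) above: increment first, then test the count. */
static int
try_run_service(void)
{
	atomic_fetch_add(&num_mapped_cores, 1);

	/* If both threads increment before either one loads the value,
	 * both observe 2, both back off, and neither runs the service:
	 * a spurious -EBUSY for everyone.
	 */
	if (atomic_load(&num_mapped_cores) > 1) {
		atomic_fetch_sub(&num_mapped_cores, 1);
		return -EBUSY;
	}

	/* ... the service callback would run here ... */
	atomic_fetch_sub(&num_mapped_cores, 1);
	return 0;
}

static void *
worker(void *arg)
{
	(void)arg;
	return (void *)(intptr_t)try_run_service();
}

int
main(void)
{
	pthread_t t1, t2;
	void *r1, *r2;

	pthread_create(&t1, NULL, worker, NULL);
	pthread_create(&t2, NULL, worker, NULL);
	pthread_join(t1, &r1);
	pthread_join(t2, &r2);

	/* Under the bad interleaving, both threads report -EBUSY and the
	 * service ran zero times.
	 */
	printf("t1=%d t2=%d\n", (int)(intptr_t)r1, (int)(intptr_t)r2);
	return 0;
}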

If you agree these are bugs, I can add the 'Fixes' tag.

> 
> I'm cautious about the commit phrase "Assume that the application ...", if the
> code was previously checking things, we must not stop checking them now,
> this may introduce race-conditions in existing applications?
What I meant here is: let us believe what the application tells us, i.e. if 
the application sets 'serialize_mt_unsafe' to 1, let us assume that the service 
is in fact MT-unsafe.

> 
> It seems like the "serialize_mt_unsafe" branch is being pushed further down
> the callgraph, and instead of branching over atomics this patch forces always
> executing 2 atomics?
Yes, that is correct. Please see the explanation in 1) above.

> 
> This feels like too specific an optimization/tradeoff, without data to backup
> that there are no regressions on any DPDK supported platforms.
Apologies for missing the 'Fixes' tag; this patch is a bug fix.

> 
> DPDK today doesn't have a micro-benchmark to gather such perf data, but I
> would welcome one and we can have a data-driven decision.
When this code was originally developed, was the logic that avoids taking the 
lock actually measured and shown to provide an improvement? A rough sketch of 
the kind of micro-benchmark I have in mind is below.
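
For example (a sketch only, under the assumptions noted in the comments; it 
uses rte_rdtsc() from rte_cycles.h and the public service API):

#include <inttypes.h>
#include <stdio.h>
#include <rte_cycles.h>
#include <rte_service.h>

/* Rough micro-benchmark sketch: average cycles per call to
 * rte_service_run_iter_on_app_lcore(), with and without
 * serialize_mt_unsafe. Assumes rte_eal_init() has already run and
 * 'id' is a registered service with a cheap callback, so the measured
 * cost is dominated by the framework overhead being discussed.
 */
static void
bench_run_iter(uint32_t id, uint32_t serialize_mt_unsafe, uint32_t iters)
{
	uint32_t i;
	uint64_t start = rte_rdtsc();

	for (i = 0; i < iters; i++)
		rte_service_run_iter_on_app_lcore(id, serialize_mt_unsafe);

	uint64_t cycles = rte_rdtsc() - start;
	printf("serialize_mt_unsafe=%u: %" PRIu64 " cycles/call\n",
			serialize_mt_unsafe, cycles / iters);
}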

> 
> Hope this point-of-view makes sense, -Harry
> 
> > ---
> >  lib/librte_eal/common/rte_service.c | 26 ++++++++------------------
> >  1 file changed, 8 insertions(+), 18 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/rte_service.c
> > b/lib/librte_eal/common/rte_service.c
> > index 32a2f8a..0843c3c 100644
> > --- a/lib/librte_eal/common/rte_service.c
> > +++ b/lib/librte_eal/common/rte_service.c
> > @@ -360,7 +360,7 @@ service_runner_do_callback(struct
> > rte_service_spec_impl *s,
> >  /* Expects the service 's' is valid. */  static int32_t
> > service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
> > -       struct rte_service_spec_impl *s)
> > +       struct rte_service_spec_impl *s, uint32_t serialize_mt_unsafe)
> >  {
> >     if (!s)
> >             return -EINVAL;
> > @@ -374,7 +374,7 @@ service_run(uint32_t i, struct core_state *cs,
> > uint64_t service_mask,
> >
> >     cs->service_active_on_lcore[i] = 1;
> >
> > -   if (service_mt_safe(s) == 0) {
> > +   if ((service_mt_safe(s) == 0) && (serialize_mt_unsafe == 1)) {
> >             if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
> >                     return -EBUSY;
> >
> > @@ -412,24 +412,14 @@ rte_service_run_iter_on_app_lcore(uint32_t id,
> > uint32_t
> > serialize_mt_unsafe)
> >
> >     SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
> >
> > -   /* Atomically add this core to the mapped cores first, then examine if
> > -    * we can run the service. This avoids a race condition between
> > -    * checking the value, and atomically adding to the mapped count.
> > +   /* Increment num_mapped_cores to indicate that the service
> > +    * is running on a core.
> >      */
> > -   if (serialize_mt_unsafe)
> > -           rte_atomic32_inc(&s->num_mapped_cores);
> > +   rte_atomic32_inc(&s->num_mapped_cores);
> >
> > -   if (service_mt_safe(s) == 0 &&
> > -                   rte_atomic32_read(&s->num_mapped_cores) > 1) {
> > -           if (serialize_mt_unsafe)
> > -                   rte_atomic32_dec(&s->num_mapped_cores);
> > -           return -EBUSY;
> > -   }
> > -
> > -   int ret = service_run(id, cs, UINT64_MAX, s);
> > +   int ret = service_run(id, cs, UINT64_MAX, s, serialize_mt_unsafe);
> >
> > -   if (serialize_mt_unsafe)
> > -           rte_atomic32_dec(&s->num_mapped_cores);
> > +   rte_atomic32_dec(&s->num_mapped_cores);
> >
> >     return ret;
> >  }
> > @@ -449,7 +439,7 @@ service_runner_func(void *arg)
> >                     if (!service_valid(i))
> >                             continue;
> >                     /* return value ignored as no change to code flow */
> > -                   service_run(i, cs, service_mask, service_get(i));
> > +                   service_run(i, cs, service_mask, service_get(i), 1);
> >             }
> >
> >             cs->loops++;
> > --
> > 2.7.4
