> -----Original Message-----
> From: Morten Brørup <m...@smartsharesystems.com>
> Sent: Friday, July 8, 2022 2:23 PM
> To: Van Haaren, Harry <harry.van.haa...@intel.com>; dev@dpdk.org
> Cc: mattias.ronnblom <mattias.ronnb...@ericsson.com>; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>
> Subject: RE: [PATCH 2/2] service: fix potential stats race-condition on MT 
> services

<snip commit message, focus on performance data>

> > This patch causes a 1.25x increase in cycle-cost for polling a
> > MT safe service when statistics are enabled. No change was seen
> > for MT unsafe services, or when statistics are disabled.
> >
> > Reported-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> > Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > Suggested-by: Morten Brørup <m...@smartsharesystems.com>
> > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> >
> > ---
> 
> [...]
> 
> > +           if (service_mt_safe(s)) {
> > +                   __atomic_fetch_add(&s->cycles_spent, cycles,
> > __ATOMIC_RELAXED);
> > +                   __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
> > +           } else {
> > +                   s->cycles_spent += cycles;
> > +                   s->calls++;
> > +           }
> 
> Have you considered the performance cost of the
> __atomic_fetch_add(__ATOMIC_RELAXED) versus the performance cost of the
> branch to compare if the service is MT safe? It might be cheaper to just 
> always use
> the atomic addition. I don't know, just mentioning that the 
> compare-and-branch also
> has a cost.

Great question!

> I'm not familiar with the DPDK services library, so perhaps MT safe and MT 
> unsafe
> services are never mixed, in which case the branch will always take the same 
> path,
> so branch prediction will eliminate the cost of branching.

MT safe & unsafe can be mixed yes, so you're right, there may be mis-predicts. 
Note that
assuming a service is actually doing something useful, there's likely quite a 
few branches
between each call.. so unknown how fresh/accurate the branch history will be.

The common case is for many services to be "mt unsafe" (think polling an ethdev 
queue).
In this case, it is nice to try reduce cost. Given this is likely the highest 
quantity of services,
we'd like the performance here to be reduced the least. The branch method 
achieves that.

I did benchmark the "always use atomic" case, and it caused a ~30cycle hit in 
the "mt unsafe" case,
where the atomic is not required (and hence the performance hit can be avoided 
by branching).
Given branch-misses are handled between 15-20 cycles (uarch dependent), 
attempting to avoid the
atomic still makes sense from cycle-cost perspective too I think..

I did spend the morning benchmarking solutions (and hence the patch split,
to allow easy benchmarking before & after), so thanks for asking!

Regards, -Harry

Reply via email to