> From: Van Haaren, Harry [mailto:harry.van.haa...@intel.com]
> Sent: Friday, 8 July 2022 15.45
> 
> > From: Morten Brørup <m...@smartsharesystems.com>
> > Sent: Friday, July 8, 2022 2:23 PM
> 
> <snip commit message, focus on performance data>
> 
> > > This patch causes a 1.25x increase in cycle-cost for polling a
> > > MT safe service when statistics are enabled. No change was seen
> > > for MT unsafe services, or when statistics are disabled.
> > >
> > > Reported-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> > > Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Suggested-by: Morten Brørup <m...@smartsharesystems.com>
> > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> > >
> > > ---
> >
> > [...]
> >
> > > +         if (service_mt_safe(s)) {
> > > +                 __atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
> > > +                 __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
> > > +         } else {
> > > +                 s->cycles_spent += cycles;
> > > +                 s->calls++;
> > > +         }
> >
> > Have you considered the performance cost of the
> > __atomic_fetch_add(__ATOMIC_RELAXED) versus the performance cost of the
> > branch to compare if the service is MT safe? It might be cheaper to just
> > always use the atomic addition. I don't know, just mentioning that the
> > compare-and-branch also has a cost.
> 
> Great question!
> 
> > I'm not familiar with the DPDK services library, so perhaps MT safe and
> > MT unsafe services are never mixed, in which case the branch will always
> > take the same path, so branch prediction will eliminate the cost of
> > branching.
> 
> MT safe & unsafe can be mixed, yes, so you're right, there may be
> mispredicts. Note that, assuming a service is actually doing something
> useful, there are likely quite a few branches between each call, so it is
> unknown how fresh/accurate the branch history will be.
> 
> The common case is for many services to be "MT unsafe" (think polling an
> ethdev queue). In this case, it is nice to try to reduce cost. Given that
> these are likely the most numerous services, we'd like their performance
> to be reduced the least. The branch method achieves that.
> 
> I did benchmark the "always use atomic" case, and it caused a ~30 cycle
> hit in the "MT unsafe" case, where the atomic is not required (and hence
> the performance hit can be avoided by branching). Given that branch misses
> are handled in roughly 15-20 cycles (uarch dependent), attempting to avoid
> the atomic still makes sense from a cycle-cost perspective too, I think.
> 
> I did spend the morning benchmarking solutions (and hence the patch split,
> to allow easy benchmarking before & after), so thanks for asking!
> 
> Regards, -Harry

Thank you for elaborating, Harry. I am impressed with the considerations you 
have put into this, and have no further concerns.

Reviewed-by: Morten Brørup <m...@smartsharesystems.com>
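
For anyone following the thread, the two options compared above can be
sketched side by side as below. This is only an illustration: struct
svc_stats, its mt_safe flag and both function names are hypothetical
stand-ins, not the DPDK service library's types or API, and the GCC/Clang
__atomic builtins are assumed. It shows that the two variants produce the
same counters; it does not measure or reproduce the cycle costs quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-service state; not the DPDK struct. */
struct svc_stats {
	bool mt_safe;          /* assumed flag: several lcores may run it at once */
	uint64_t calls;
	uint64_t cycles_spent;
};

/* Variant A: branch on MT safety, mirroring the quoted hunk.
 * MT unsafe services take the plain (non-atomic) increment path. */
static inline void
stats_update_branch(struct svc_stats *s, uint64_t cycles)
{
	if (s->mt_safe) {
		__atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
		__atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
	} else {
		s->cycles_spent += cycles;
		s->calls++;
	}
}

/* Variant B: always use the atomic add, trading the compare-and-branch
 * for an atomic read-modify-write on every call. */
static inline void
stats_update_always_atomic(struct svc_stats *s, uint64_t cycles)
{
	__atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
	__atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
}
```

Both variants are functionally equivalent for a service that only ever runs
on one lcore at a time; the difference discussed in this thread is purely
the per-call cost of each path.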
