> From: Van Haaren, Harry [mailto:harry.van.haa...@intel.com] > Sent: Friday, 8 July 2022 15.45 > > > From: Morten Brørup <m...@smartsharesystems.com> > > Sent: Friday, July 8, 2022 2:23 PM > > <snip commit message, focus on performance data> > > > > This patch causes a 1.25x increase in cycle-cost for polling a > > > MT safe service when statistics are enabled. No change was seen > > > for MT unsafe services, or when statistics are disabled. > > > > > > Reported-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com> > > > Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > > Suggested-by: Morten Brørup <m...@smartsharesystems.com> > > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com> > > > > > > --- > > > > [...] > > > > > + if (service_mt_safe(s)) { > > > + __atomic_fetch_add(&s->cycles_spent, cycles, > > > __ATOMIC_RELAXED); > > > + __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED); > > > + } else { > > > + s->cycles_spent += cycles; > > > + s->calls++; > > > + } > > > > Have you considered the performance cost of the > > __atomic_fetch_add(__ATOMIC_RELAXED) versus the performance cost of > the > > branch to compare if the service is MT safe? It might be cheaper to > just always use > > the atomic addition. I don't know, just mentioning that the compare- > and-branch also > > has a cost. > > Great question! > > > I'm not familiar with the DPDK services library, so perhaps MT safe > and MT unsafe > > services are never mixed, in which case the branch will always take > the same path, > > so branch prediction will eliminate the cost of branching. > > MT safe & unsafe can be mixed yes, so you're right, there may be mis- > predicts. Note that > assuming a service is actually doing something useful, there's likely > quite a few branches > between each call.. so unknown how fresh/accurate the branch history > will be. > > The common case is for many services to be "mt unsafe" (think polling > an ethdev queue). > In this case, it is nice to try reduce cost. Given this is likely the > highest quantity of services, > we'd like the performance here to be reduced the least. The branch > method achieves that. > > I did benchmark the "always use atomic" case, and it caused a ~30cycle > hit in the "mt unsafe" case, > where the atomic is not required (and hence the performance hit can be > avoided by branching). > Given branch-misses are handled between 15-20 cycles (uarch dependent), > attempting to avoid the > atomic still makes sense from cycle-cost perspective too I think.. > > I did spend the morning benchmarking solutions (and hence the patch > split, > to allow easy benchmarking before & after), so thanks for asking! > > Regards, -Harry
Thank you for elaborating, Harry. I am impressed with the considerations you have put into this, and have no further concerns. Reviewed-by: Morten Brørup <m...@smartsharesystems.com>