On 2016-10-19 10:59:33 [-0700], Davidlohr Bueso wrote:
> Sebastian noted that the overhead of worker thread ops (throughput)
> accounting was causing 'perf' to appear in the profiles, consuming
> a non-trivial (i.e. 13%) amount of CPU. This is caused by cacheline
> bouncing from the increment of w->ops. We can easily fix this by
> just working on a local copy and updating the actual worker once
> done running, and ready to show the program summary. There is no
> danger of the worker being concurrent, so we can trust that no stale
> value is being seen by another thread.
> 
> Reported-by: Sebastian Andrzej Siewior <bige...@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bige...@linutronix.de>

> --- a/tools/perf/bench/futex-hash.c
> +++ b/tools/perf/bench/futex-hash.c
> @@ -63,8 +63,9 @@ static const char * const bench_futex_hash_usage[] = {
> static void *workerfn(void *arg)
> {
>       int ret;
> -     unsigned int i;
>       struct worker *w = (struct worker *) arg;
> +     unsigned int i;
> +     unsigned long ops = w->ops; /* avoid cacheline bouncing */

We start at 0, so there is probably no need to initialize it with w->ops.
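
For illustration, here is a minimal user-space sketch of the local-counter
pattern, with the counter starting at 0. It is not the futex-hash.c code:
the struct layout, the iters field and run_workers() are hypothetical, and
the loop body only stands in for the real futex call.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the benchmark's struct worker. */
struct worker {
	pthread_t thread;
	unsigned long ops;	/* written once, when the worker is done */
	unsigned long iters;	/* hypothetical: fixed amount of work */
};

static void *workerfn(void *arg)
{
	struct worker *w = arg;
	unsigned long ops = 0;	/* local copy, avoids cacheline bouncing */
	unsigned long i;

	for (i = 0; i < w->iters; i++)
		ops++;		/* stands in for one futex operation */

	w->ops = ops;		/* single write back to the shared struct */
	return NULL;
}

/* Hypothetical driver: start the workers, join them, sum their ops. */
static unsigned long run_workers(unsigned int nthreads, unsigned long iters)
{
	struct worker *workers = calloc(nthreads, sizeof(*workers));
	unsigned long total = 0;
	unsigned int i;

	if (!workers)
		return 0;

	for (i = 0; i < nthreads; i++) {
		workers[i].iters = iters;
		if (pthread_create(&workers[i].thread, NULL, workerfn,
				   &workers[i])) {
			nthreads = i;	/* only join what was started */
			break;
		}
	}

	/* w->ops is read only after the join, so no stale value is seen. */
	for (i = 0; i < nthreads; i++) {
		pthread_join(workers[i].thread, NULL);
		total += workers[i].ops;
	}

	free(workers);
	return total;
}
```

The workers sit next to each other in one array, so a per-iteration
w->ops++ would have neighbouring threads writing to the same cachelines.
With the local counter, each worker touches its struct once at the end.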

Sebastian
