Dear hpx user list,
one of our projects shows unexpectedly bad speedups when
supplying additional OS worker threads to HPX.
The project runs locally and in parallel on a machine with
8 cores. To pin down the parallelization bottleneck, we
printed the built-in HPX performance counters as shown
below.
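For reference, we queried the counters via command-line
flags along these lines (the program name is a placeholder
and the counter paths are reproduced from memory):

    ./our_app --hpx:threads=8 \
        --hpx:print-counter=/threads{locality#0/total}/count/cumulative \
        --hpx:print-counter=/threads{locality#0/total}/time/average \
        --hpx:print-counter=/threads{locality#0/total}/time/cumulative \
        --hpx:print-counter=/threads{locality#0/total}/time/cumulative-overhead \
        --hpx:print-counter=/threads{locality#0/pool#default/worker-thread#*}/time/cumulative-overhead \
        --hpx:print-counter=/threads{locality#0/total}/idle-rate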
The parallelization is achieved by scheduling tasks with
hpx::apply; these tasks in turn schedule additional tasks
with hpx::apply.
The program terminates after a final task (which can
identify itself and will always finish last, independent of
task scheduling order) fires an event.
Synchronization is performed with some
hpx::lcos::local::mutex locks.
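To illustrate, the task structure looks roughly like this
(heavily simplified; work_item, children_of, is_final and
root are placeholders for our actual logic):

    #include <hpx/hpx.hpp>
    #include <hpx/hpx_main.hpp>
    #include <mutex>

    hpx::lcos::local::mutex state_mtx;  // guards shared state
    hpx::lcos::local::event done;       // fired by the final task

    void process(work_item item)
    {
        {   // short critical section on the shared state
            std::lock_guard<hpx::lcos::local::mutex> l(state_mtx);
            // ... update shared state ...
        }
        for (work_item const& child : children_of(item))
            hpx::apply(&process, child);  // tasks schedule further tasks
        if (is_final(item))               // the task that finishes last
            done.set();                   // fires the termination event
    }

    int main()
    {
        hpx::apply(&process, root());  // kick off the task tree
        done.wait();                   // return only after the final task fired
        return 0;
    }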
The problem becomes apparent in the sharply growing
cumulative overhead per worker thread as more OS threads
are employed.
However, we are rather clueless as to how to interpret
this cumulative-overhead counter.
We were especially surprised to find that the
per-worker-thread overhead at some point came close to, and
eventually even surpassed, the total cumulative runtime
(compare the cumulative overhead of worker-thread 0 in the
run with 8 OS threads against the total cumulative
runtime).
What exactly does the performance counter
/threads/time/cumulative-overhead measure? How can the
overhead be larger than the total execution time?
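To put numbers on it: in the run with 8 OS threads, the
total cumulative overhead is 3.25232e+10 ns (about 32.5 s)
accumulated during a roughly 4.6 s run, and worker-thread 0
alone reports 4.20735e+09 ns of overhead, which already
exceeds the total cumulative runtime of 4.1588e+09 ns.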
How could we narrow down the causes of the growing
overhead? For example, how could we measure how much time
is spent waiting at (specific) mutexes in total?
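If nothing built in exists, we could imagine
hand-instrumenting a specific mutex along these lines (a
rough sketch; locked_section and wait_ns are our own names,
and we assume hpx::util::high_resolution_timer is the
appropriate utility):

    #include <hpx/hpx.hpp>
    #include <atomic>
    #include <cstdint>
    #include <mutex>

    hpx::lcos::local::mutex mtx;
    std::atomic<std::uint64_t> wait_ns(0);  // total wait time over all tasks

    void locked_section()
    {
        hpx::util::high_resolution_timer t;  // starts timing on construction
        std::unique_lock<hpx::lcos::local::mutex> l(mtx);  // blocks here
        wait_ns += static_cast<std::uint64_t>(t.elapsed_nanoseconds());
        // ... critical section ...
    }

but an existing counter or tool would of course be
preferable.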
Thanks in advance,
Kilian Werner
--hpx:threads 1:
/threads{locality#0/total/total}/count/cumulative,1,2.015067,[s],127704
/threads{locality#0/total/total}/time/average,1,2.015073,[s],14938,[ns]
/threads{locality#0/total/total}/time/cumulative,1,2.015074,[s],1.90769e+09,[ns]
/threads{locality#0/total/total}/time/cumulative-overhead,1,2.015076,[s],1.03483e+08,[ns]
/threads{locality#0/pool#default/worker-thread#0}/time/cumulative-overhead,1,2.015076,[s],1.03483e+08,[ns]
/threads{locality#0/total/total}/idle-rate,1,2.015078,[s],514,[0.01%]
--hpx:threads 2:
/threads{locality#0/total/total}/count/cumulative,1,1.814639,[s],112250
/threads{locality#0/total/total}/time/average,1,1.814644,[s],17986,[ns]
/threads{locality#0/total/total}/time/cumulative,1,1.814654,[s],2.01907e+09,[ns]
/threads{locality#0/total/total}/time/cumulative-overhead,1,1.814647,[s],1.60469e+09,[ns]
/threads{locality#0/pool#default/worker-thread#0}/time/cumulative-overhead,1,1.814599,[s],1.12562e+09,[ns]
/threads{locality#0/pool#default/worker-thread#1}/time/cumulative-overhead,1,1.814649,[s],4.79071e+08,[ns]
/threads{locality#0/total/total}/idle-rate,1,1.814603,[s],4428,[0.01%]
--hpx:threads 8:
/threads{locality#0/total/total}/count/cumulative,1,4.597361,[s],109476
/threads{locality#0/total/total}/time/average,1,4.597373,[s],37988,[ns]
/threads{locality#0/total/total}/time/cumulative,1,4.597335,[s],4.1588e+09,[ns]
/threads{locality#0/total/total}/time/cumulative-overhead,1,4.597325,[s],3.25232e+10,[ns]
/threads{locality#0/pool#default/worker-thread#0}/time/cumulative-overhead,1,4.597408,[s],4.20735e+09,[ns]
/threads{locality#0/pool#default/worker-thread#1}/time/cumulative-overhead,1,4.597390,[s],4.08787e+09,[ns]
/threads{locality#0/pool#default/worker-thread#2}/time/cumulative-overhead,1,4.597385,[s],3.62298e+09,[ns]
/threads{locality#0/pool#default/worker-thread#3}/time/cumulative-overhead,1,4.597358,[s],4.12475e+09,[ns]
/threads{locality#0/pool#default/worker-thread#4}/time/cumulative-overhead,1,4.597338,[s],4.10011e+09,[ns]
/threads{locality#0/pool#default/worker-thread#5}/time/cumulative-overhead,1,4.597402,[s],4.14242e+09,[ns]
/threads{locality#0/pool#default/worker-thread#6}/time/cumulative-overhead,1,4.597353,[s],4.13593e+09,[ns]
/threads{locality#0/pool#default/worker-thread#7}/time/cumulative-overhead,1,4.597408,[s],4.13275e+09,[ns]
/threads{locality#0/total/total}/idle-rate,1,4.597350,[s],8867,[0.01%]