Dear HPX users,

one of our projects shows unexpectedly poor speedups when 
supplying additional OS worker threads to HPX.
The project runs locally, in parallel, on a machine with 
8 cores. To pin down the parallelization bottleneck, we 
printed the built-in HPX performance counters shown below.
The parallelization is achieved by scheduling tasks with 
hpx::apply, which in turn schedule further tasks with 
hpx::apply.
The program terminates after a final task (which can 
identify itself and always finishes last, independent of 
task scheduling order) fires an event.
Shared state is synchronized with a handful of 
hpx::lcos::local::mutex locks.
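
For concreteness, here is a minimal, stripped-down sketch 
of the pattern (the names, the depth-based recursion, the 
promise/future pair standing in for our event, and the 
header choices are made up for illustration; our real code 
differs):

#include <hpx/hpx.hpp>
#include <hpx/hpx_init.hpp>

#include <atomic>
#include <mutex>

hpx::lcos::local::mutex task_mutex;    // guards shared state
hpx::lcos::local::promise<void> done;  // "event" fired by the last task
std::atomic<int> outstanding(1);       // tasks scheduled but not yet finished

void work(int depth)
{
    {
        std::lock_guard<hpx::lcos::local::mutex> lk(task_mutex);
        // ... update shared state ...
    }

    if (depth > 0)
    {
        // each task may schedule further tasks with hpx::apply
        outstanding += 2;
        hpx::apply(&work, depth - 1);
        hpx::apply(&work, depth - 1);
    }

    // the task that identifies itself as the last one fires the event
    if (--outstanding == 0)
        done.set_value();
}

int hpx_main(int, char**)
{
    hpx::apply(&work, 10);          // root task
    done.get_future().get();        // terminate once the event has fired
    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::init(argc, argv);
}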

The problem is most apparent in the sharply growing 
cumulative-overhead per worker thread as more OS threads 
are employed.
However, we are unsure how to interpret this 
cumulative-overhead counter.
We were especially surprised to find that the 
per-worker-thread overhead at some point came close to, 
and even surpassed, the total cumulative runtime (compare 
the cumulative overhead of worker-thread#0 in the 8-thread 
run with the total cumulative runtime).

What exactly does the performance counter 
/threads/time/cumulative-overhead measure? How can the 
overhead be larger than the total execution time?
How could we narrow down the causes of the growing 
overhead? For example, how could we measure the total time 
spent waiting at (specific) mutexes?
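
To make the last question more concrete: what we would 
like to avoid is hand-rolled instrumentation along the 
lines of the sketch below (a hypothetical wrapper of our 
own, not something provided by HPX):

#include <hpx/hpx.hpp>

#include <atomic>
#include <chrono>
#include <cstdint>

// hypothetical wrapper accumulating the time callers spend
// blocked in lock() on one specific mutex
struct wait_timing_mutex
{
    hpx::lcos::local::mutex mtx;
    std::atomic<std::uint64_t> wait_ns{0};    // total wait time [ns]

    void lock()
    {
        auto t0 = std::chrono::steady_clock::now();
        mtx.lock();                           // may suspend the HPX thread
        auto t1 = std::chrono::steady_clock::now();
        wait_ns += static_cast<std::uint64_t>(
            std::chrono::duration_cast<std::chrono::nanoseconds>(
                t1 - t0).count());
    }

    void unlock() { mtx.unlock(); }
};

(It still works with std::lock_guard; we would read out 
wait_ns at shutdown. If an existing counter or facility 
already exposes this kind of information, we would much 
rather use that.)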

Thanks in advance,

Kilian Werner



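For reference, the counter values below were printed with 
HPX's --hpx:print-counter option, roughly as follows (the 
binary name is a placeholder and the counter list is 
abbreviated):

./our_app --hpx:threads=8 \
    --hpx:print-counter=/threads{locality#0/total/total}/count/cumulative \
    --hpx:print-counter=/threads{locality#0/total/total}/time/cumulative \
    --hpx:print-counter=/threads{locality#0/total/total}/time/cumulative-overhead \
    --hpx:print-counter=/threads{locality#0/pool#default/worker-thread#*}/time/cumulative-overhead \
    --hpx:print-counter=/threads{locality#0/total/total}/idle-rate
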
--hpx:threads 1:

/threads{locality#0/total/total}/count/cumulative,1,2.015067,[s],127704
/threads{locality#0/total/total}/time/average,1,2.015073,[s],14938,[ns]
/threads{locality#0/total/total}/time/cumulative,1,2.015074,[s],1.90769e+09,[ns]
/threads{locality#0/total/total}/time/cumulative-overhead,1,2.015076,[s],1.03483e+08,[ns]
/threads{locality#0/pool#default/worker-thread#0}/time/cumulative-overhead,1,2.015076,[s],1.03483e+08,[ns]
/threads{locality#0/total/total}/idle-rate,1,2.015078,[s],514,[0.01%]

--hpx:threads 2:

/threads{locality#0/total/total}/count/cumulative,1,1.814639,[s],112250
/threads{locality#0/total/total}/time/average,1,1.814644,[s],17986,[ns]
/threads{locality#0/total/total}/time/cumulative,1,1.814654,[s],2.01907e+09,[ns]
/threads{locality#0/total/total}/time/cumulative-overhead,1,1.814647,[s],1.60469e+09,[ns]
/threads{locality#0/pool#default/worker-thread#0}/time/cumulative-overhead,1,1.814599,[s],1.12562e+09,[ns]
/threads{locality#0/pool#default/worker-thread#1}/time/cumulative-overhead,1,1.814649,[s],4.79071e+08,[ns]
/threads{locality#0/total/total}/idle-rate,1,1.814603,[s],4428,[0.01%]

--hpx:threads 8:

/threads{locality#0/total/total}/count/cumulative,1,4.597361,[s],109476
/threads{locality#0/total/total}/time/average,1,4.597373,[s],37988,[ns]
/threads{locality#0/total/total}/time/cumulative,1,4.597335,[s],4.1588e+09,[ns]
/threads{locality#0/total/total}/time/cumulative-overhead,1,4.597325,[s],3.25232e+10,[ns]
/threads{locality#0/pool#default/worker-thread#0}/time/cumulative-overhead,1,4.597408,[s],4.20735e+09,[ns]
/threads{locality#0/pool#default/worker-thread#1}/time/cumulative-overhead,1,4.597390,[s],4.08787e+09,[ns]
/threads{locality#0/pool#default/worker-thread#2}/time/cumulative-overhead,1,4.597385,[s],3.62298e+09,[ns]
/threads{locality#0/pool#default/worker-thread#3}/time/cumulative-overhead,1,4.597358,[s],4.12475e+09,[ns]
/threads{locality#0/pool#default/worker-thread#4}/time/cumulative-overhead,1,4.597338,[s],4.10011e+09,[ns]
/threads{locality#0/pool#default/worker-thread#5}/time/cumulative-overhead,1,4.597402,[s],4.14242e+09,[ns]
/threads{locality#0/pool#default/worker-thread#6}/time/cumulative-overhead,1,4.597353,[s],4.13593e+09,[ns]
/threads{locality#0/pool#default/worker-thread#7}/time/cumulative-overhead,1,4.597408,[s],4.13275e+09,[ns]
/threads{locality#0/total/total}/idle-rate,1,4.597350,[s],8867,[0.01%]

