Kor,

I think you've just found a bug (all my fault...) at a very good time. There 
are indeed configuration options that affect how the scheduler steals tasks and 
it looks like I've set them to very inappropriate values. Stay tuned for a PR.


On the second point: you're probably seeing your hpx_main/main function 
labeled as run_helper. Because of some changes in APEX, the name recorded for 
a task in the OTF file is the first name that task was given. We don't have a 
good solution for this yet. As a hack, you could try doing this as the first 
thing in your main function (note: that's main if you include hpx_main.hpp, or 
hpx_main if you include hpx_init.hpp or hpx_start.hpp):


hpx::util::annotate_function annotation("my_main");
hpx::this_thread::yield();


Once the task is rescheduled it should have the label "my_main".


Mikael

________________________________
From: hpx-users-boun...@stellar.cct.lsu.edu 
<hpx-users-boun...@stellar.cct.lsu.edu> on behalf of Jong, K. de (Kor) 
<k.dejo...@uu.nl>
Sent: Wednesday, December 18, 2019 3:20:59 PM
To: hpx-users@stellar.cct.lsu.edu
Subject: Re: [hpx-users] parallel_executor::post dominating APEX trace

Hi Mikael,

Thank you for your detailed and helpful answer! It starts to make sense
to me now. My program almost behaves as I expect it should. I still have
two questions though. Maybe you or someone else can point me in the
right direction?

1. I run my program on a single node and expect that all threads receive
about the same number of same-sized tasks. The APEX trace in Vampir
shows that all threads start busy but after a while gaps appear on some
OS threads during which nothing seems to happen, while other threads are
still performing tasks. I would expect tasks to be more evenly
distributed and/or to be stolen from the task queues of other OS
threads. Is this assumption correct? Can I increase the tendency of the
scheduler to steal tasks to keep OS threads busy?

2. I perform scaling tests, and each time my tasks run in parallel there
is a serial 'run_helper' task that runs on a single OS thread. What is
this and can I somehow keep it out of my timings? Based on a quick look
at the HPX code I concluded that run_helper has to do with initializing
the HPX runtime. But even if I run my tasks multiple times (from the
same running process), run_helper still takes up time before my tasks do.

I focus on a single node now (1-96 OS threads) and am not doing anything
too clever, I think. I don't tweak the bindings and scheduler yet.

Thanks!

Kor
_______________________________________________
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users