Hi HPX experts,

Once in a while I try to generate OTF2 traces for distributed HPX runs, 
but this has never worked for me. A year ago, I asked this mailing list 
for help. Even with the helpful replies I could not get it to work, and 
I moved on.

But being able to generate these traces would actually be very useful 
for me, so I tried it yet another time, with the current HPX 1.7.1.

I wonder whether anybody is actually doing this as well, and succeeding. 
If so, can you please explain how?

Generating a trace for a single process works fine. Generating a trace 
for multiple processes (8) on the same node(!) fails:

OTF2 Error: INVALID, Unknown error code 

[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
OTF2 Error: INVALID, Unknown error code 

OTF2 Error: INVALID, Unknown error code 

[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
OTF2 Error: INVALID, Unknown error code 

[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
OTF2 Error: INVALID, Unknown error code 

[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
OTF2 Error: INVALID, Unknown error code 

[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
[OTF2] src/otf2_archive_int.c:1108: error: Unknown error code: Couldn't 
create directories on root.
OTF2 Error: INVALID, Unknown error code

BTW, I always get the messages below as well. I am not sure whether 
they are relevant. Following Hartmut's reply (April 19, this list), I am 
ignoring these messages.

hpx::init: command line warning: --hpx:localities used when running with 
SLURM, requesting a different number of localities (8) than have been 
assigned by SLURM (1), the application might not run properly. 
 

...
repeated 8 times
...

Hartmut writes:

"We have never been able to properly figure that one out. In my 
experience, you can ignore the warning (as long as the binding 
information looks correct)."

In his reply to my question last year (Sep 10, 2020), John B writes:

"This reminds me of the error that is produced when all ranks think 
they're rank 0 - so apex is not getting the correct initialization info. 
(Ranks 1-N-1 try to create the otf files and clobber each other)"

I have no idea, but maybe these things are related?

Thanks for any info!

Kor
_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
