Kor -

Well, those messages may still be legitimate - they could be related to clock synchronization (a receive happening “before” a send), and there appear to be some unmatched messages during initialization or finalization (I assume). Unfortunately, Vampir doesn’t give any more information, and otf2-print doesn’t report that anything is wrong with the trace. I recently added clock synchronization for MPI applications; I need to do the same for HPX applications.
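To give a rough idea of what that correction involves, here is a minimal sketch of a per-process linear clock correction (a simplified illustration only, not the actual APEX code; the offset measurements are assumed to come from synchronization points at the start and end of the run):

// Minimal sketch of a linear clock correction (illustration only; not the
// actual APEX implementation -- the offsets are assumed to be measured at a
// synchronization point at the start and end of the run).
#include <cstdint>

struct ClockModel {
    uint64_t local_begin;   // local time at the start-of-run sync point
    uint64_t local_end;     // local time at the end-of-run sync point
    int64_t  offset_begin;  // measured offset to the reference clock at start
    int64_t  offset_end;    // measured offset to the reference clock at end
};

// Map a raw local timestamp onto the reference timeline by interpolating
// between the two measured offsets, so that a receive can no longer appear
// to happen before its matching send.
inline uint64_t to_reference_time(const ClockModel& m, uint64_t local_ts) {
    const double span = static_cast<double>(m.local_end - m.local_begin);
    const double t = span > 0.0
        ? static_cast<double>(local_ts - m.local_begin) / span
        : 0.0;
    const double offset = static_cast<double>(m.offset_begin)
        + t * static_cast<double>(m.offset_end - m.offset_begin);
    return static_cast<uint64_t>(static_cast<double>(local_ts) + offset);
}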
The range violation happens when there is one event after the reported “end” timestamp of the trace. I take that timestamp during the “shutdown” step of APEX because the post-processing can take a long time (which can happen with asynchronous CUDA/HIP activity processing) and I don’t want it to dilate the total trace time. At any rate, there appears to be a race condition in shutdown that allows events to happen after I have taken that last timestamp.

I’m glad you find the trace output useful! The APEX + HPX integration has been a long-running collaboration between LSU and UO.

Thanks -
Kevin

> On Sep 16, 2021, at 10:53 PM, Kor de Jong <[email protected]> wrote:
>
> Hi Kevin,
>
> Thank you for your explanation. I now better understand how I should read the Vampir trace.
>
> You say "I assume 1 process per physical node". My trace involved 8 processes on a single node. Maybe that explains the messages Vampir throws at me:
>
> >> Event matching irregular - 79624 in total
> >> Pending messages - 120 in total
> >> Range violation - 1 in total
>
> I will try again using a single process per node -- using multiple nodes.
>
> BTW, being able to trace tasks this way is very useful! I don't know who is responsible for the APEX + HPX integration, but I think it is great.
>
> Best regards,
> Kor
>
>
> On 9/17/21 12:09 AM, Kevin Huck wrote:
>> Kor -
>> Sorry I didn’t reply sooner… I’m glad things are working for you now!
>>
>> The thread naming is a bit odd, because Vampir changed how they display the names (from version 8 to version 9, I think), and I haven’t really tried that hard to make sure that the names accurately reflect the physical hardware. But the process/thread hierarchy is correct, even if the naming looks odd. For example:
>>
>> CPU thread 1:1 - this is the main thread of the program, although HPX doesn’t use it to execute tasks. APEX attributes all communication to this thread.
>> CPU thread 2:1 - this is the first worker thread spawned by HPX.
>> CPU thread 4:1 - this is the second worker thread spawned by HPX. APEX has numbered it oddly, because thread 3 is internal to APEX.
>> CPU thread 5:1 - etc.
>> CPU thread 6:1
>> CPU thread 7:1
>> CPU thread 8:1
>>
>> I hope that explains things. I don’t use “hwloc” or any library like that to construct a perfectly accurate system hardware hierarchy, because it hasn’t been worth the effort. For the tracing, I assume 1 process per physical node, and as long as the OS processes and threads are annotated, it works. Just wait until you see how I annotate the GPU threads… 🙃 (it’s actually not that bad)
>>
>> Thanks -
>> Kevin
>>
>>> On Sep 6, 2021, at 10:55 AM, Kor de Jong <[email protected]> wrote:
>>>
>>> [I sent the message below to the HPX mailing list and forgot to cc you.]
>>>
>>> Hi Kevin,
>>>
>>> On 9/3/21 6:26 PM, Kevin Huck wrote:
>>>> Most versions of OTF2 (2.2 and lower, I believe) had an uninitialized variable that sometimes led to this error message and prematurely exited the initialization process, leading to other problems. Which version of OTF2 are you using?
>>> I used 2.2 but now switched to 2.3, and I applied this patch:
>>>
>>> --- src/otf2_archive_int.c-org    2021-09-06 11:27:07.439272261 +0200
>>> +++ src/otf2_archive_int.c        2021-09-06 11:28:15.735032626 +0200
>>> @@ -1083,7 +1083,7 @@
>>>      archive->global_comm_context = globalCommContext;
>>>      archive->local_comm_context = localCommContext;
>>>
>>> -    OTF2_ErrorCode status;
>>> +    OTF2_ErrorCode status = OTF2_SUCCESS;
>>>
>>>      /* It is time to create the directories by the root rank. */
>>>      if ( archive->file_mode == OTF2_FILEMODE_WRITE )
>>>
>>> This got rid of the error message, and a trace is now being generated. Great! But I wonder whether the trace is correct. Vampir reports:
>>>
>>> Event matching irregular - 79624 in total
>>> Pending messages - 120 in total
>>> Range violation - 1 in total
>>>
>>> I posted a screenshot of the trace here:
>>>
>>> https://surfdrive.surf.nl/files/index.php/s/MWbhZFPv733tgMX
>>>
>>> I see 8 nested groups of 6 CPU threads, which is good. The numbering / labeling is weird though. Each group of 6 CPUs is a process running on a NUMA node.
>>>
>>>> It’s possible I have the wrong SLURM environment variables. Could you please do something like the following on your system (with a small test case) and see what you get?
>>>> `srun <srun arguments> env | grep SLURM`
>>>
>>> My goal is to trace a job with 8 HPX processes on a single node. This node contains 8 NUMA nodes, each containing 6 real cores.
>>>
>>> salloc --partition=allq --nodes=1 --ntasks=8 --cpus-per-task=12 --cores-per-socket=6 env | grep SLURM
>>>
>>> SLURM_SUBMIT_DIR=/quanta1/home/jong0137/development/project/lue
>>> SLURM_SUBMIT_HOST=login01.cluster
>>> SLURM_JOB_ID=3429945
>>> SLURM_JOB_NAME=env
>>> SLURM_JOB_NUM_NODES=1
>>> SLURM_JOB_NODELIST=node008
>>> SLURM_NODE_ALIASES=(null)
>>> SLURM_JOB_PARTITION=allq
>>> SLURM_JOB_CPUS_PER_NODE=96
>>> SLURM_JOBID=3429945
>>> SLURM_NNODES=1
>>> SLURM_NODELIST=node008
>>> SLURM_TASKS_PER_NODE=8
>>> SLURM_JOB_ACCOUNT=depfg
>>> SLURM_JOB_QOS=depfg
>>> SLURM_NTASKS=8
>>> SLURM_NPROCS=8
>>> SLURM_CPUS_PER_TASK=12
>>> SLURM_CLUSTER_NAME=cluster
>>>
>>> I use mpirun to start my HPX program. Not sure if this is useful, but these are the MPI variables set.
>>>
>>> Each of the 8 processes prints these same values:
>>>
>>> OMPI_APP_CTX_NUM_PROCS=8
>>> OMPI_COMM_WORLD_LOCAL_SIZE=8
>>> OMPI_COMM_WORLD_SIZE=8
>>> OMPI_FIRST_RANKS=0
>>> OMPI_UNIVERSE_SIZE=8
>>>
>>> These are different for each of the 8 processes:
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=0
>>> OMPI_COMM_WORLD_NODE_RANK=0
>>> OMPI_COMM_WORLD_RANK=0
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=1
>>> OMPI_COMM_WORLD_NODE_RANK=1
>>> OMPI_COMM_WORLD_RANK=1
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=2
>>> OMPI_COMM_WORLD_NODE_RANK=2
>>> OMPI_COMM_WORLD_RANK=2
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=3
>>> OMPI_COMM_WORLD_NODE_RANK=3
>>> OMPI_COMM_WORLD_RANK=3
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=4
>>> OMPI_COMM_WORLD_NODE_RANK=4
>>> OMPI_COMM_WORLD_RANK=4
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=5
>>> OMPI_COMM_WORLD_NODE_RANK=5
>>> OMPI_COMM_WORLD_RANK=5
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=6
>>> OMPI_COMM_WORLD_NODE_RANK=6
>>> OMPI_COMM_WORLD_RANK=6
>>>
>>> OMPI_COMM_WORLD_LOCAL_RANK=7
>>> OMPI_COMM_WORLD_NODE_RANK=7
>>> OMPI_COMM_WORLD_RANK=7
>>>
>>> Thanks for looking into this!
>>>
>>> Kor
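For illustration, a rank lookup based on launcher variables like these might look like the sketch below - a simplified example only, not the actual APEX code; the variable list and fallback order here are assumptions:

// Simplified sketch: pick a process rank from launcher environment variables
// (illustration only; the set of variables and their priority are assumptions).
#include <cstdlib>
#include <cstdio>

static int detect_rank() {
    const char* candidates[] = {
        "OMPI_COMM_WORLD_RANK",  // Open MPI / mpirun (as in the output above)
        "SLURM_PROCID",          // srun-launched jobs
        "PMI_RANK"               // MPICH-style launchers
    };
    for (const char* name : candidates) {
        if (const char* value = std::getenv(name)) {
            return std::atoi(value);
        }
    }
    return 0;  // single-process run or unknown launcher
}

int main() {
    std::printf("detected rank: %d\n", detect_rank());
    return 0;
}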
--
Kevin Huck, PhD
Research Associate / Computer Scientist
OACISS - Oregon Advanced Computing Institute for Science and Society
University of Oregon
[email protected]
http://tau.uoregon.edu
http://oaciss.uoregon.edu
_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
