It's not too late to do a check, though, to see whether all the other nodes have the same OMP_NUM_THREADS value. Maybe that's the warning you mentioned? It sounds like it should be an error rather than just a warning.
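Something along these lines, for instance. This is just a sketch: the function name is made up, and a real version inside Cactus would use CCTK_VWarn/CCTK_ERROR instead of fprintf/MPI_Abort, but it shows the broadcast-and-compare idea. It assumes it is called on every process after MPI_Init.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

/* Hypothetical startup check: compare this process's OpenMP thread
 * count against the value on process 0 and complain on a mismatch. */
void CheckThreadConsistency(void)
{
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int nthreads = omp_get_max_threads();  /* what OMP_NUM_THREADS gave us */
  int root_nthreads = nthreads;
  MPI_Bcast(&root_nthreads, 1, MPI_INT, 0, MPI_COMM_WORLD);

  if (nthreads != root_nthreads) {
    fprintf(stderr,
            "Process %d uses %d OpenMP threads, but process 0 uses %d; "
            "OMP_NUM_THREADS was probably not forwarded to this node.\n",
            rank, nthreads, root_nthreads);
    MPI_Abort(MPI_COMM_WORLD, 1);  /* or downgrade this to a warning */
  }
}

That still wouldn't fix the run itself, of course; with Open MPI the variable can be forwarded explicitly (e.g. "mpirun -x OMP_NUM_THREADS ..."), as Erik already suggested.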
--Steve

On 12/8/2022 5:23 PM, Erik Schnetter wrote:
> Steve
>
> Code that runs as part of the Cactus executable is running too late
> for this. At that time, OpenMP has already been initialized.
>
> There is the environment variable "CACTUS_NUM_THREADS" which is
> checked at run time, but only if it is set (for backward
> compatibility). Most people do not bother setting it, leaving this
> error undetected. There is a warning output, but these are generally
> ignored.
>
> -erik
>
> On Thu, Dec 8, 2022 at 3:48 PM Steven R. Brandt <[email protected]> wrote:
>> We could probably add some startup code in which MPI broadcasts the
>> OMP_NUM_THREADS setting to all the other processes and either checks
>> the value of the environment variable or calls omp_set_num_threads()
>> or some such.
>>
>> --Steve
>>
>> On 12/8/2022 9:03 AM, Erik Schnetter wrote:
>>> Spandan
>>>
>>> The problem is likely that MPI does not automatically forward your
>>> OpenMP setting to the other nodes. You are setting the environment
>>> variable OMP_NUM_THREADS in the run script, and it is likely
>>> necessary to forward this environment variable to the other
>>> processes as well. Your MPI documentation will tell you how to do
>>> this. This is likely an additional option you need to pass when
>>> calling "mpirun".
>>>
>>> -erik
>>>
>>> On Thu, Dec 8, 2022 at 2:50 AM Spandan Sarma 19306
>>> <[email protected]> wrote:
>>>> Hello,
>>>>
>>>> This mail is in continuation of the ticket "Issue with compiling ET
>>>> on cluster" by Shamim.
>>>>
>>>> Following Roland's suggestion, we found that using the --prefix
>>>> <openmpi-directory> option together with a hostfile let us run a
>>>> multi-node simulation successfully on our HPC.
>>>>
>>>> Now we find that the BNSM gallery simulation evolves for only 240
>>>> iterations on 2 nodes (16+16 procs, 24 hr walltime), which is very
>>>> slow compared with the 1-node run (16 procs, 24 hr walltime), which
>>>> evolved for 120988 iterations. Parallelization within 1 node works
>>>> well: we reached 120988, 67756, and 40008 iterations for 16, 8, and
>>>> 4 procs (24 hr walltime), respectively. We cannot understand what
>>>> causes this issue when Open MPI is given 2 nodes (16+16 procs).
>>>>
>>>> In the output files we found the following, which may point to the
>>>> issue:
>>>>
>>>> INFO (Carpet): MPI is enabled
>>>> INFO (Carpet): Carpet is running on 32 processes
>>>> INFO (Carpet): This is process 0
>>>> INFO (Carpet): OpenMP is enabled
>>>> INFO (Carpet): This process contains 1 threads, this is thread 0
>>>> INFO (Carpet): There are 144 threads in total
>>>> INFO (Carpet): There are 4.5 threads per process
>>>> INFO (Carpet): This process runs on host n129, pid=20823
>>>> INFO (Carpet): This process runs on 1 core: 0
>>>> INFO (Carpet): Thread 0 runs on 1 core: 0
>>>> INFO (Carpet): This simulation is running in 3 dimensions
>>>> INFO (Carpet): Boundary specification for map 0:
>>>>    nboundaryzones: [[3,3,3],[3,3,3]]
>>>>    is_internal   : [[0,0,0],[0,0,0]]
>>>>    is_staggered  : [[0,0,0],[0,0,0]]
>>>>    shiftout      : [[1,0,1],[0,0,0]]
>>>>
>>>> WARNING level 1 from host n131 process 21
>>>>   in thorn Carpet, file
>>>> /home2/mallick/ET9/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:426:
>>>>   -> The number of threads for this process is larger than its
>>>>      number of cores. This may indicate a performance problem.
>>>>
>>>> This is something that we couldn't understand, as we asked for only
>>>> 32 procs with num-threads set to 1. The command that we used to
>>>> submit our job was:
>>>>
>>>> ./simfactory/bin/sim create-submit p32_mpin_npn --procs=32 --ppn=16
>>>> --num-threads=1 --ppn-used=16 --num-smt=1
>>>> --parfile=par/nsnstohmns1.par --walltime=24:10:00
>>>>
>>>> I have attached the out file, runscript, submitscript, optionlist,
>>>> and machine file for reference. Thanks in advance for the help.
>>>>
>>>> Sincerely,
>>>>
>>>> --
>>>> Spandan Sarma
>>>> BS-MS' 19
>>>> Department of Physics (4th Year),
>>>> IISER Bhopal
