Hello,

Reverting to srun fixes the problem. I updated the master branches for the
testsuite results and simfactory.

Gabriele

On Thu, Jun 2, 2022 at 9:12 AM Gabriele Bozzola <[email protected]> wrote:

> Hi Roland,
>
> That sounds reasonable. I think I was originally using srun, but was
> recommended to move to ibrun. I will try with srun to see if it works,
> in which case I will update the simfactory entry and the testsuite
> results.
>
> Gabriele
>
> On Thu, Jun 2, 2022 at 8:03 AM Roland Haas <[email protected]> wrote:
>
>> Hello Gabriele,
>>
>> ok, I can at least partially answer this. Indeed RNS's A2 test is coded
>> to use only 1 MPI rank:
>>
>> TEST rnsA2
>> {
>>   PROCS 1
>> }
>>
>> and thus the most likely reason is that ibrun just pulls the number of
>> MPI ranks from SLURM rather than from whatever simfactory tries to use.
>>
>> Since ibrun is no longer documented on the SDSC page (at least I do not
>> see it on https://www.sdsc.edu/support/user_guides/expanse.html), maybe
>> the easiest fix is to remove it and use the srun command they document
>> now?
>>
>> Yours,
>> Roland
>>
>> > Hello Gabriele,
>> >
>> > hmm.
>> >
>> > > /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:148:
>> > > -> The environment variable CACTUS_NUM_PROCS is set to 1, but there are 2
>> > > MPI processes. This may indicate a severe problem with the MPI startup
>> > > mechanism.
>> >
>> > > IBRUN: launch command: srun -n 2 --ntasks-per-node 2
>> > > /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/SIMFACTORY/exe/cactus_sim
>> >
>> > Looking at these, I would have expected that CACTUS_NUM_PROCS is set to
>> > 2 given that -n is 2 (being the number of MPI ranks).
>> >
>> > The current submitscript uses ibrun though current documentation uses
>> > srun. Maybe changing to srun helps? Though the srun command does seem
>> > to have 2 MPI procs in the way you expect.
>> >
>> > Can you check (in the RunScript in
>> > simulations/foo/output-0000/SIMFACTORY) what CACTUS_NUM_PROCS is set to?
>> >
>> > If this works with "regular" runs but fails with the testsuite using
>> > --testsuite then the issue is most likely related to the complicated
>> > method simfactory has to use to set the number of MPI ranks.
>> >
>> > I would check if the failing test is actually runnable only on 1 MPI
>> > rank (set in test.ccl). In that case, Cactus will try to run it in a
>> > 2 MPI rank test suite but use only 1 MPI rank. Possibly ibrun ignores
>> > Cactus' request and uses only information provided by SLURM.
>> >
>> > Yours,
>> > Roland
>>
>> --
>> My email is as private as my paper mail. I therefore support encrypting
>> and signing email messages. Get my PGP key from http://pgp.mit.edu .

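For readers following the thread, the mismatch reported from SetupGH.cc can be
illustrated with a small sketch. This is not Carpet's actual code: the function
name check_rank_consistency and its arguments are hypothetical, and the only
things taken from the thread are the CACTUS_NUM_PROCS variable and the wording
of the warning. The sketch assumes the launcher reports how many MPI ranks it
really started, which is the value ibrun appears to take from SLURM.

```python
import os


def check_rank_consistency(launched_ranks, env=None):
    """Compare CACTUS_NUM_PROCS with the number of ranks actually launched.

    Hypothetical helper for illustration only. Returns None when the two
    values agree (or when the variable is unset), otherwise a warning string
    modelled on the message quoted in the thread.
    """
    env = os.environ if env is None else env
    requested = env.get("CACTUS_NUM_PROCS")
    if requested is None:
        # Nothing to compare against, so no warning is produced.
        return None
    if int(requested) != launched_ranks:
        return (
            f"The environment variable CACTUS_NUM_PROCS is set to {requested}, "
            f"but there are {launched_ranks} MPI processes."
        )
    return None


# Scenario from the thread: the test asks for 1 rank (PROCS 1), but the
# launcher starts 2 ranks because it reads the count from the batch system.
print(check_rank_consistency(2, {"CACTUS_NUM_PROCS": "1"}))

# After switching to a launcher that honours the requested count, the two
# values agree and no warning is produced.
print(check_rank_consistency(1, {"CACTUS_NUM_PROCS": "1"}))
```

The point of the sketch is only that the warning fires whenever the launcher
and the environment disagree, independent of which of the two is "right".
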
_______________________________________________
Users mailing list
[email protected]
http://lists.einsteintoolkit.org/mailman/listinfo/users
