Hello Gabriele,

Great. Thank you. I will backport to the release branch.

Yours,
Roland

> Hello,
>
> Reverting to srun fixes the problem. I updated the master branches for the
> testsuite results and simfactory.
>
> Gabriele
>
> On Thu, Jun 2, 2022 at 9:12 AM Gabriele Bozzola <[email protected]>
> wrote:
>
> > Hi Roland,
> >
> > That sounds reasonable. I think I was originally using srun, but was
> > recommended to move to ibrun. I will try with srun to see if it works,
> > in which case I will update the simfactory entry and the testsuite
> > results.
> >
> > Gabriele
> >
> > On Thu, Jun 2, 2022 at 8:03 AM Roland Haas <[email protected]> wrote:
> >
> >> Hello Gabriele,
> >>
> >> ok, I can at least partially answer this. Indeed RNS's A2 test is coded
> >> to use only 1 MPI rank:
> >>
> >> TEST rnsA2
> >> {
> >> PROCS 1
> >> }
> >>
> >> and thus the most likely reason is that ibrun just pulls the number of
> >> MPI ranks from SLURM rather than from whatever simfactory tries to use.
> >>
> >> Since ibrun is no longer documented on the SDSC page (at least I do not
> >> see it on
> >> https://urldefense.com/v3/__https://www.sdsc.edu/support/user_guides/expanse.html__;!!DZ3fjg!6AKe0V5ww0am4Al2yttt_J0jlb9QxoSIlmt7-krRMsZOPCIG_Mo_95py9qR5wZ7lc_UNn5p5hqSBzROP7fbS0QzBYvI$
> >> ), maybe the easiest fix is to remove it and use the srun command they
> >> document now?
> >>
> >> Yours,
> >> Roland
> >>
> >> > Hello Gabriele,
> >> >
> >> > hmm.
> >> >
> >> > > /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:148:
> >> > > -> The environment variable CACTUS_NUM_PROCS is set to 1, but there
> >> > > are 2 MPI processes. This may indicate a severe problem with the MPI
> >> > > startup mechanism.
> >> >
> >> > > IBRUN: launch command: srun -n 2 --ntasks-per-node 2
> >> > > /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/SIMFACTORY/exe/cactus_sim
> >> >
> >> > Looking at these, I would have expected that CACTUS_NUM_PROCS is set to
> >> > 2 given that -n is 2 (being the number of MPI ranks).
> >> >
> >> > The current submitscript uses ibrun though current documentation uses
> >> > srun. Maybe changing to srun helps? Though the srun command does seem
> >> > to have 2 MPI procs in the way you expect to.
> >> >
> >> > Can you check (in the RunScript in
> >> > simulations/foo/output-0000/SIMFACTORY) what CACTUS_NUM_PROCS is set to?
> >> >
> >> > If this works with "regular" runs but fails with the testsuite using
> >> > --testsuite then the issue is most likely related to the complicated
> >> > method simfactory has to use to set the number of MPI ranks.
> >> >
> >> > I would check if the failing test is actually runnable only on 1 MPI
> >> > rank (set in test.ccl). In that case, Cactus will try to run it in a
> >> > 2 MPI rank test suite but use only 1 MPI rank. Possibly ibrun ignores
> >> > Cactus' request and uses only information provided by SLURM.
> >> >
> >> > Yours,
> >> > Roland
> >> >
> >>
> >> --
> >> My email is as private as my paper mail. I therefore support encrypting
> >> and signing email messages. Get my PGP key from
> >> https://urldefense.com/v3/__http://pgp.mit.edu__;!!DZ3fjg!6AKe0V5ww0am4Al2yttt_J0jlb9QxoSIlmt7-krRMsZOPCIG_Mo_95py9qR5wZ7lc_UNn5p5hqSBzROP7fbSgjJHL7k$
> >> .
> >>
> >

-- 
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .
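
For reference, the kind of consistency check behind the Carpet warning quoted in this thread can be sketched roughly as follows. This is only an illustration in Python, not the actual code in SetupGH.cc: the function name `check_num_procs` is hypothetical, and skipping the check when CACTUS_NUM_PROCS is unset is an assumption.

```python
import os


def check_num_procs(mpi_size, environ=None):
    """Compare CACTUS_NUM_PROCS with the number of MPI ranks actually started.

    Hypothetical helper: returns a warning string on mismatch, otherwise None.
    `mpi_size` is whatever the launcher (srun, ibrun, ...) really started.
    """
    env = os.environ if environ is None else environ
    requested = env.get("CACTUS_NUM_PROCS")
    if requested is None:
        # Assumption: nothing to compare against if the variable is unset.
        return None
    try:
        requested_n = int(requested)
    except ValueError:
        return f"CACTUS_NUM_PROCS is set to {requested!r}, which is not an integer."
    if requested_n != mpi_size:
        return (
            f"The environment variable CACTUS_NUM_PROCS is set to {requested_n}, "
            f"but there are {mpi_size} MPI processes."
        )
    return None


if __name__ == "__main__":
    # The situation from this thread: a 1-rank test started with 2 MPI ranks.
    print(check_num_procs(2, {"CACTUS_NUM_PROCS": "1"}))
    # What one would hope to see once the launcher honours the request.
    print(check_num_procs(1, {"CACTUS_NUM_PROCS": "1"}))
```

With a launcher that takes the rank count from SLURM instead of from the value simfactory sets, the first case is what the 1-rank rnsA2 test would run into inside a 2-rank test suite job.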
_______________________________________________ Users mailing list [email protected] http://lists.einsteintoolkit.org/mailman/listinfo/users
