Re: [MTT devel] Uh testbake runs
Jeff Squyres wrote:
> On Oct 3, 2007, at 4:12 AM, Mohamad Chaarawi wrote:
>
>> Yeah, I'm doing that, but not for the already-installed libraries.
>> That's the problem.
>
> Ah - are you saying that we should add these kinds of fields for the
> already-installed plugin:
>
>>> mpich2_additional_wrapper_ldflags = -L/opt/slurm/current/lib
>>> mpich2_additional_wrapper_libs = -lpmi
>
> If so, I'm not sure we can -- these fields take advantage of the fact
> that we know it's MPICH/MVAPICH and know exactly which bits to
> twiddle in their script-based wrapper compilers. We can't assume
> that an already-installed MPI will have a script-based wrapper
> compiler that is exactly like MPICH's wrapper compilers. :-(
>
> Can you not use the already-installed MVAPICH and instead always
> install it? I.e., is there a reason you're trying to use an
> already-installed MVAPICH?

Yeah, I can do that. I thought that since we are downloading a release,
it would be easier to download and install it once and then reuse that
installation to save time. But it's not a big deal to download it and
install it.

--
Mohamad Chaarawi
Instructional Assistant
http://www.cs.uh.edu/~mschaara
Department of Computer Science
University of Houston
4800 Calhoun, PGH Room 526
Houston, TX 77204, USA
Re: [MTT devel] Uh testbake runs
On Oct 3, 2007, at 4:12 AM, Mohamad Chaarawi wrote:

> Yeah, I'm doing that, but not for the already-installed libraries.
> That's the problem.

Ah - are you saying that we should add these kinds of fields for the
already-installed plugin:

>> mpich2_additional_wrapper_ldflags = -L/opt/slurm/current/lib
>> mpich2_additional_wrapper_libs = -lpmi

If so, I'm not sure we can -- these fields take advantage of the fact
that we know it's MPICH/MVAPICH and know exactly which bits to twiddle
in their script-based wrapper compilers. We can't assume that an
already-installed MPI will have a script-based wrapper compiler that is
exactly like MPICH's wrapper compilers. :-(

Can you not use the already-installed MVAPICH and instead always
install it? I.e., is there a reason you're trying to use an
already-installed MVAPICH?

--
Jeff Squyres
Cisco Systems
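To illustrate the distinction drawn above between script-based wrapper compilers and other kinds, here is a minimal Python sketch. It is not anything MTT does; the helper name `is_script_wrapper` and the file names in `demo` are hypothetical. It only checks whether a wrapper file begins with a `#!` interpreter line, which is the precondition for editing flags into it the way the MPICH-specific fields rely on.

```python
import os
import tempfile


def is_script_wrapper(path):
    """Return True if the file at `path` starts with a '#!' interpreter line.

    Hypothetical helper: a wrapper that is a script can be edited as text,
    while a compiled wrapper cannot.
    """
    with open(path, "rb") as handle:
        return handle.read(2) == b"#!"


def demo():
    """Build one fake script wrapper and one fake binary wrapper, then test both."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "mpicc_script")
        binary = os.path.join(tmp, "mpicc_binary")
        with open(script, "wb") as f:
            f.write(b"#!/bin/sh\nexec gcc \"$@\"\n")
        with open(binary, "wb") as f:
            # First bytes of an ELF executable, standing in for a compiled wrapper.
            f.write(b"\x7fELF\x02\x01\x01")
        return is_script_wrapper(script), is_script_wrapper(binary)
```

Even when the check succeeds, the script's layout may differ from MPICH's, so this only rules wrappers out; it does not show that the same edits would apply.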
Re: [MTT devel] Uh testbake runs
Yeah, I'm doing that, but not for the already-installed libraries.
That's the problem.

On Tue, October 2, 2007 4:06 pm, Jeff Squyres wrote:
> On Oct 2, 2007, at 9:53 PM, Mohamad Chaarawi wrote:
>
>> Yeah, I think those problems were from when I was running with the
>> already-installed mpich2. But doesn't mpich pick up slurm from the
>> path directly?
>
> Yes, but you need to link against SLURM's libpmi specifically. In
> the template, I have stuff like this:
>
> [MPI install: MPICH2]
> mpi_get = mpich2
> save_stdout_on_success = 1
> merge_stdout_stderr = 0
> # Adjust this for your site (this is what works at Cisco). Needed to
> # launch in SLURM; adding this to LD_LIBRARY_PATH here propagates this
> # all the way through the test run phases that use this MPI install,
> # where the test executables will need to have this set.
> prepend_path = LD_LIBRARY_PATH /opt/slurm/current/lib
>
> module = MPICH2
> mpich2_compiler_name = gnu
> mpich2_compiler_version = &get_gcc_version()
> mpich2_configure_arguments = --disable-f90 CFLAGS=-O3 --enable-fast --with-device=ch3:nemesis
> # These are needed to launch through SLURM; adjust as appropriate.
> mpich2_additional_wrapper_ldflags = -L/opt/slurm/current/lib
> mpich2_additional_wrapper_libs = -lpmi
>
> Note these last two fields ^^. You'll need to replace the -L value
> with whatever is relevant for your cluster.

--
Mohamad Chaarawi
Instructional Assistant
http://www.cs.uh.edu/~mschaara
Department of Computer Science
University of Houston
4800 Calhoun, PGH Room 526
Houston, TX 77204, USA
Re: [MTT devel] Uh testbake runs
On Oct 2, 2007, at 9:53 PM, Mohamad Chaarawi wrote:

> Yeah, I think those problems were from when I was running with the
> already-installed mpich2. But doesn't mpich pick up slurm from the
> path directly?

Yes, but you need to link against SLURM's libpmi specifically. In the
template, I have stuff like this:

[MPI install: MPICH2]
mpi_get = mpich2
save_stdout_on_success = 1
merge_stdout_stderr = 0
# Adjust this for your site (this is what works at Cisco). Needed to
# launch in SLURM; adding this to LD_LIBRARY_PATH here propagates this
# all the way through the test run phases that use this MPI install,
# where the test executables will need to have this set.
prepend_path = LD_LIBRARY_PATH /opt/slurm/current/lib

module = MPICH2
mpich2_compiler_name = gnu
mpich2_compiler_version = &get_gcc_version()
mpich2_configure_arguments = --disable-f90 CFLAGS=-O3 --enable-fast --with-device=ch3:nemesis
# These are needed to launch through SLURM; adjust as appropriate.
mpich2_additional_wrapper_ldflags = -L/opt/slurm/current/lib
mpich2_additional_wrapper_libs = -lpmi

Note these last two fields ^^. You'll need to replace the -L value with
whatever is relevant for your cluster.

--
Jeff Squyres
Cisco Systems
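One way to confirm that a test executable really was linked against SLURM's libpmi is to inspect the output of `ldd` on it. The following is a minimal Python sketch under stated assumptions: libpmi is linked as a shared library (a static link would not show up in `ldd`), the helper names `linked_libraries` and `links_against` are hypothetical, and `SAMPLE_LDD_OUTPUT` is made-up text in the usual `ldd` format, not output captured from any real cluster.

```python
import re

# Made-up example of `ldd` output; paths and addresses are illustrative only.
SAMPLE_LDD_OUTPUT = """\
\tlibpmi.so.0 => /opt/slurm/current/lib/libpmi.so.0 (0x00002aaaaaccd000)
\tlibc.so.6 => /lib64/libc.so.6 (0x00002aaaab0d1000)
"""


def linked_libraries(ldd_output):
    """Map each library name in `ldd` text output to its resolved path.

    Libraries that `ldd` reports as "not found" map to None.
    """
    libs = {}
    for line in ldd_output.splitlines():
        # Typical line: "libfoo.so.1 => /path/to/libfoo.so.1 (0x...)"
        match = re.match(r"\s*(\S+)\s+=>\s+(not found|\S+)", line)
        if match:
            name, target = match.groups()
            libs[name] = None if target == "not found" else target
    return libs


def links_against(ldd_output, prefix="libpmi.so"):
    """Return True if any listed library name starts with `prefix`."""
    return any(name.startswith(prefix) for name in linked_libraries(ldd_output))
```

In practice the text would come from running `ldd` on the benchmark binary, for example via `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`. An entry that maps to `None` suggests the library was linked but cannot be found at run time, which is the case the `prepend_path = LD_LIBRARY_PATH ...` line in the template is meant to cover.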
Re: [MTT devel] Uh testbake runs
Yeah, I think those problems were from when I was running with the
already-installed mpich2. But doesn't mpich pick up slurm from the path
directly?

On Tue, October 2, 2007 1:09 pm, Jeff Squyres (jsquyres) wrote:
> I'm away from a computer right now so I don't have the specifics, but we
> saw some testbake results from UH today of mpich2 under slurm that were
> not run properly - it ran 16 copies of skampi instead of 1 16-node job, so
> the output was very skewed (and completely mis-parsed).
>
> Can you check your mpich2 compile / link settings to ensure that you're
> linking against the slurm pmi library properly?
>
> -jms

--
Mohamad Chaarawi
Instructional Assistant
http://www.cs.uh.edu/~mschaara
Department of Computer Science
University of Houston
4800 Calhoun, PGH Room 526
Houston, TX 77204, USA
[MTT devel] Uh testbake runs
I'm away from a computer right now so I don't have the specifics, but we
saw some testbake results from UH today of mpich2 under slurm that were
not run properly - it ran 16 copies of skampi instead of 1 16-node job,
so the output was very skewed (and completely mis-parsed).

Can you check your mpich2 compile / link settings to ensure that you're
linking against the slurm pmi library properly?

-jms