Try running "ldd your_mpi_program" on the compute node and make sure that
the SLURM PMI library is used.
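
A minimal sketch of that check, assuming the SLURM PMI library appears in
the ldd output under the name libpmi (verify the name and path against your
own install). The helper name uses_slurm_pmi is made up for illustration,
and your_mpi_program is a placeholder:

```shell
# Hypothetical helper: reads `ldd` output on stdin and succeeds only if a
# libpmi entry is present and resolved (i.e. not reported as "not found").
uses_slurm_pmi() {
    awk '/libpmi/ { seen = 1; if ($0 ~ /not found/) bad = 1 }
         END { exit ((seen && !bad) ? 0 : 1) }'
}

# Intended use on a compute node (not executed here):
#   ldd ./your_mpi_program | uses_slurm_pmi && echo "libpmi is linked"
```

If the helper succeeds, also look at the resolved path in the ldd output
and confirm it points into the SLURM lib directory.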

On Fri, 2012-06-29 at 15:13 -0600, Sarah Mulholland wrote:
> This is exactly what I see when I run the command below.  
> 
> I rebuilt mpich2 with slurm.  I had to set CFLAGS, CXXFLAGS and LIBS to 
> include -I/path/to/slurm/incl and -L/path/to/slurm/lib to get it to build 
> with --with-pmi=slurm --with-pm=no and --with-slurm=/path/to/slurm, since 
> the last option doesn't appear to have an effect.  
> 
> When I run the command below, I still see four unique SLURM_PROCIDs.  
> 
> I have slurm-2.3.5 and mpich2-1.4.1p1.  Does anybody else run with these two 
> packages and versions?  If so, would you mind sending me your configure 
> flags?  Any other suggestions would be appreciated.
> 
> Thanks,
> 
> Sarah
> 
> -----Original Message-----
> From: Moe Jette [mailto:[email protected]] 
> Sent: Friday, June 29, 2012 9:18 AM
> To: slurm-dev
> Subject: [slurm-dev] FW: slurm and mpich2
> 
> 
> SLURM's PMI library gets the task rank from the environment variable 
> SLURM_PROCID. I believe that you are not using that library. You can at least 
> confirm that the environment variable is set correctly by running something 
> like this:
> $ srun -n4 -l printenv SLURM_PROCID | sort
> 0: 0
> 1: 1
> 2: 2
> 3: 3
> 
> If you get output like above then the problem is definitely that of not using 
> SLURM's PMI library.
> 
> 
> Quoting Sarah Mulholland <[email protected]>:
> 
> > I should add that I have MpiDefault=none in my slurm configuration 
> > file as suggested by the slurm configuration tool.
> >
> > From: Sarah Mulholland
> > Sent: Thursday, June 28, 2012 3:59 PM
> > To: '[email protected]'
> > Subject: slurm and mpich2
> >
> > I have installed slurm-2.3.5 and mpich2-1.4.1p1.  We are using the 
> > hydra process manager for mpich.  As suggested on the ANL web site, I 
> > configured mpich2 with --with-hydra-bss=ssh,rsh,fork,slurm.  
> > Yet when I launch a process with srun all tasks are rank 0.
> >
> > I tried building mpich2 with slurm's native PMI library by configuring  
> > --with-pmi=slurm --with-pm=no --with-slurm=[/our/path/here], but 
> > autoconf didn't find slurm in the given location.
> >
> > Has anybody else experienced this?  Any suggestions?
> >
> 
