[slurm-dev] strange srun problem

2017-01-23 Thread liu junjun
Hi all, I have a small MPI test program that just prints the rank ID of a parallel job. The output looks like this: >mpirun -n 2 ./mpitest Hello world: rank 0 of 2 running on cddlogin Hello world: rank 1 of 2 running on cddlogin I ran this test program with salloc. It produces similar output: >salloc

[slurm-dev] Re: strange srun problem

2017-01-23 Thread Paddy Doyle
Hi Junjun, On Mon, Jan 23, 2017 at 12:04:17AM -0800, liu junjun wrote: > Hi all, > > I have a small MPI test program that just prints the rank ID of a parallel job. > The output looks like this: > >mpirun -n 2 ./mpitest > Hello world: rank 0 of 2 running on cddlogin > Hello world: rank 1 of 2 running

[slurm-dev] Re: Power outage causes wrong reports

2017-01-23 Thread Paddy Doyle
Hi Lucas, This old thread might help: https://groups.google.com/forum/#!topic/slurm-devel/TQcerLLEKAU Paddy On Fri, Jan 20, 2017 at 10:00:00AM -0800, Lucas Vuotto wrote: > > Hi all, > sreport was showing that a user was using more CPU hours per week > than available. After checking the

[slurm-dev] Re: strange srun problem

2017-01-23 Thread liu junjun
Hi Paddy, Thanks a lot for your kind help! Replacing mpirun with srun still does not seem to work. Here is what I did: >cat a.sh #!/bin/bash srun ./mpitest >sbatch -n 2 ./a.sh Submitted batch job 3611 >cat slurm-3611.out Hello world: rank 0 of 1 running on cdd001 Hello world: rank 0 of 1 running on
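When every process reports "rank 0 of 1", as in the output above, that typically suggests each task started as an independent singleton rather than joining one MPI job. A minimal Python sketch for spotting that pattern in a job output file follows. The function name `looks_like_singletons`, the regular expression, and the sample host names are illustrative assumptions based only on the "Hello world" line format quoted in this thread; they are not part of Slurm or Open MPI.

```python
import re

# Matches lines such as "Hello world: rank 0 of 1 running on cdd001".
# The pattern is an assumption derived from the output quoted in this thread.
LINE_RE = re.compile(r"rank (\d+) of (\d+) running on (\S+)")


def looks_like_singletons(output: str) -> bool:
    """Return True if several processes each report a world size of 1."""
    sizes = [int(m.group(2)) for m in LINE_RE.finditer(output)]
    return len(sizes) > 1 and all(size == 1 for size in sizes)


if __name__ == "__main__":
    # Hypothetical sample text modelled on the job output; the host names
    # are placeholders, not values taken from a real run.
    sample = (
        "Hello world: rank 0 of 1 running on example-node\n"
        "Hello world: rank 0 of 1 running on example-node\n"
    )
    print(looks_like_singletons(sample))
```

A check like this only flags the symptom; it does not say which launcher or PMI setting caused it.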

[slurm-dev] Re: strange srun problem

2017-01-23 Thread Alexandre Strube
Hi, when I try the same with mpirun, I get this: -- An ORTE daemon has unexpectedly failed after launch and before communicating back to mpirun. This could be caused by a number of factors, including an inability to create a

[slurm-dev] RE: Partition QOS ignored

2017-01-23 Thread SLIM, HENK A.
I found a similar problem with 15.08.12. Adding --qos=batch in your case would set the QOS to batch but that's not what is desired. This thread claims it should work https://groups.google.com/forum/#!topic/slurm-devel/Zv1giZGtP-0 but I must be missing something. Best regards Henk >

[slurm-dev] Re: strange srun problem

2017-01-23 Thread TO_Webmaster
Is this OpenMPI? We experienced similar behaviour with OpenMPI. This was fixed after recompiling OpenMPI with PMI, i.e. ./configure [...] --with-pmi=/path/to/slurm [...] 2017-01-23 14:22 GMT+01:00 liu junjun : > Hi Paddy, > > Thanks a lot for your kind help! > > Replacing

[slurm-dev] Re: New User Creation Issue

2017-01-23 Thread Lachlan Musicman
Interesting. To the best of my knowledge, if you are using Accounting, all users actually need to be in an association, i.e. having a user account is insufficient. An Association is a tuple consisting of: cluster, user, account and (optional) partition. Is that the problem? cheers L. -- The
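The association tuple described above can be sketched as a small Python data structure. This is only an illustration of the concept: the class `Association`, the function `has_association`, and the sample values are hypothetical and do not reflect how Slurm stores or checks associations internally.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Association:
    """Illustrative model of the tuple: cluster, user, account, optional partition."""
    cluster: str
    user: str
    account: str
    partition: Optional[str] = None  # None stands for "no partition given"


def has_association(
    assocs: Iterable[Association], cluster: str, user: str, account: str
) -> bool:
    """Return True if some association matches the cluster, user and account."""
    return any(
        a.cluster == cluster and a.user == user and a.account == account
        for a in assocs
    )


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    known = [Association(cluster="mycluster", user="alice", account="research")]
    print(has_association(known, "mycluster", "alice", "research"))
    print(has_association(known, "mycluster", "bob", "research"))
```

The point of the sketch is that a user name alone is not enough: the lookup only succeeds when cluster, user and account all match an existing entry.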

[slurm-dev] Re: New User Creation Issue

2017-01-23 Thread Lachlan Musicman
Do your entries for SlurmUser match in slurm.conf and slurmdbd.conf? (I don't know the cause of your problem, but that's what I'd look at next). Also, does the AccountingStorageUser in slurm.conf match the user in slurmdbd.conf? cheers L. -- The most dangerous phrase in the language is,

[slurm-dev] Re: strange srun problem

2017-01-23 Thread r...@open-mpi.org
Note that 16.05 contains support for PMIx, so if you are using OMPI 2.0 or above, you should ensure that the Slurm PMIx support is configured “on” and use that for srun (I believe you have to tell srun the pmi version to use, so perhaps “srun --mpi=pmix”?) > On Jan 23, 2017, at 7:10 AM,

[slurm-dev] New User Creation Issue

2017-01-23 Thread Katsnelson, Joe
Hi, I’m having an issue creating new users on our cluster. After running the below commands slurm has to be restarted in order for that user to be able to run sbatch. Otherwise they get an error ( sbatch test-job.sh sbatch: error: Batch job submission failed: Invalid account or

[slurm-dev] Re: New User Creation Issue

2017-01-23 Thread Katsnelson, Joe
We are planning to upgrade to a newer version of Bright, which comes with Slurm 16.05.8. We are currently running Slurm 14.11.11, so hopefully the newer version will address this issue, as someone mentioned. (I checked the configs; the users match in both.) The one thing that Bright support

[slurm-dev] Re: New User Creation Issue

2017-01-23 Thread Katsnelson, Joe
So I’m associating the user with an account, in this case called isg, and after I restart slurm they can submit jobs. Below is how the user looks even before I restart slurm: sacctmgr show assoc format=account%20,user,cluster,partition where user=testuser Account User Cluster