Hi all,
I have a small MPI test program that just prints the rank ID of a parallel job.
The output is like this:
>mpirun -n 2 ./mpitest
Hello world: rank 0 of 2 running on cddlogin
Hello world: rank 1 of 2 running on cddlogin
I ran this test program with salloc. It produces similar output:
>salloc
Hi Junjun,
On Mon, Jan 23, 2017 at 12:04:17AM -0800, liu junjun wrote:
> Hi all,
>
> I have a small MPI test program that just prints the rank ID of a parallel job.
> The output is like this:
> >mpirun -n 2 ./mpitest
> Hello world: rank 0 of 2 running on cddlogin
> Hello world: rank 1 of 2 running
Hi Lucas,
This old thread might help:
https://groups.google.com/forum/#!topic/slurm-devel/TQcerLLEKAU
Paddy
On Fri, Jan 20, 2017 at 10:00:00AM -0800, Lucas Vuotto wrote:
>
> Hi all,
> sreport was showing that a user was using more CPU hours per week
> than available. After checking the
Hi Paddy,
Thanks a lot for your kind help!
Replacing mpirun with srun still does not seem to work. Here is what I did:
>cat a.sh
#!/bin/bash
srun ./mpitest
>sbatch -n 2 ./a.sh
Submitted batch job 3611
>cat slurm-3611.out
Hello world: rank 0 of 1 running on cdd001
Hello world: rank 0 of 1 running on
Hi,
when I try the same with mpirun, I get this:
--
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a
I found a similar problem with 15.08.12. Adding --qos=batch in your case would
set the QOS to batch but that's not what is desired.
This thread claims it should work
https://groups.google.com/forum/#!topic/slurm-devel/Zv1giZGtP-0
but I must be missing something.
Best regards
Henk
>
Is this OpenMPI? We experienced similar behaviour with OpenMPI. This
was fixed after recompiling OpenMPI with PMI, i.e.
./configure [...] --with-pmi=/path/to/slurm [...]
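
After a rebuild like this, one way to check whether PMI support is visible is to look at what Open MPI and Slurm each report. The following is only a sketch: `check_pmi` is a made-up helper name, and the component names that `ompi_info` prints vary between Open MPI versions.

```shell
#!/bin/bash
# Sketch: report whether PMI support appears to be present.
# check_pmi is a hypothetical helper name, not part of Slurm or Open MPI.
check_pmi() {
    if command -v ompi_info >/dev/null 2>&1; then
        # ompi_info lists the components Open MPI was built with.
        ompi_info | grep -i pmi || echo "ompi_info: no PMI components listed"
    else
        echo "ompi_info: not found in PATH"
    fi
    if command -v srun >/dev/null 2>&1; then
        # Lists the MPI plugin types this Slurm installation offers.
        srun --mpi=list
    else
        echo "srun: not found in PATH"
    fi
    return 0
}

check_pmi
```

If neither command reports anything PMI-related, the "rank 0 of 1" symptom described in this thread would be consistent with srun starting each task as an independent singleton.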
2017-01-23 14:22 GMT+01:00 liu junjun :
> Hi Paddy,
>
> Thanks a lot for your kind help!
>
> Replacing
Interesting. To the best of my knowledge, if you are using accounting, all
users actually need to be in an association; i.e., having a user account is
insufficient.
An Association is a tuple consisting of: cluster, user, account and
(optional) partition.
Is that the problem?
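
To make the tuple concrete, here is a rough sketch that only prints (does not run) a sacctmgr command creating such an association. `print_add_assoc` is a made-up helper name, and the cluster, user and account values are placeholders rather than anything from this thread.

```shell
#!/bin/bash
# Sketch: print (not execute) a sacctmgr command for one association.
# print_add_assoc is a hypothetical helper name.
print_add_assoc() {
    local cluster="$1" user="$2" account="$3" partition="${4:-}"
    local cmd="sacctmgr add user name=${user} account=${account} cluster=${cluster}"
    # The partition is the optional fourth element of the tuple.
    if [ -n "$partition" ]; then
        cmd="${cmd} partition=${partition}"
    fi
    echo "$cmd"
}

# Placeholder values, assumed for illustration only.
print_add_assoc mycluster alice physics
```

Printing the command first makes it easy to review before running it by hand on the accounting host.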
cheers
L.
--
The
Do your entries for SlurmUser match in slurm.conf and slurmdbd.conf? (I
don't know the cause of your problem, but that's what I'd look at next).
Also, does the AccountingStorageUser in slurm.conf match the user in
slurmdbd.conf?
cheers
L.
--
The most dangerous phrase in the language is,
Note that 16.05 contains support for PMIx, so if you are using OMPI 2.0 or
above, you should ensure that the Slurm PMIx support is configured “on” and use
that for srun (I believe you have to tell srun the PMI version to use, so
perhaps “srun --mpi=pmix”?)
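
As a rough sketch of what that could look like in the batch script from earlier in the thread: the helper below prints a two-task job script that passes the PMI type to srun. `print_job_script` is a made-up name, and whether `pmix` is actually available depends on how Slurm was built on the cluster.

```shell
#!/bin/bash
# Sketch: print a batch script that asks srun for a specific PMI type.
# print_job_script is a hypothetical helper; pmix is assumed to be available.
print_job_script() {
    local mpi_type="${1:-pmix}"
    printf '%s\n' \
        '#!/bin/bash' \
        '#SBATCH -n 2' \
        "srun --mpi=${mpi_type} ./mpitest"
}

# Redirect to a file (for example a.sh) and submit it with sbatch.
print_job_script pmix
```

If the site build only offers an older interface, passing a different type such as `pmi2` as the argument would be the equivalent change.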
> On Jan 23, 2017, at 7:10 AM,
Hi,
I’m having an issue creating new users on our cluster. After running the
commands below, Slurm has to be restarted in order for that user to be able to
run sbatch. Otherwise they get an error ( sbatch test-job.sh
sbatch: error: Batch job submission failed: Invalid account or
We are planning to upgrade to a newer version of Bright, which comes with Slurm
16.05.8. We are currently running Slurm 14.11.11, so hopefully the newer
version will address this issue, as someone mentioned. (I checked the configs; the
users match in both.)
The one thing that bright support
So I’m associating the user with an account, in this case called isg, and after I
restart Slurm they can submit jobs. Below is how the user looks even before I
restart Slurm:
sacctmgr show assoc format=account%20,user,cluster,partition where user=testuser
Account User Cluster