Re: [slurm-users] ssh-keys on compute nodes?

2020-06-08 Thread Ole Holm Nielsen
On 08-06-2020 18:07, Jeffrey T Frey wrote: There's a Slurm PAM module you can use to gate ssh access -- basically it checks to see if the user has a job running on the node and moves any ssh sessions to the first cgroup associated with that user on that node. If you don't use cgroup resource
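The module described above appears to be pam_slurm_adopt. That is an assumption, since the message as archived does not name it. It is normally enabled with an "account" line in the sshd PAM stack. As a minimal sketch, the helper below (has_slurm_adopt, a hypothetical name) checks whether a PAM file contains such an uncommented line. The path /etc/pam.d/sshd is also an assumption and varies by distribution.

```shell
#!/bin/sh
# Return 0 if the given PAM file has an uncommented "account" line that
# references pam_slurm_adopt.so, for example:
#   account    required    pam_slurm_adopt.so
# has_slurm_adopt is a hypothetical helper name used only for illustration.
has_slurm_adopt() {
    grep -Eq '^[[:space:]]*-?account[[:space:]]+[^#]*pam_slurm_adopt\.so' "$1"
}

# /etc/pam.d/sshd is an assumed location; adjust for your distribution.
if has_slurm_adopt /etc/pam.d/sshd 2>/dev/null; then
    echo "pam_slurm_adopt is referenced in the sshd PAM stack"
else
    echo "pam_slurm_adopt is not referenced (or the file is missing)"
fi
```

This only inspects the PAM file. It does not verify the Slurm side of the setup, which has its own requirements in slurm.conf.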

Re: [slurm-users] Job failure issue in Slurm

2020-06-08 Thread navin srivastava
Thanks, Sathish. All other jobs are running fine across the cluster, so I don't think it is related to any PAM module issue. I am investigating the issue further. I will come back to you with more details. Regards, Navin On Mon, Jun 8, 2020, 19:24 sathish wrote: > Hi Navin, > > Was this working

Re: [slurm-users] ssh-keys on compute nodes?

2020-06-08 Thread Durai Arasan
Hi Jeffrey, Thanks for the clarification. But this is concerning, as the users will be able to ssh into any node. How do you prevent that? Best, Durai On Mon, Jun 8, 2020 at 5:55 PM Jeffrey T Frey wrote: > User home directories are on a shared (NFS) filesystem that's mounted on > every node.

Re: [slurm-users] ssh-keys on compute nodes?

2020-06-08 Thread Durai Arasan
Ok, that was useful information. So when you provision user accounts, you add the public key to .ssh/authorized_keys of *all* nodes on the cluster? Not just the login nodes? > When we provision user accounts on our Slurm cluster we still add .ssh, > .ssh/id_rsa (needed for older X11

Re: [slurm-users] ssh-keys on compute nodes?

2020-06-08 Thread gilles
Durai, A high quality MPI implementation uses SLURM (e.g. srun) to spawn MPI tasks/daemons, and hence does not require (passwordless) SSH between nodes. Cheers, Gilles - Original Message - Hi, we are setting up a slurm cluster and are at the stage of adding ssh keys of the

Re: [slurm-users] ssh-keys on compute nodes?

2020-06-08 Thread Jeffrey T Frey
An MPI library with tight integration with Slurm (e.g. Intel MPI, Open MPI) can use "srun" to start the remote workers. In some cases "srun" can be used directly for MPI startup (e.g. "srun" instead of "mpirun"). Other/older MPI libraries that start remote processes using "ssh" would,
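As a rough sketch of the "srun instead of mpirun" case mentioned above, the function below prints a minimal batch script that launches an MPI program with srun. The function name make_mpi_job, the binary ./my_mpi_app, and the task count are all hypothetical. Whether srun can start the ranks directly depends on how the MPI library was built against Slurm.

```shell
#!/bin/sh
# Print a minimal batch script that starts an MPI program with srun
# rather than mpirun. make_mpi_job is a hypothetical helper name.
#   $1 = path to the MPI binary (hypothetical example: ./my_mpi_app)
#   $2 = number of tasks to request
make_mpi_job() {
    app="$1"
    ntasks="$2"
    cat <<EOF
#!/bin/bash
#SBATCH --ntasks=${ntasks}
srun ${app}
EOF
}

# Show the generated script for a 4-task run of a hypothetical binary.
make_mpi_job ./my_mpi_app 4
```

On a cluster, the output could be piped to sbatch (make_mpi_job ./my_mpi_app 4 | sbatch), since sbatch reads the script from standard input when no file is given.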

[slurm-users] ssh-keys on compute nodes?

2020-06-08 Thread Durai Arasan
Hi, we are setting up a Slurm cluster and are at the stage of adding the users' ssh keys to the nodes. I thought it would be sufficient to add the users' ssh keys to only the designated login nodes. But I heard that it is also necessary to add them to the compute nodes for slurm

Re: [slurm-users] How to view GPU indices of the completed jobs?

2020-06-08 Thread sathish
You can find that information using sacct; try the options below and see if that works. sacct -j --format=jobid,ReqTRES%50,ReqGres On Thu, Jun 4, 2020 at 1:30 PM Kota Tsuyuzaki < kota.tsuyuzaki...@hco.ntt.co.jp> wrote: > Hello Guys, > > We are running GPU clusters with Slurm and SlurmDBD
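The sacct command in the reply above seems to have lost its job ID argument in the archived copy, since -j expects one. As a small sketch, the function below prints the same invocation with a job ID filled in. The name sacct_gpu_cmd and the job ID 12345 are hypothetical placeholders. The format fields are the ones named in the reply.

```shell
#!/bin/sh
# Print the sacct invocation from the reply for a single job ID.
# sacct_gpu_cmd is a hypothetical helper; the job ID is supplied by the caller
# because the archived message does not include one.
sacct_gpu_cmd() {
    # %% is a literal percent sign, so the field width "%50" is preserved.
    printf 'sacct -j %s --format=jobid,ReqTRES%%50,ReqGres\n' "$1"
}

# 12345 is a hypothetical job ID used only for illustration.
sacct_gpu_cmd 12345
```

On a host with Slurm accounting available, the printed line can be run as is. This shows only the requested-resource fields mentioned in the reply.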

Re: [slurm-users] Job failure issue in Slurm

2020-06-08 Thread sathish
Hi Navin, Was this working earlier, or is this the first time you are trying it? Are you using a PAM module? If yes, try disabling the PAM module and see if it works. Thanks, Sathish On Thu, Jun 4, 2020 at 10:47 PM navin srivastava wrote: > Hi Team, > > I am seeing a weird issue in my environment.

Re: [slurm-users] Intermittent problem at 32 CPUs

2020-06-08 Thread Diego Zuccato
On 07/06/20 09:44, Diego Zuccato wrote: >> I'm *guessing* that you are tripping over the use of "--tasks 32" on a >> heterogeneous cluster, > If you mean that using "--tasks 32" trips the use of a second node, then > no. The node does have two AMD Opteron 6274. [...] > I've had a similar