We've been trying out gres to control access to our GPU. It works
fine for batch submission, but when we submit via srun to get an
interactive session, we get the following error:

paciorek@machine:~/> srun --gres=gpu:1 /bin/bash
srun: error: gres_plugin_job_state_unpack: no plugin configured to
unpack data type 7696487 from job 10884
srun: gres_plugin_step_state_unpack: no plugin configured to unpack
data type 7696487 from step 10884.0
srun: error: Task launch for 10884.0 failed on node scf-sm20: Invalid
job credential
srun: error: Application launch failed: Invalid job credential
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete

We're running on a set of Ubuntu 14.04 machines, with SLURM v. 2.6.5
(i.e., the slurm-llnl 2.6.5-1 Ubuntu package, which is the latest for
14.04).

We set up gres in the way suggested in the SLURM documentation (here
are the relevant lines from slurm.conf):
GresTypes=gpu
NodeName=our_gpu_nodename CPUs=24 SocketsPerBoard=2 CoresPerSocket=6
ThreadsPerCore=2 RealMemory=128908 TmpDisk=469325 Gres=gpu:1
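
Since the error mentions that no plugin is configured to unpack the gres
data, one thing that may be worth checking is whether the same GresTypes
value is visible in slurm.conf on both the submit host and the GPU node.
The following is only a minimal sketch of such a check, not a confirmed
diagnosis. The helper name gres_types is hypothetical, and the path
/etc/slurm-llnl/slurm.conf is an assumption based on the Ubuntu package
layout; adjust it if your configuration lives elsewhere.

```shell
# Print the value of GresTypes from a slurm.conf file, or nothing if unset.
# gres_types is a hypothetical helper name, not part of SLURM.
# Usage: gres_types /path/to/slurm.conf
gres_types() {
    # Keep only lines that set GresTypes (commented-out lines start with
    # '#' and so do not match), take the last such line, strip any
    # trailing comment, and print the value without whitespace.
    grep -i '^[[:space:]]*GresTypes[[:space:]]*=' "$1" \
        | tail -n 1 \
        | sed 's/#.*//' \
        | cut -d= -f2 \
        | tr -d '[:space:]'
}
```

Running `gres_types /etc/slurm-llnl/slurm.conf` on the machine where srun
is invoked and again on the GPU node should print the same value (gpu in
the configuration above) if both hosts see the same setting.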

Any ideas?

Thanks,
Chris

----------------------------------------------------------------------------------------------
Chris Paciorek

Statistical Computing Consultant
Statistical Computing Facility, Econometrics Laboratory, Berkeley
Research Computing

Office: 495 Evans Hall                      Email: [email protected]
Mailing Address:                            Voice: 510-842-6670
Department of Statistics                    Fax:   510-642-7892
367 Evans Hall                              Skype: cjpaciorek
University of California, Berkeley          WWW: www.stat.berkeley.edu/~paciorek
Berkeley, CA 94720 USA                      Permanent forward: [email protected]
