We've been trying out the use of gres to control access to our GPU. It works fine for a batch submission but when submitting via srun to get an interactive session we get the following error:

paciorek@machine:~/> srun --gres=gpu:1 /bin/bash
srun: error: gres_plugin_job_state_unpack: no plugin configured to unpack data type 7696487 from job 10884
srun: gres_plugin_step_state_unpack: no plugin configured to unpack data type 7696487 from step 10884.0
srun: error: Task launch for 10884.0 failed on node scf-sm20: Invalid job credential
srun: error: Application launch failed: Invalid job credential
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete

We're running on a set of Ubuntu 14.04 machines with SLURM v. 2.6.5 (i.e., the slurm-llnl 2.6.5-1 Ubuntu package, which is the latest for 14.04).

We set up gres in the way suggested in the SLURM documentation. Here are the relevant lines from slurm.conf:

GresTypes=gpu
NodeName=our_gpu_nodename CPUs=24 SocketsPerBoard=2 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=128908 TmpDisk=469325 Gres=gpu:1

Any ideas?

Thanks,
Chris

----------------------------------------------------------------------------------------------
Chris Paciorek
Statistical Computing Consultant
Statistical Computing Facility, Econometrics Laboratory, Berkeley Research Computing

Office: 495 Evans Hall
Email: [email protected]
Voice: 510-842-6670
Fax: 510-642-7892
Skype: cjpaciorek
WWW: www.stat.berkeley.edu/~paciorek
Permanent forward: [email protected]

Mailing Address:
Department of Statistics
367 Evans Hall
University of California, Berkeley
Berkeley, CA 94720 USA
