Chris, can the user start an 'srun' session?

________________________________________
From: Chris Woelkers - NOAA Affiliate [chris.woelk...@noaa.gov]
Sent: 03 April 2017 20:31
To: slurm-dev
Subject: [slurm-dev] Fwd: job requeued in held state
I am running a small HPC cluster, only 24 nodes, via Slurm and am having an issue where one of the users is unable to run any jobs. The user is new, and whenever a job is submitted it shows "job requeued in held state" and never actually runs. We have left the job sitting for over three days and it does not start. We have tried releasing the job and it still does not start.

Here are the log entries after an attempted release:

[2017-04-03T19:16:24.173] sched: update_job: releasing hold for job_id 1938 uid 0
[2017-04-03T19:16:24.174] _slurm_rpc_update_job complete JobId=1938 uid=0 usec=375
[2017-04-03T19:16:24.919] sched: Allocate JobId=1938 NodeList=rhinonode[07-14] #CPUs=192
[2017-04-03T19:16:25.017] _slurm_rpc_requeue: Processing RPC: REQUEST_JOB_REQUEUE from uid=0
[2017-04-03T19:16:25.035] Requeuing JobID=1938 State=0x0 NodeCnt=0

The user has the same permissions as the older users who can run jobs. The script being run is a simple test script, and no matter where the output is redirected (an NFS mount for our SAN, the local home directory, or the tmp directory), the result is the same.

Any idea as to what might be happening?

Thanks,

Chris Woelkers
Caelum Research Corp.
Linux Server and Network Administrator
NOAA GLERL
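As a reading aid for the log excerpt quoted above, here is a minimal Python sketch that parses the five entries and measures how long the job stayed allocated before the requeue request arrived. The log text is copied from the message; the helper names (`parse`, `first_time`) are made up for illustration and are not part of Slurm. The short gap would be consistent with the job being bounced at launch rather than failing inside the script, but the log alone does not say why.

```python
import re
from datetime import datetime

# Log lines copied verbatim from the message above.
LOG = """\
[2017-04-03T19:16:24.173] sched: update_job: releasing hold for job_id 1938 uid 0
[2017-04-03T19:16:24.174] _slurm_rpc_update_job complete JobId=1938 uid=0 usec=375
[2017-04-03T19:16:24.919] sched: Allocate JobId=1938 NodeList=rhinonode[07-14] #CPUs=192
[2017-04-03T19:16:25.017] _slurm_rpc_requeue: Processing RPC: REQUEST_JOB_REQUEUE from uid=0
[2017-04-03T19:16:25.035] Requeuing JobID=1938 State=0x0 NodeCnt=0
"""

# Each entry is "[timestamp] message".
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")


def parse(log):
    """Return a list of (datetime, message) tuples, one per log line."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
            events.append((ts, m.group("msg")))
    return events


def first_time(events, needle):
    """Timestamp of the first message containing `needle`, or None."""
    for ts, msg in events:
        if needle in msg:
            return ts
    return None


events = parse(LOG)
allocated = first_time(events, "Allocate JobId=")
requeue = first_time(events, "REQUEST_JOB_REQUEUE")
gap_ms = round((requeue - allocated).total_seconds() * 1000)
print(f"allocation -> requeue request: {gap_ms} ms")
```

The same approach could be pointed at a longer slurmctld log to see whether every release of the job is followed by a requeue this quickly.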