Hi,

I was running jobs on a single node and recently added two more. The first 
node can run only 4 jobs at a time, so I added two more nodes that can run 
8 jobs each, and I have confirmed that the new nodes are up and running. 
However, when I submit a job, it runs only on the old node; it runs on the 
new nodes only when I request three nodes with 'sbatch -N3 job.sh'. 
Previously I ran my jobs using 'sbatch -J jobname job.sh'. How can I tell 
sbatch to run a job on a particular node, or have the job sent automatically 
to a free node? Since we added the two new nodes, once the maximum number of 
jobs has been submitted to the old node, no new jobs are queued unless I 
specify that the job be shared across all three nodes. I have pasted some 
more info below:

sinfo shows me:

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up   infinite      3   idle bio-linux,bio-linuxnode2,node01


And scontrol show nodes:


[batsiraim@bio-linux ~]$ scontrol show nodes
NodeName=bio-linux Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=4 CPULoad=0.00 Features=(null)
   Gres=(null)
   NodeAddr=xxxx1 NodeHostName=bio-linux Version=15.08
   OS=Linux RealMemory=28000 AllocMem=0 FreeMem=5083 Sockets=1 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A
   BootTime=2017-02-01T05:36:22 SlurmdStartTime=2017-06-18T08:24:56
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s




NodeName=bio-linuxnode2 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.00 Features=(null)
   Gres=(null)
   NodeAddr=xxxx2 NodeHostName=bio-linuxnode2 Version=15.08
   OS=Linux RealMemory=3800 AllocMem=0 FreeMem=2278 Sockets=1 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A
   BootTime=2017-05-10T04:59:26 SlurmdStartTime=2017-06-18T08:25:50
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s




NodeName=node01 Arch=x86_64 CoresPerSocket=4
   CPUAlloc=0 CPUErr=0 CPUTot=8 CPULoad=0.01 Features=(null)
   Gres=(null)
   NodeAddr=xxxxx3 NodeHostName=node01 Version=15.08
   OS=Linux RealMemory=3800 AllocMem=0 FreeMem=827 Sockets=1 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A
   BootTime=2017-05-24T04:51:53 SlurmdStartTime=2017-06-18T08:25:38
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s



Thanks in advance.

Regards,

Batsirai

