Hi All,

We recently upgraded our storage system with more network cards so that all the nodes in the cluster can see it. Since then, we have tried running the command:

sbatch -n <no_of_cores> -C[rack1|rack2|rack3] -c<no_of_cores> <script>
However, while that job is queued, other jobs that I submit (to other nodes and features) with a normal sbatch command without a constraint are queued as well, even though there are free resources. When I cancel the job submitted with the -C option, the other jobs are scheduled and executed properly. The problem only occurs when the job with -C is submitted first.

Why is this happening and how can I resolve it?

Part of slurm.conf:

Slurm Configuration:

# SCHEDULING
FastSchedule=0
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
#
#
# JOB PRIORITY
PriorityType=priority/basic
#
# JOB PREEMPTION (optional)
PreemptMode=requeue
PreemptType=preempt/partition_prio

*Node Configuration:*

#rack1
NodeName=hnmp[106-164] NodeAddr=X.X.X.[106-164] Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=1 State=UNKNOWN Feature="par,rack1"
NodeName=hnmp[101-105] NodeAddr=X.X.X.[101-105] Sockets=2 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=1 State=UNKNOWN Feature="pls"
#
#rack2
NodeName=hnmp[27-80] NodeAddr=X.X.X.[27-80] Procs=12 RealMemory=1 State=UNKNOWN Feature="par,rack2"
#
#rack3
NodeName=hnmp[5001-5056] NodeAddr=X.X.X.[1-56] Procs=16 RealMemory=1 State=UNKNOWN Feature="par,rack3"
#rack4
NodeName=hnmp[5057-5100] NodeAddr=X.X.X.[57-100] Procs=16 RealMemory=1 State=UNKNOWN Feature="par,rack4"

*Partitions Properties:*

#Partitions
#
# priority partitions
#
PartitionName=low Nodes=hnmp[106-164] Default=NO MaxTime=INFINITE State=UP Shared=NO Priority=10 PreemptMode=requeue
PartitionName=hi Nodes=hnmp[106-164] Default=NO MaxTime=INFINITE State=UP Shared=NO Priority=30 PreemptMode=off
PartitionName=med Nodes=hnmp[106-164] Default=NO MaxTime=INFINITE State=UP Shared=NO Priority=20 PreemptMode=off
# lsdyna partition
PartitionName=lsall Nodes=hnmp[05-07,09-16] Default=NO MaxTime=INFINITE State=UP Shared=NO Priority=10 PreemptMode=off
# Default partition
#
PartitionName=hnm Nodes=hnmp[01-16,18-26,101-164,27-80,165-176,181-196,5001-5100] Default=YES MaxTime=INFINITE State=UP Shared=NO Priority=20 PreemptMode=off
# Backfill partition
#
PartitionName=hpc Nodes=hnmp[101-164,27-80,5001-5100] MaxTime=7-0 State=UP Shared=NO Priority=20 PreemptMode=off
