Dear All,

Any tips on troubleshooting jobs that stay pending even though I have plenty of 
idle nodes? I have looked at the FAQ and made sure nothing obvious is wrong.

I have PriorityType=priority/multifactor and SchedulerType=sched/backfill set.

I can successfully run jobs in a partition just by running srun 
/bin/hostname, and I can use up the entire queue that way without any issues.
But when I have sbatch jobs queued up, they stay pending indefinitely. I 
bumped up their priority and saw no change.
The reason for waiting in the queue was originally Priority; after I bumped 
the priority up, the reason became None.

I have about 100,000 jobs waiting in the queue, so debugging is becoming a 
little painful and the debug output is very chatty.
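
One thing I could do to cut the noise is tally the pending reasons across the 
whole queue instead of reading per-job output. A minimal sketch, assuming 
squeue's -h (no header), -t PD (pending jobs only) and -o "%r" (reason field) 
options behave as documented; the helper name tally is only illustrative:

```shell
#!/bin/sh
# Tally identical lines read from stdin, most frequent first.
# Output format: "<count> <value>" per line.
tally() {
    sort | uniq -c | sort -rn | sed 's/^[[:space:]]*//'
}

# Only query the scheduler when squeue is actually on PATH.
if command -v squeue >/dev/null 2>&1; then
    # -h: suppress header, -t PD: pending jobs, -o "%r": print only the reason
    squeue -h -t PD -o "%r" | tally
fi
```

With the reasons grouped like this, looking at one representative job per 
reason (for example with scontrol show job <jobid> or sprio -j <jobid>) should 
be more manageable than tracing all of them.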

Any hints or options for debugging this would be very helpful.
Please advise.

Thank you,
Amit

