Dear all,

I am still having the problem that jobs requesting all nodes of my small cluster stay queued for an unpredictable amount of time, but in the meantime I have gathered some additional information:

pbsnodes reports that all nodes (7 in the following example) are free.

"diagnose -n" says that all nodes are Idle.

showq says that the job is in IDLE state.

The job can be started with qrun.

checkjob says "cannot select job 38 for partition DEFAULT (startdate in '00:00:01')".

This is what I see in the Maui log file:

INFO: 7 feasible tasks found for job 38:0 in partition DEFAULT (7 Needed)
ALERT: inadequate tasks to allocate to job 38:0 (1 < 7)
ERROR: cannot allocate nodes to job '38' in partition DEFAULT

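What does the second line (the ALERT) mean? The first line reports 7 feasible tasks for the 7 needed, yet the allocation step apparently ends up with only 1.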

Looking back through the mailing list archive, I found some other emails that seem to describe a similar problem, but I did not find any replies to them. Like these other users, I am using the CPULOAD node allocation policy.

Here is what I am wondering: Could it be that Maui uses not only the relative CPU load on the different nodes to decide which nodes to select, but also an absolute CPU load threshold, so that a node whose current load exceeds this threshold is not allocated at all? If so, what is this threshold, and can it be changed? (I have already tried setting the node availability policy to UTILIZED, which did not fix the problem.) The machines in my cluster are not pure TORQUE compute nodes, so other processes may be running on them and causing some CPU load. This is the main reason I selected a node allocation policy that prefers nodes with low CPU load for submitted TORQUE jobs.
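
For reference, the two policies mentioned above correspond to maui.cfg entries along the following lines. This is only a sketch of the two relevant parameters, not a copy of the complete configuration file, and the second line reflects the setting I tried rather than a recommended value:

NODEALLOCATIONPOLICY    CPULOAD
NODEAVAILABILITYPOLICY  UTILIZED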

Thanks for any tips you might have,

Jochen.

-------------------------

Jochen Ditterich, Ph.D.
Assistant Professor
Center for Neuroscience
University of California
1544 Newton Court
Davis, CA 95618
USA

office: +1 (530) 754-5084
lab:    +1 (530) 754-6987
fax:    +1 (530) 757-8827
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers