Dear Maui users,

Is there a chance that this Maui bug will be resolved soon in a future patch? Is there a better workaround than setting a very high default server walltime and hoping that no user exceeds it?

Re-posting from the torqueusers list.

Thanks.

[EMAIL PROTECTED] wrote:
On Tue, Jun 06, 2006 at 12:00:49PM -0400, Neelesh Arora alleged:

[EMAIL PROTECTED] wrote:

On Fri, Jun 02, 2006 at 03:55:29PM -0400, Neelesh Arora alleged:


Hi,

We have a setup based on torque-2.0.0p2 and maui-3.2.6p13. There are 2 execution queues, distinguished by different cpu-time limits, and one route queue that routes jobs based on the requested cpu-time.

The queue definitions have appropriate resources_max.cput and resources_min.cput declarations, and users are required to pass the -l cput=<time> option to qsub.
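
For readers unfamiliar with this kind of layout, here is a rough sketch of what such a setup might look like in qmgr. It is not the poster's actual configuration: the queue names "short" and "route" and all of the time limits are made up for illustration (only the "medium" queue is named in the thread).

```shell
# Hypothetical queue names and limits, for illustration only.
# Execution queue for small cpu-time requests.
qmgr -c "create queue short queue_type=execution"
qmgr -c "set queue short resources_max.cput = 00:59:59"
qmgr -c "set queue short enabled = true"
qmgr -c "set queue short started = true"

# Execution queue for larger cpu-time requests.
qmgr -c "create queue medium queue_type=execution"
qmgr -c "set queue medium resources_min.cput = 01:00:00"
qmgr -c "set queue medium resources_max.cput = 24:00:00"
qmgr -c "set queue medium enabled = true"
qmgr -c "set queue medium started = true"

# Route queue: jobs land in the first destination whose limits they satisfy.
qmgr -c "create queue route queue_type=route"
qmgr -c "set queue route route_destinations = short"
qmgr -c "set queue route route_destinations += medium"
qmgr -c "set queue route enabled = true"
qmgr -c "set queue route started = true"
qmgr -c "set server default_queue = route"
```

Note that nothing in this sketch mentions walltime, which matches the situation described in this message.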

The jobs are routed to the right queues, based on the cput value. But then Torque/Maui seems to enforce walltime instead: a job is killed if resources_used.walltime exceeds Resource_List.cput!

For example, I submit a job with qsub -l cput=1:0:0 pbs-script. If the job takes more than 1 hour of wallclock time, it is killed with a "MOAB_INFO: job exceeded wallclock limit" message, even though its cpu-time usage is still below 1 hour.

We have not specified any walltime parameters in queue/server definitions or during job submission.

While the job is running, qstat reports both cputime and walltime usage, whereas checkjob reports only walltime usage:
_________________
qstat:
Job Id: 37414
 resources_used.cput = 00:00:00
 resources_used.mem = 2568kb
 resources_used.vmem = 9264kb
 resources_used.walltime = 00:37:10

checkjob:
checking job 37414
State: Running
Creds:  user:narora  group:staff  class:medium  qos:DEFAULT
WallTime: 00:37:48 of 1:00:00
_________________
Clearly, Maui has set the maximum allowed walltime to the cput value I specified to qsub!

Can someone please suggest what's going wrong here?


Well that's an annoying maui bug!

Double check your maui config and disable the resource enforcement.


Our maui config does not explicitly define any resource limits: the cpu-time limits are set in the PBS queue config. Are you perhaps referring to some default maui parameters? The only ones I could find in the docs are RESOURCELIMITPOLICY and WCVIOLATIONACTION, and neither has a straightforward 'disable' option.

Can you please elaborate on what you mean by disabling the resource enforcement?


Actually, I might have a better solution.  Set a default walltime at the
server level and maui won't use cput as a "guess" at a walltime.  You
can use a really long walltime so as not to interfere with your users'
current habits.
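
A minimal sketch of that workaround, assuming the standard Torque resources_default.walltime server attribute. The 720-hour value is an arbitrary placeholder, so pick something longer than any job you expect to run. The second command is a per-job alternative, on the assumption that an explicit walltime request likewise keeps maui from falling back to cput.

```shell
# Run as a Torque manager on the server host.
# 720:00:00 (30 days) is a placeholder value, not a recommendation.
qmgr -c "set server resources_default.walltime = 720:00:00"

# Confirm that the default is now part of the server configuration.
qmgr -c "print server" | grep walltime

# Per-job alternative: request walltime explicitly alongside cput.
qsub -l cput=1:0:0,walltime=24:00:00 pbs-script
```

Jobs that are already queued may keep the limit they were given at submission, so it is worth checking a freshly submitted job with checkjob to see whether the "WallTime: ... of ..." line now shows the new default rather than the cput value.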

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
