Hello,

> I have a problem that jobs appear to be not routing to the correct
> queue. My set up is as follows:
>
> routing queue
> 2h queue
> 12h queue
> 1w queue
> unspecified time queue (max time 2w)
> guest queue (low priority)
>
> If a time is unspecified at job submission a default time of 2w (336h) is set
>
> The routing queue is setup as follows (as taken from qmgr -c 'print server')
>
> create queue route
> set queue route queue_type = Route
> set queue route route_destinations = short_2h
> set queue route route_destinations += med_12h
> set queue route route_destinations += long_1w
> set queue route route_destinations += unspec
> set queue route route_destinations += guest
> set queue route enabled = True
> set queue route started = True
>
> my problem is that some jobs with unspecified time (which have
> correctly been given a time of 336h) are ending up in the short_2h
> queue, which has a higher priority than other queues. Does anyone know
> of any possible explanation for this?

Here is what the TORQUE Administrator Manual says:

"The time of enforcement of server and queue defaults is important in
this example. TORQUE applies server and queue defaults differently in
job centric and queue centric modes. For job centric mode, TORQUE waits
to apply the server and queue defaults until the job is assigned to its
final execution queue. For queue centric mode, it enforces server
defaults before it is placed in the routing queue. In either mode, queue
defaults override the server defaults. TORQUE defaults to job centric
mode. To set queue centric mode, set queue_centric_limits, as in what
follows:

qmgr set server queue_centric_limits = true"

That would explain what you see: in the default job centric mode, a job
submitted without a walltime still has none while it sits in the routing
queue, so the first destination (short_2h) accepts it, and the 336h
default is applied only afterwards. I think that setting
queue_centric_limits should fix this.

Another way would be to list the route_destinations in the opposite
order, making sure that every execution queue has resources_min and
resources_max set. If unspec comes first, jobs with unspecified resource
limits will go there first, regardless of the queue_centric_limits
setting.

Yet another way is to make sure that every job has a walltime limit. At
our site, the default walltime limit is 0, so people have to specify it
explicitly. You can, however, make sure that the limit is present by
using a submit filter that adds a walltime limit to the script if it is
missing.

Hope this helps,

-- 
Michel Béland, scientific computing analyst
[email protected]
Office S-250, Pavillon Roger-Gaudry (main building), Université de Montréal
Phone: 514 343-6111 ext. 3892    Fax: 514 343-2155
RQCHP (Réseau québécois de calcul de haute performance)    www.rqchp.qc.ca
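
For illustration, the reversed ordering suggested above might look roughly
like the following. This is an untested sketch, not a verified
configuration. The walltime boundaries are assumptions read off the queue
names in the question (2h, 12h, 1w, 2w), and guest is simply left last
because the question does not say how jobs are meant to reach it. The lines
are qmgr directives in the same form as the 'print server' output.

```
# Assumed ordering: the catch-all queue first, the shortest queue last.
set queue route route_destinations = unspec
set queue route route_destinations += long_1w
set queue route route_destinations += med_12h
set queue route route_destinations += short_2h
set queue route route_destinations += guest

# Assumed walltime windows; adjust to the limits actually in force.
set queue unspec resources_min.walltime = 168:00:01
set queue unspec resources_max.walltime = 336:00:00
set queue long_1w resources_min.walltime = 12:00:01
set queue long_1w resources_max.walltime = 168:00:00
set queue med_12h resources_min.walltime = 02:00:01
set queue med_12h resources_max.walltime = 12:00:00
set queue short_2h resources_max.walltime = 02:00:00
```

With windows like these, a job that requests one hour should be refused by
unspec, long_1w and med_12h in turn and land in short_2h, while a job with
no walltime at all is taken by unspec, the first destination.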

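
The submit filter idea can be sketched as below. This is a minimal
illustration, not a filter taken from any site. It assumes that qsub hands
the job script to the filter on standard input, passes its own command-line
arguments to the filter, and submits whatever the filter writes to standard
output. The function name add_default_walltime and the 336:00:00 default
are hypothetical choices made for this example.

```python
#!/usr/bin/env python3
"""Sketch of a qsub submit filter that guarantees a walltime request."""
import re
import sys

# Assumption: a two-week default, matching the 336h mentioned in the question.
DEFAULT_WALLTIME = "336:00:00"

# Matches a "#PBS" directive line that requests a walltime,
# e.g. "#PBS -l nodes=1,walltime=01:00:00".
PBS_WALLTIME = re.compile(r"^#PBS\b.*\bwalltime=", re.MULTILINE)


def add_default_walltime(script, qsub_args=(), default=DEFAULT_WALLTIME):
    """Return the script, with a walltime directive added if none was requested."""
    # A walltime given on the qsub command line counts as a request.
    if any("walltime=" in arg for arg in qsub_args):
        return script
    # So does a walltime in an existing #PBS directive.
    if PBS_WALLTIME.search(script):
        return script

    directive = "#PBS -l walltime={}\n".format(default)
    lines = script.splitlines(keepends=True)
    # Keep an interpreter line (#!...) first and put the directive right after it.
    if lines and lines[0].startswith("#!"):
        if not lines[0].endswith("\n"):
            lines[0] += "\n"
        return lines[0] + directive + "".join(lines[1:])
    return directive + script


def main():
    # Filter mode: job script in on stdin, possibly modified script out on stdout.
    sys.stdout.write(add_default_walltime(sys.stdin.read(), sys.argv[1:]))


if __name__ == "__main__":
    main()
```

The directive is inserted at the top of the script because #PBS lines are
only honoured before the first executable command. A site that prefers to
reject such jobs, rather than fill in a default, could instead print a
message and exit with a non-zero status.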