Hello,
We solved it by setting `h_rt` to FORCED in the complex list:
#name  shortcut  type  relop  requestable  consumable  default  urgency
#----------------------------------------------------------------------
h_rt   h_rt      TIME  <=     FORCED       YES         0:0:0    0
We also have a JSV that rejects jobs that don't request it (they would
otherwise stay pending indefinitely unless you have a default duration
or use qalter).
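A minimal sketch of the check such a JSV performs, written as standalone
shell functions so it can be tried without a cluster. The names
`requests_h_rt` and `h_rt_verdict` and the example resource strings are
assumptions for illustration, not part of Grid Engine. In a real shell JSV
the test would, as far as I know, go through the helpers in the shipped
`jsv_include.sh` (e.g. `jsv_sub_is_param l_hard h_rt` inside
`jsv_on_verify`, followed by `jsv_reject` or `jsv_accept`).

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): succeeds if the
# comma-separated hard resource list in $1, e.g. "h_rt=3600,h_vmem=2G",
# contains an h_rt request.
requests_h_rt() {
    case ",$1," in
        *,h_rt=*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: prints the verdict a JSV would hand back.
h_rt_verdict() {
    if requests_h_rt "$1"; then
        echo "accept"
    else
        echo "reject: please request a runtime with -l h_rt=..."
    fi
}
```

For example, `h_rt_verdict "h_vmem=2G"` prints the reject message, while
`h_rt_verdict "h_rt=3600,h_vmem=2G"` prints `accept`.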
You could also use a JSV to enforce that only jobs with large resource
requests (in your case, more than some number of slots) can request a
reservation, i.e.:
# pseudo JSV code
SLOT_RESERVATION_THRESHOLD=...
if slots < SLOT_RESERVATION_THRESHOLD then
    "disable reservation / reject"
else
    "enable reservation"
fi
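The pseudocode above could be sketched in plain shell roughly as follows.
This only models the decision. The threshold value of 8 and the function
name `reservation_flag` are assumptions for illustration; the right
threshold depends on your cluster.

```shell
#!/bin/sh
# Assumed value for illustration only; pick a threshold that fits your site.
SLOT_RESERVATION_THRESHOLD=8

# Hypothetical helper: prints the reservation flag ("y" or "n") a job
# should end up with, given its (minimum) slot count in $1.
reservation_flag() {
    if [ "$1" -lt "$SLOT_RESERVATION_THRESHOLD" ]; then
        echo n    # small job: disable reservation
    else
        echo y    # large job: enable reservation
    fi
}
```

In a real shell JSV the slot count would presumably come from
`jsv_get_param pe_min` and the result would be applied with
`jsv_set_param R ...` followed by `jsv_correct`. I have not tested that
wiring here, so treat it as an assumption.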
On Fri, Oct 04, 2013 at 04:25:29PM +0200, Txema Heredia wrote:
> Hi all,
>
> I have a 27-node cluster. Currently all 320 of its 320 slots are
> filled, all by jobs requesting 1 slot.
>
> At the top of my waiting queue there are 28 different jobs
> requesting 3 to 12 cores using two different parallel environments.
> All these jobs are requesting -R y. They are being ignored and
> overrun by the myriad of 1-slot requesting jobs behind them in the
> waiting queue.
>
> I have enabled the scheduler logging. During the last 4 hours, it
> has logged 724 new jobs starting, in all the 27 nodes. Not a single
> job on the system is requesting -l h_rt, but single-core jobs keep
> being scheduled and all the parallel jobs are starving.
>
> As far as I understand, the backfilling is killing my reservations,
> even if no one is requesting any kind of time, but if I set the
> "default_duration" to INFINITY, all the RESERVING log messages
> disappear.
>
> Additionally, for some odd reason, I only receive RESERVING messages
> from the jobs requesting a given number of slots (-pe whatever N).
> The jobs requesting a slot-range (-pe threaded 4-10) seem to reserve
> nothing.
>
> My scheduler configuration is as follows:
>
> # qconf -ssconf
> algorithm default
> schedule_interval 0:0:5
> maxujobs 0
> queue_sort_method load
> job_load_adjustments np_load_avg=0.50
> load_adjustment_decay_time 0:7:30
> load_formula np_load_avg
> schedd_job_info true
> flush_submit_sec 0
> flush_finish_sec 0
> params MONITOR=1
> reprioritize_interval 0:0:0
> halftime 168
> usage_weight_list cpu=0.187000,mem=0.116000,io=0.697000
> compensation_factor 5.000000
> weight_user 0.250000
> weight_project 0.250000
> weight_department 0.250000
> weight_job 0.250000
> weight_tickets_functional 1000000000
> weight_tickets_share 1000000000
> share_override_tickets TRUE
> share_functional_shares TRUE
> max_functional_jobs_to_schedule 200
> report_pjob_tickets TRUE
> max_pending_tasks_per_job 50
> halflife_decay_list none
> policy_hierarchy OSF
> weight_ticket 0.010000
> weight_waiting_time 0.000000
> weight_deadline 3600000.000000
> weight_urgency 0.100000
> weight_priority 1.000000
> max_reservation 50
> default_duration 24:00:00
>
>
> I have also tested it with params PROFILE=1 and default_duration
> INFINITY. But, when I set it, not a single reservation is logged in
> /opt/gridengine/default/common/schedule and new jobs keep starting.
>
>
> What am I missing? Is it possible to kill the backfilling? Are my
> reservations really working?
>
> Thanks in advance,
>
> Txema
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
--
With kind regards
-------------------------------------------
Christian Krause
Wissenschaftliche und Kaufmännische Datenverarbeitung /
Scientific and Commercial Data Processing (WKDV)
Wissenschaftliches Rechnen und Wissenschaftliches Datenmanagement /
Scientific Computing and Scientific Data Management (WRWD)
-------------------------------------------
Helmholtz-Zentrum für Umweltforschung GmbH - UFZ /
Helmholtz Centre for Environmental Research - UFZ
Permoserstr. 15 / 04318 Leipzig / Germany
phone +49 341 235 1001 / fax +49 341 235 1468
<[email protected]> / <http://www.ufz.de>
Sitz der Gesellschaft: Leipzig
Registergericht: Amtsgericht Leipzig, Handelsregister Nr. B 4703
Vorsitzender des Aufsichtsrats: MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer: Prof. Dr. Georg Teutsch
Administrative Geschäftsführerin: Dr. Heike Graßmann
-------------------------------------------