I needed to increase the priority of the jobs of one user and I wasn't able to do so. No matter how many times I issued qalter -p 1024 -u user, the waiting queue remained the same. I have just restarted the sge_qmaster daemon, et voilà: the jobs got their proper priority and every job that was able to run was scheduled. After simply restarting it, my cluster is now using 292 (+36 reserved) slots out of 320 total.
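
(In case anyone hits the same wall: the "reboot" was nothing more exotic than a clean stop and start of the qmaster, roughly like this; the sgemaster wrapper location depends on the install, on this layout it sits under /opt/gridengine/default/common:)

# qconf -km
# /opt/gridengine/default/common/sgemaster start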

So it seems all this is a matter of qmaster degradation. This raises further questions, like how it is possible that the qmaster degraded this far only 4 days after turning on the reservations...

Thanks for all,

Txema


On 07/10/13 16:18, Txema Heredia wrote:
On 07/10/13 16:12, Reuti wrote:
On 07.10.2013 at 16:09, Txema Heredia wrote:

On 07/10/13 16:00, Reuti wrote:
On 07.10.2013 at 15:59, Txema Heredia wrote:

On 07/10/13 14:58, Reuti wrote:
Hi,

On 07.10.2013 at 13:15, Txema Heredia wrote:

The problem is that, right now, making h_rt mandatory is not an option. So we need to work on the assumption that all jobs will last to infinity and beyond.

Right now, the scheduler configuration is:
max_reservation 50
default_duration 24:00:00

Over the weekend, most of the parallel (and -R y) jobs started running, but now there is something fishy in my queues:

The first 3 jobs in my waiting queue belong to user1. All 3 jobs request -pe mpich_round 12, -R y and -l h_vmem=4G (h_vmem is set to consumable = YES, not JOB).
What amount of memory did you specify in the exechost definition, i.e. what's physically in the machine?

-- Reuti
26 nodes have 96GB of ram. One node has 48GB.
And you defined it at the exechost level under "complex_values"? - Reuti
Yes, on all nodes.
# qconf -se c0-0 | grep h_vmem
complex_values        local_disk=400G,slots=12,h_vmem=96G
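
(For completeness, in case anyone wants to replicate the setup: a host consumable like that is normally attached with something along these lines, c0-0 being the example node above:)

# qconf -mattr exechost complex_values h_vmem=96G c0-0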
Good, what is the definition of the requested PE - any special "allocation_rule"?

Round robin

# qconf -sp mpich_round
pe_name            mpich_round
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /opt/gridengine/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE


PS: I've been told that there are some problems with local_disk, but currently no job is making use of it.
It may be a custom load sensor, it's nothing SGE provides by default.

Yes, it's simply a consumable attribute that does nothing. I have just been told that host-defined consumable attributes plus parallel environments sometimes don't behave properly (over-requesting and such), but that shouldn't apply here because none of the jobs is using it. We can ignore it.

Currently the nodes range from 4 to 10 free slots and from 26 to 82.1 GB of free memory.

The first jobs in my waiting queue (after the 3 reserving ones) require a measly 0.9G, 3G and 12G, all with slots=1 and -R n. None of them is scheduled. But if I manually increase their priority so they are put BEFORE the 3 -R y jobs, they are immediately scheduled.

This user already has one job like these running. User1 has an RQS that limits him to only 12 slots in the whole cluster. Thus the 3 waiting jobs will not be able to run until the first one finishes.
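
(For reference, that limit is a plain resource quota set; the real one has a different name, but it is essentially of this shape:)

# qconf -srqs
{
   name         user1_slot_limit
   description  "cap user1 at 12 slots cluster-wide"
   enabled      TRUE
   limit        users user1 to slots=12
}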

This is the current schedule log:

# grep "::::\|RESERVING" schedule | tail -200 | grep "::::\|Q:all" | tail -37 | sort
::::::::
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734185:1:RESERVING:1381142325:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734186:1:RESERVING:1381228785:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000
2734187:1:RESERVING:1381315245:86460:Q:[email protected]:slots:1.000000

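(For anyone else digging through the schedule file: as far as I can tell each record is laid out as

    job_id:task_id:state:start_time:duration:level:queue_instance:resource:amount

so the three pending jobs above are each holding 12 one-slot reservations for 86460 seconds, i.e. the 24h default_duration plus the default 60-second duration offset.)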

Right now, the cluster is using 190 slots out of 320 total. The schedule log says that the 3 waiting jobs from user1 are the only jobs making any kind of reservation. These jobs are reserving a total of 36 cores. These 3 jobs are effectively blocking 36 already-free slots because the RQS doesn't allow user1 to use more than 12 slots at once. This is not "nice", but I understand that the scheduler has its limitations and cannot predict the future.

Taking into account the running jobs plus the slots and memory locked by the reserving jobs, there is a grand total of 226 slots locked, leaving 94 free slots.

Here comes the problem: even though there are 94 free slots and lots of spare memory, NONE of the 4300 waiting jobs is running. There are nodes with 6 free slots and 59 GB of free RAM, but none of the waiting jobs is scheduled. New jobs only start running when one of the 190 slots occupied by running jobs is freed. None of these other waiting jobs is requesting -R y, -pe or h_rt.


Additionally, this is creating some odd behaviour. It seems that, on each scheduler run, the scheduler tries to start jobs in those "blocked slots", but fails for no apparent reason. Some of the jobs even try to start twice, yet almost none (generally none at all) actually gets to run:

# tail -2000 schedule | grep -A 1000 "::::::" | grep "Q:all" | grep STARTING | sort 2734121:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734122:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734123:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734124:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734125:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734126:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734127:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734128:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734129:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734130:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734131:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734132:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734133:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734134:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734135:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734136:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734137:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734138:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734139:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734140:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734141:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734142:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734143:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734144:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734145:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734146:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734147:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734148:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734149:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734150:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734151:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734152:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734153:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734154:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734155:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734156:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734157:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734158:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734159:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734160:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2734161:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735158:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735159:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735160:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735161:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735162:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735163:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735164:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735165:1:STARTING:1381144160:86460:Q:[email 
protected]:slots:1.000000 2735166:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735167:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735168:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735169:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735170:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735171:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735172:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735173:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735174:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735175:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735176:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735177:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735178:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735179:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735180:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735181:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735182:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735183:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735184:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735185:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735186:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735187:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735188:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735189:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735190:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735191:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735192:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2735193:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743479:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743480:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743481:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743482:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743483:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743484:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743485:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743486:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743487:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743488:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743489:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743490:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743491:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743492:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743493:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743494:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743495:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743496:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743497:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743498:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743499:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743500:1:STARTING:1381144160:86460:Q:[email 
protected]:slots:1.000000 2743501:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743502:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743503:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743504:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743505:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743506:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743507:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743508:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743509:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743510:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743511:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743512:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743513:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743514:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743515:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743516:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743517:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000 2743518:1:STARTING:1381144160:86460:Q:[email protected]:slots:1.000000


Even though the jobs appear listed here as "starting", they are not running at all; they just issue a new "starting" message on each scheduling interval.

Why are the reservations blocking a third of the cluster? It shouldn't be a backfilling issue: they are blocking roughly three times the number of slots actually reserved. And why can't the "starting" jobs run?

Txema



On 07/10/13 09:28, Christian Krause wrote:
Hello,

We solved it by setting `h_rt` to FORCED in the complex list:

#name     shortcut   type   relop   requestable   consumable   default   urgency
#--------------------------------------------------------------------------------
h_rt      h_rt       TIME   <=      FORCED        YES          0:0:0     0
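
(Jobs then have to request a run time explicitly or they will never be dispatched, e.g., with myjob.sh as a stand-in for the real job script:)

# qsub -l h_rt=24:00:00 myjob.sh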

And we have a JSV rejecting jobs that don't request it (because otherwise they would stay pending indefinitely,
unless you have a default duration or use qalter).

You could also use a JSV to enforce that only jobs with large resource requests (in your case, more than some
number of slots) are able to request a reservation, e.g.:

     # pseudo JSV code
     SLOT_RESERVATION_THRESHOLD=...
     if slots < SLOT_RESERVATION_THRESHOLD then
         "disable reservation / reject"
     else
         "enable reservation"
     fi
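
Fleshed out as a client-side JSV it could look roughly like this (an untested sketch; the threshold and whether to reject or silently drop -R are site decisions). It only relies on the helper functions shipped in $SGE_ROOT/util/resources/jsv/jsv_include.sh:

#!/bin/sh
# rough client-side JSV: drop the reservation flag on small jobs
# (threshold and behaviour are site choices, adjust as needed)

. ${SGE_ROOT}/util/resources/jsv/jsv_include.sh

SLOT_RESERVATION_THRESHOLD=8

jsv_on_start()
{
   return
}

jsv_on_verify()
{
   # a job without a PE request is a 1-slot job
   if [ "$(jsv_is_param pe_name)" = "true" ]; then
      slots=$(jsv_get_param pe_min)
   else
      slots=1
   fi

   if [ "$(jsv_get_param R)" = "y" ] && [ "$slots" -lt "$SLOT_RESERVATION_THRESHOLD" ]; then
      # alternatively: jsv_reject "reservation only allowed for >= $SLOT_RESERVATION_THRESHOLD slots"
      jsv_set_param R n
      jsv_correct "reservation removed for small job"
      return
   fi

   jsv_accept "OK"
}

jsv_main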


On Fri, Oct 04, 2013 at 04:25:29PM +0200, Txema Heredia wrote:
Hi all,

I have a 27-node cluster. Currently 320 out of 320 slots are
filled up, all by jobs requesting a single slot.

At the top of my waiting queue there are 28 different jobs
requesting 3 to 12 cores using two different parallel environments.
All these jobs are requesting -R y. They are being ignored and
overrun by the myriad of 1-slot jobs behind them in the
waiting queue.

I have enabled the scheduler logging. During the last 4 hours it
has logged 724 new jobs starting, across all 27 nodes. Not a single job on the system is requesting -l h_rt, but single-core jobs keep
being scheduled and all the parallel jobs are starving.

As far as I understand, backfilling is killing my reservations,
even though no job is requesting any kind of run time, yet if I set the
"default_duration" to INFINITY, all the RESERVING log messages
disappear.

Additionally, for some odd reason, I only receive RESERVING messages from the jobs requesting a fixed number of slots (-pe whatever N). The jobs requesting a slot range (-pe threaded 4-10) seem to reserve
nothing.

My scheduler configuration is as follows:

# qconf -ssconf
algorithm                         default
schedule_interval                 0:0:5
maxujobs                          0
queue_sort_method                 load
job_load_adjustments              np_load_avg=0.50
load_adjustment_decay_time        0:7:30
load_formula                      np_load_avg
schedd_job_info                   true
flush_submit_sec                  0
flush_finish_sec                  0
params                            MONITOR=1
reprioritize_interval             0:0:0
halftime                          168
usage_weight_list                 cpu=0.187000,mem=0.116000,io=0.697000
compensation_factor               5.000000
weight_user                       0.250000
weight_project                    0.250000
weight_department                 0.250000
weight_job                        0.250000
weight_tickets_functional         1000000000
weight_tickets_share              1000000000
share_override_tickets            TRUE
share_functional_shares           TRUE
max_functional_jobs_to_schedule   200
report_pjob_tickets               TRUE
max_pending_tasks_per_job         50
halflife_decay_list               none
policy_hierarchy                  OSF
weight_ticket                     0.010000
weight_waiting_time               0.000000
weight_deadline                   3600000.000000
weight_urgency                    0.100000
weight_priority                   1.000000
max_reservation                   50
default_duration                  24:00:00


I have also tested it with params PROFILE=1 and default_duration
INFINITY. But when I do, not a single reservation is logged in /opt/gridengine/default/common/schedule and new jobs keep starting.
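
(For the record, the two relevant knobs are edited and tested like this; -tsm just forces an immediate scheduling run:)

# qconf -msconf
# qconf -tsm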


What am I missing? Is it possible to kill the backfilling? Are my
reservations really working?

Thanks in advance,

Txema