On 14.08.2012, at 18:14, Joseph Farran wrote:

> On 08/14/2012 02:31 AM, Reuti wrote:
>> On 14.08.2012, at 00:27, Joseph Farran wrote:
>>
>>> Hi Alex.
>>>
>>> Thanks for the info, but the issue is more complex.
>>>
>>> The issue is that slots cannot be used with subordinate queues.
>>>
>>> Why not? The reason is here:
>>>
>>> http://gridengine.org/pipermail/users/2012-August/004372.html
>>
>> But it seems to work even if you don't attach the slots complex to
>> each exechost, to pack (at least) serial jobs by "load_formula slots".
>
> No, it does not. At least not in my tests. See:
>
> http://gridengine.org/pipermail/users/2012-August/004425.html
Hehe, I tested it, of course. But it turned out that the behavior was
obviously only correct by accident. The slots complex needs to be
attached to each exechost. Sorry for the confusion.

-- Reuti

> Joseph
>
>>> Best,
>>> Joseph
>>>
>>> On 08/13/2012 03:12 PM, Alex Chekholko wrote:
>>>> Hi,
>>>>
>>>> I'm not sure if this helps, but we have a working config with:
>>>>
>>>> queue_sort_method            seqno
>>>> load_formula                 slots
>>>>
>>>> That puts single-slot jobs onto a single node if a bunch of nodes
>>>> are empty, rather than distributing them evenly across empty nodes.
>>>>
>>>> Regards,
>>>> Alex
>>>>
>>>> On 08/13/2012 09:14 AM, Joseph Farran wrote:
>>>>> Hi Reuti / Rayson.
>>>>>
>>>>> To make sure we are on the same page: are you saying that for PE
>>>>> jobs using "$pe_slots" as the "allocation_rule", Grid Engine does
>>>>> indeed ignore the "load_formula" in the scheduler?
>>>>>
>>>>> If yes, a couple of questions please:
>>>>>
>>>>> 1) Was there a point at which GE did *not* ignore the
>>>>>    "load_formula" for PE jobs with "$pe_slots"?
>>>>> 2) Will this be brought back to GE in a future release?
>>>>>
>>>>> Joseph
>>>>>
>>>>> On 08/13/2012 08:22 AM, Reuti wrote:
>>>>>> On 12.08.2012, at 19:55, Joseph Farran wrote:
>>>>>>
>>>>>>> Hi Rayson.
>>>>>>>
>>>>>>> Here is one particular entry:
>>>>>>> http://gridengine.org/pipermail/users/2012-May/003495.html
>>>>>>>
>>>>>>> I am using the Grid Engine 2011.11 binary:
>>>>>>> http://dl.dropbox.com/u/47200624/respin/ge2011.11.tar.gz
>>>>>>
>>>>>> First of all, sorry for using the wrong expression. If you used
>>>>>> "-cores_in_use", it should be the positive "slots". As a lower
>>>>>> value is taken first, a lower remaining number of slots should
>>>>>> be taken first. It's working as it should for serial jobs.
>>>>>>
>>>>>> But for parallel ones, even with $pe_slots as the allocation
>>>>>> rule, it is ignored already in 6.2u5.
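Reuti's correction above (the slots complex must be attached to each
exechost) together with Alex's working settings can be sketched as a
short shell session. This is an illustrative sketch, not a verified
recipe: the host names and the slot count of 64 are taken from the
64-core nodes described later in this thread.

```shell
# Attach the "slots" complex value to every execution host so that
# "load_formula slots" has a per-host value to sort on. Host names
# and the count of 64 are examples from this thread's nodes.
for host in compute-2-3 compute-2-6 compute-2-7; do
    qconf -mattr exechost complex_values slots=64 "$host"
done

# Alex's working scheduler settings; "qconf -msconf" opens an editor
# in which these two lines are set:
#   queue_sort_method    seqno
#   load_formula         slots
```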
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>>>> Thanks,
>>>>>>> Joseph
>>>>>>>
>>>>>>> On 8/12/2012 10:10 AM, Rayson Ho wrote:
>>>>>>>> On Sun, Aug 12, 2012 at 5:27 AM, Joseph Farran
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> I saw some old postings that this used to be a bug with GE,
>>>>>>>>> that parallel jobs were not using the scheduler load_formula.
>>>>>>>>> Was this bug corrected in GE2011.11?
>>>>>>>>
>>>>>>>> Hi Joseph,
>>>>>>>>
>>>>>>>> Can you point me to the previous discussion? We did not
>>>>>>>> receive any bug report related to this problem before...
>>>>>>>>
>>>>>>>> So far, our main focus is to fix issues & bugs reported by our
>>>>>>>> users first, and maybe we've missed the discussion on this bug.
>>>>>>>>
>>>>>>>> Rayson
>>>>>>>>
>>>>>>>>> Anyone able to test this in GE2011.11 to see if it was fixed?
>>>>>>>>>
>>>>>>>>> Joseph
>>>>>>>>>
>>>>>>>>> On 8/11/2012 1:51 PM, Reuti wrote:
>>>>>>>>>> On 11.08.2012, at 20:30, Joseph Farran wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, all my queues have the same "0" for "seq_no".
>>>>>>>>>>>
>>>>>>>>>>> Here is my scheduler load formula:
>>>>>>>>>>>
>>>>>>>>>>> qconf -ssconf
>>>>>>>>>>> algorithm                    default
>>>>>>>>>>> schedule_interval            0:0:15
>>>>>>>>>>> maxujobs                     0
>>>>>>>>>>> queue_sort_method            load
>>>>>>>>>>> job_load_adjustments         NONE
>>>>>>>>>>> load_adjustment_decay_time   0
>>>>>>>>>>> load_formula                 -cores_in_use
>>>>>>>>>>
>>>>>>>>>> Can you please try it with -slots? It should behave the same
>>>>>>>>>> as your own complex. In one of your former posts you
>>>>>>>>>> mentioned a different relation == for it.
>>>>>>>>>>
>>>>>>>>>> -- Reuti
>>>>>>>>>>
>>>>>>>>>>> Here is a sample display of what is going on. My compute
>>>>>>>>>>> nodes have 64 cores each.
>>>>>>>>>>>
>>>>>>>>>>> I submit 4 1-core jobs to my bio queue.
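Reuti's point that "a lower value is taken first" is the whole packing
mechanism: with queue_sort_method load, hosts are ordered by ascending
load_formula value, so with load_formula slots the host with the
fewest remaining slots is tried first. A toy illustration of that
ordering (the host names and free-slot counts below are invented for
the example):

```shell
# Toy model of "queue_sort_method load" with "load_formula slots":
# sort hosts by ascending remaining slots; the first host is the
# one the scheduler would try first, packing jobs together.
best=$(printf '%s\n' \
    'compute-2-3 60' \
    'compute-2-6 64' \
    'compute-2-7 64' |
    sort -k2,2n | head -n 1)
echo "$best"    # the host with the fewest free slots sorts first
```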
>>>>>>>>>>> Note: I wait around 30 seconds before submitting each
>>>>>>>>>>> 1-core job, long enough for my "cores_in_use" to report
>>>>>>>>>>> back correctly:
>>>>>>>>>>>
>>>>>>>>>>> job-ID  name  user  state  queue            slots
>>>>>>>>>>> -----------------------------------------------------
>>>>>>>>>>> 2324    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2325    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2326    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2327    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>>
>>>>>>>>>>> Everything works great with single 1-core jobs. Jobs 2324
>>>>>>>>>>> through 2327 packed onto one node (compute-2-3) correctly.
>>>>>>>>>>> The "cores_in_use" for compute-2-3 reports "4".
>>>>>>>>>>>
>>>>>>>>>>> Now I submit one 16-core "openmp" PE job:
>>>>>>>>>>>
>>>>>>>>>>> job-ID  name  user  state  queue            slots
>>>>>>>>>>> -----------------------------------------------------
>>>>>>>>>>> 2324    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2325    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2326    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2327    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2328    TEST  me    r      bio@compute-2-6  16
>>>>>>>>>>>
>>>>>>>>>>> The scheduler should have picked compute-2-3 since it has 4
>>>>>>>>>>> cores_in_use, but instead it picked compute-2-6, which had
>>>>>>>>>>> 0 cores_in_use. So here the scheduler is behaving
>>>>>>>>>>> differently than with 1-core jobs.
>>>>>>>>>>>
>>>>>>>>>>> As a further test, I wait until my cores_in_use reports
>>>>>>>>>>> back that compute-2-6 has "16" cores in use.
>>>>>>>>>>> I now submit another 16-core "openmp" job:
>>>>>>>>>>>
>>>>>>>>>>> job-ID  name  user  state  queue            slots
>>>>>>>>>>> -----------------------------------------------------
>>>>>>>>>>> 2324    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2325    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2326    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2327    TEST  me    r      bio@compute-2-3  1
>>>>>>>>>>> 2328    TEST  me    r      bio@compute-2-6  16
>>>>>>>>>>> 2329    TEST  me    r      bio@compute-2-7  16
>>>>>>>>>>>
>>>>>>>>>>> The scheduler now picks yet another node, compute-2-7,
>>>>>>>>>>> which had 0 cores_in_use. I have tried this several times
>>>>>>>>>>> with many config changes to the scheduler, and it sure
>>>>>>>>>>> looks like the scheduler is *not* using the "load_formula"
>>>>>>>>>>> for PE jobs. From what I can tell, the scheduler chooses
>>>>>>>>>>> nodes at random for PE jobs.
>>>>>>>>>>>
>>>>>>>>>>> Here is my "openmp" PE:
>>>>>>>>>>>
>>>>>>>>>>> # qconf -sp openmp
>>>>>>>>>>> pe_name            openmp
>>>>>>>>>>> slots              9999
>>>>>>>>>>> user_lists         NONE
>>>>>>>>>>> xuser_lists        NONE
>>>>>>>>>>> start_proc_args    NONE
>>>>>>>>>>> stop_proc_args     NONE
>>>>>>>>>>> allocation_rule    $pe_slots
>>>>>>>>>>> control_slaves     TRUE
>>>>>>>>>>> job_is_first_task  FALSE
>>>>>>>>>>> urgency_slots      min
>>>>>>>>>>> accounting_summary TRUE
>>>>>>>>>>>
>>>>>>>>>>> Here is my "bio" queue showing the relevant info:
>>>>>>>>>>>
>>>>>>>>>>> # qconf -sq bio | egrep "qname|slots|pe_list"
>>>>>>>>>>> qname    bio
>>>>>>>>>>> pe_list  make mpi openmp
>>>>>>>>>>> slots    64
>>>>>>>>>>>
>>>>>>>>>>> Thanks for taking a look at this!
>>>>>>>>>>>
>>>>>>>>>>> On 8/11/2012 4:32 AM, Reuti wrote:
>>>>>>>>>>>> On 11.08.2012, at 02:57, Joseph Farran
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Reuti,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Are you sure this works in GE2011.11?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have defined my own complex called "cores_in_use" which
>>>>>>>>>>>>> counts both single cores and PE cores correctly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It works great for single-core jobs, but not for PE jobs
>>>>>>>>>>>>> using the "$pe_slots" allocation rule.
>>>>>>>>>>>>>
>>>>>>>>>>>>> # qconf -sp openmp
>>>>>>>>>>>>> pe_name            openmp
>>>>>>>>>>>>> slots              9999
>>>>>>>>>>>>> user_lists         NONE
>>>>>>>>>>>>> xuser_lists        NONE
>>>>>>>>>>>>> start_proc_args    NONE
>>>>>>>>>>>>> stop_proc_args     NONE
>>>>>>>>>>>>> allocation_rule    $pe_slots
>>>>>>>>>>>>> control_slaves     TRUE
>>>>>>>>>>>>> job_is_first_task  FALSE
>>>>>>>>>>>>> urgency_slots      min
>>>>>>>>>>>>> accounting_summary TRUE
>>>>>>>>>>>>>
>>>>>>>>>>>>> # qconf -ssconf
>>>>>>>>>>>>> algorithm                    default
>>>>>>>>>>>>> schedule_interval            0:0:15
>>>>>>>>>>>>> maxujobs                     0
>>>>>>>>>>>>> queue_sort_method            seqno
>>>>>>>>>>>>
>>>>>>>>>>>> The seq_no is the same for the queue instances in question?
>>>>>>>>>>>>
>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>
>>>>>>>>>>>>> job_load_adjustments         cores_in_use=1
>>>>>>>>>>>>> load_adjustment_decay_time   0
>>>>>>>>>>>>> load_formula                 -cores_in_use
>>>>>>>>>>>>> schedd_job_info              true
>>>>>>>>>>>>> flush_submit_sec             5
>>>>>>>>>>>>> flush_finish_sec             5
>>>>>>>>>>>>>
>>>>>>>>>>>>> I wait until the node reports the correct "cores_in_use"
>>>>>>>>>>>>> complex, then submit a PE openmp job, and it totally
>>>>>>>>>>>>> ignores the "load_formula" in the scheduler.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Joseph
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 08/09/2012 12:50 PM, Reuti wrote:
>>>>>>>>>>>>>> Correct. It uses the "allocation_rule" specified in the
>>>>>>>>>>>>>> PE instead. Only for "allocation_rule" set to $pe_slots
>>>>>>>>>>>>>> will it also use the "load_formula". Unfortunately there
>>>>>>>>>>>>>> is nothing you can do to change the behavior.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 09.08.2012, at 21:23, Joseph Farran
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Howdy.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am using GE2011.11.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am successfully using the GE "load_formula" to place
>>>>>>>>>>>>>>> jobs by core count using my own "load_sensor" script.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All works as expected with single-core jobs; however,
>>>>>>>>>>>>>>> for PE jobs, it seems as if GE does not abide by the
>>>>>>>>>>>>>>> "load_formula".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does the scheduler use a different "load" formula for
>>>>>>>>>>>>>>> single-core jobs versus parallel jobs using the PE
>>>>>>>>>>>>>>> environment setup?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Joseph
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
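Joseph's "load_sensor" script itself never appears in the thread. For
readers unfamiliar with the mechanism, here is a minimal sketch of the
load-sensor protocol such a script would follow. The counting function
is a placeholder assumption (a real sensor would inspect running
jobs), and the "cores_in_use" complex would first have to be created
with qconf -mc.

```shell
#!/bin/sh
# Sketch of an SGE load sensor reporting a custom "cores_in_use"
# complex. sge_execd writes a line to the sensor's stdin for each
# poll ("quit" to stop); the sensor answers with a report framed
# by "begin" and "end", one "host:complex:value" line in between.

count_cores_in_use() {
    # Placeholder: a real sensor would count cores held by running
    # jobs (e.g. by parsing ps output); here we just report 0.
    echo 0
}

sensor_loop() {
    host=$(hostname)
    while read -r input; do
        [ "$input" = "quit" ] && return 0
        echo "begin"
        echo "$host:cores_in_use:$(count_cores_in_use)"
        echo "end"
    done
}

# Enter the polling loop only when invoked with "run", so the
# functions above can also be exercised in isolation.
if [ "${1:-}" = "run" ]; then
    sensor_loop
fi
```

On a real cluster the script would then be registered via the
load_sensor parameter of the cluster or host configuration
(qconf -mconf).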
