On 08/03/2012 09:57 AM, Reuti wrote:
On 03.08.2012 at 18:50, Joseph Farran wrote:
On 08/03/2012 09:18 AM, Reuti wrote:
On 03.08.2012 at 18:04, Joseph Farran wrote:
I pack jobs onto nodes using the following GE setup:
# qconf -ssconf | egrep "queue|load"
queue_sort_method seqno
job_load_adjustments NONE
load_adjustment_decay_time 0
load_formula slots
I also set my nodes with the slots complex value:
# qconf -rattr exechost complex_values "slots=64" compute-2-1
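(As an aside: a quick way to watch the packing is to display the remaining value of the host-level slots consumable; these are stock qstat/qhost options, nothing specific to this setup:
# qstat -F slots
# qhost -F slots
)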
Don't limit it here. Just define 64 in both queues for slots.
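(A minimal sketch of that suggestion, using the queue and host names from this thread; the qconf calls are standard, though the exact invocation here is an assumption:
# qconf -mattr queue slots 64 owner
# qconf -mattr queue slots 64 free
# qconf -dattr exechost complex_values slots compute-2-1
)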
Yes, I tried that approach as well, but then parallel jobs will not suspend an
equal number of serial jobs.
So after I set up the above (note: the test nodes have 8 cores, not 64):
# qconf -sq owner | egrep "slots"
slots 8
subordinate_list slots=8(free:0:sr)
# qconf -sq free | egrep "slots"
slots 8
# qconf -se compute-3-1 | egrep complex
complex_values NONE
# qconf -se compute-3-2 | egrep complex
complex_values NONE
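(For reference, queue_conf(5) gives the slot-wise form as subordinate_list slots=<threshold>(<queue>[:<seq_no>[:<action>]]), so slots=8(free:0:sr) should suspend jobs in free one at a time, shortest-running first, whenever more than 8 slots are in use on the host across both queues.)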
When I submit one 8-slot parallel job to owner, only one single-core job in free
is suspended instead of 8.
Here is the qstat listing:
job-ID prior name user state queue slots
--------------------------------------------------------------
8531 0.50500 FREE testfree r free@compute-3-1 1
8532 0.50500 FREE testfree r free@compute-3-1 1
8533 0.50500 FREE testfree r free@compute-3-1 1
8534 0.50500 FREE testfree r free@compute-3-1 1
8535 0.50500 FREE testfree r free@compute-3-1 1
8536 0.50500 FREE testfree r free@compute-3-1 1
8537 0.50500 FREE testfree r free@compute-3-1 1
8538 0.50500 FREE testfree S free@compute-3-1 1
8539 0.50500 FREE testfree r free@compute-3-2 1
8540 0.50500 FREE testfree r free@compute-3-2 1
8541 0.50500 FREE testfree r free@compute-3-2 1
8542 0.50500 FREE testfree r free@compute-3-2 1
8543 0.50500 FREE testfree r free@compute-3-2 1
8544 0.50500 FREE testfree r free@compute-3-2 1
8545 0.50500 FREE testfree r free@compute-3-2 1
8546 0.50500 FREE testfree r free@compute-3-2 1
8547 0.60500 Owner me r owner@compute-3-1 8
Job 8547 in the owner queue starts just fine, running with 8 cores on compute-3-1,
*but* only one free-queue job on compute-3-1 is suspended instead of 8.
AFAIR this is a known bug for parallel jobs.
So the answer to my original question is that no, it cannot be done.
Is there another open source GE flavor that has fixed this bug, or is it present
across all open source GE flavors?
Serial jobs are all packed nicely onto a node until the node is full, and then
scheduling moves on to the next node.
The issue I am having is that my subordinate queue breaks once I set my
nodes with the slots complex value above.
I have two queues: The owner queue and the free queue:
# qconf -sq owner | egrep "subordinate|shell"
shell /bin/bash
shell_start_mode posix_compliant
subordinate_list free=1
subordinate_list slots=64(free)
# qconf -sq free | egrep "subordinate|shell"
shell /bin/bash
shell_start_mode posix_compliant
subordinate_list NONE
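(Aside, based on queue_conf(5) rather than anything new in this thread — the two subordinate_list forms above behave differently:
subordinate_list free=1            <- queue-wise: suspends the whole free queue instance on a host once 1 slot is busy in owner there
subordinate_list slots=64(free)    <- slot-wise: suspends individual free jobs once combined slot usage on the host exceeds 64
)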
When I fill up the free queue with serial jobs and then submit a job to the
owner queue, the owner job will not suspend the free jobs. The qstat scheduling
info says:
queue instance "owner@..." dropped because it is full
queue instance "owner@..." dropped because it is full
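(Aside: these lines appear under "scheduling info" in qstat -j <jobid>, and qsub -w p does a validation run against the current cluster state without actually submitting, which reports the same kind of dispatch problems.)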
If I remove the complex_values setting from my nodes, then jobs are correctly
suspended in the free queue and the owner job runs just fine.
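(For anyone reproducing this, a minimal sketch — the sleep commands and the count of 64 are placeholders, not the real workload:
# fill the free queue with serial jobs until the node is full
# for i in $(seq 1 64); do qsub -q free -b y sleep 3600; done
# then submit to the owner queue and look for suspended jobs
# qsub -q owner -b y sleep 3600
# qstat -s s
)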
Yes, and what's the problem with this setup?
What is wrong with the above setup is that the owner job cannot run, because the
free jobs are not suspended.
They are not suspended in advance. The suspension is the result of an
additional job being started thereon. Not the other way round.
Right, but the idea of a subordinate queue (job preemption) is that when a
job *is* scheduled, the subordinate queue suspends its jobs. I mean, that's
the whole idea.
-- Reuti
So how can I accomplish both items above?
*** By the way, here are pre-answers to some questions I expect to be asked:
Why pack jobs? Because in any HPC environment that runs a mixture of serial
and parallel jobs, you really don't want to spread single-core jobs across
multiple nodes, especially 64-core nodes. You want to keep nodes whole for
parallel jobs (this is HPC 101).
Depends on the application. E.g. Molcas writes a lot to the local scratch
disk, so it's better to spread such jobs across the cluster and use the remaining
cores in each exechost for jobs with little or no disk access.
Yes, there will always be exceptions. I should have said in 99% of
circumstances.
-- Reuti
Suspended jobs will not free up resources: Yep, but the jobs will *not* be
consuming CPU cycles, which is what I want.
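(Right — by default the queue suspends jobs with SIGSTOP, adjustable per queue via suspend_method in queue_conf(5), so they stop burning CPU cycles but do keep their memory and other resources allocated.)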
Thanks,
Joseph
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users