Dang! You are right!

The "incoherence" among jobs is due to the first core of the first socket being available. On my previous socket report, all "linear X:0,0" that were correctly reported were only the ones that could start in the first core.

I have just modified my jsv to set the policy to linear_automatic, and now it works fine!
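
The change in the JSV itself is tiny; the relevant part now looks roughly like this (a bash sketch using jsv_set_param from util/resources/jsv/jsv_include.sh, assuming the binding_* parameter names of the JSV interface; the rest of the verification logic is omitted):

# inside jsv_on_verify():
jsv_set_param binding_strategy "linear_automatic"   # request becomes "linear:1"
jsv_set_param binding_amount   "1"                  # one core, the execd picks which one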

Given these two nodes:

compute-1-8             lx26-amd64     12  6.95   94.6G   32.3G 9.8G   39.4M
   hl:m_topology_inuse=SccccccSCCCCCC
binding:                    set linear:6:0,0
binding    1:               SccccccSCCCCCC
binding:                    set linear:1:0,0
binding    1:               NONE

compute-1-9             lx26-amd64     12  0.01   94.6G   10.1G 9.8G   39.0M
   hl:m_topology_inuse=SCCCCCCSCCCCCC

compute-1-8 has the 1st core already bound, and compute-1-9 has it free.

I submit several single-core qlogin jobs to both nodes:

(compute-1-8)
[root@floquet ~]# qstat -j 4564595 -cb | grep binding
binding:                    set linear:1:0,0
binding    1:               NONE
(compute-1-9)
[root@floquet ~]# qstat -j 4564594 -cb | grep binding
binding:                    set linear:1:0,0
binding    1:               ScCCCCCSCCCCCC


Now I change the policy to linear_automatic and get:

(compute-1-8)
[root@floquet ~]# qstat -j 4564597 -cb | grep binding
binding:                    set linear:1
binding    1:               SCCCCCCScCCCCC
(compute-1-9)
[root@floquet ~]# qstat -j 4564596 -cb | grep binding
binding:                    set linear:1
binding    1:               ScCCCCCSCCCCCC


Thanks!!

Txema

On 27/06/14 13:19, Daniel Gruber wrote:
Hi,

Please notice the difference between "set linear:1:0,0" and
"set linear:1". The first one means: give me one core, starting
at socket 0, core 0 (which here obviously means you are
requesting core 0 on socket 0). The second one means that
you want one core on the host and the execution daemon
takes care of which one.
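
On the command line the two requests would look roughly like this with qsub (or qlogin); "job.sh" is just a placeholder script:

qsub -binding linear:1:0,0 job.sh    # exactly socket 0, core 0 - no binding if that core is already taken
qsub -binding linear:1 job.sh        # any single free core, chosen on the execution host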

So, by design, the core selection is done on the execd in SGE,
while in Univa Grid Engine we moved that to the qmaster
itself (which has many advantages due to its global
view of the cluster / job and core usage).

When the execd in your case now tries to bind the job, it figures
out that a different job already uses this core, and therefore
SGE simply doesn't do any binding for the job (in order to avoid
overallocation).

I guess your linear:1:0,0 request is not intentional - it only
makes sense in scenarios where you are using the
host exclusively for one job.

This is probably caused by your JSV script - which sets binding_strategy
to "linear" (linear:X:S,C) instead of "linear_automatic" (linear:X). Admittedly,
the naming of the JSV parameter argument is unfortunate.
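
In other words, the script presumably contains something along these lines (a bash JSV sketch; I am guessing at the exact parameter values):

jsv_set_param binding_strategy "linear"   # together with the socket/core below => "linear:1:0,0"
jsv_set_param binding_amount   "1"
jsv_set_param binding_socket   "0"        # always start at socket 0 ...
jsv_set_param binding_core     "0"        # ... core 0, even if that core is already bound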

Might this be the reason?

Cheers

Daniel


On 27.06.2014 at 12:58, Txema Heredia <[email protected]> wrote:

On 27/06/14 12:32, Reuti wrote:
On 27.06.2014 at 12:24, Txema Heredia wrote:

On 27/06/14 11:31, Reuti wrote:
Hi,

On 26.06.2014 at 17:56, Txema Heredia wrote:

<snip>

# qstat -j 4561291 -cb | grep "job_name\|binding\|queue_list"
job_name:                   c0-1
hard_queue_list:            *@compute-0-1.local
binding:                    set linear:1:0,0
binding    1:               NONE

What am I missing here? What can be different in my nodes?
Does `qhost -F` output the fields:

$ qhost -F
...
   hl:m_topology=SC
   hl:m_topology_inuse=SC
   hl:m_socket=1.000000
   hl:m_core=1.000000

for this machine?

-- Reuti
Yes, qhost -F reports that for all nodes:

# qhost -F | grep "compute\|hl:m_"
compute-0-0             lx26-amd64     12  0.60   94.6G   10.1G 9.8G   53.8M
  hl:m_topology=SCCCCCCSCCCCCC
  hl:m_topology_inuse=SCCCCCCSCCCCCC
  hl:m_socket=2.000000
  hl:m_core=12.000000
compute-0-1             lx26-amd64     12  7.21   94.6G   14.9G 9.8G   86.6M
  hl:m_topology=SCCCCCCSCCCCCC
  hl:m_topology_inuse=ScCCCCCSCCCCCC
  hl:m_socket=2.000000
  hl:m_core=12.000000
...


But the inuse topology is blatantly wrong.
What version of SGE are you using? Maybe "PLPA", which was used in former versions, doesn't support this particular CPU's topology. It was replaced by "hwloc" later on.

-- Reuti

Originally it was SGE 6.2u5, but later on I replaced the sge_qmaster binary with the one from OGS/GE 2011.11p1 (due to a problem with parallel jobs and -hold_jid).

