On 27/06/14 13:12, Reuti wrote:
On 27.06.2014 at 12:58, Txema Heredia wrote:

On 27/06/14 12:32, Reuti wrote:
On 27.06.2014 at 12:24, Txema Heredia wrote:

On 27/06/14 11:31, Reuti wrote:
Hi,

On 26.06.2014 at 17:56, Txema Heredia wrote:

<snip>

# qstat -j 4561291 -cb | grep "job_name\|binding\|queue_list"
job_name:                   c0-1
hard_queue_list:            *@compute-0-1.local
binding:                    set linear:1:0,0
binding    1:               NONE

What am I missing here? What could be different about my nodes?
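For reference, a binding entry of "set linear:1:0,0" would normally come from a submission roughly like the one below; the job script name is only a placeholder:

$ qsub -N c0-1 -q '*@compute-0-1.local' -binding linear:1:0,0 job.sh

i.e. one core, allocated linearly starting at socket 0, core 0, with the default "set" instance applied on the execution host.
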
Does `qhost -F` output the fields:

$ qhost -F
...
    hl:m_topology=SC
    hl:m_topology_inuse=SC
    hl:m_socket=1.000000
    hl:m_core=1.000000

for this machine?

-- Reuti
Yes, qhost -F reports that for all nodes:

# qhost -F | grep "compute\|hl:m_"
compute-0-0             lx26-amd64     12  0.60   94.6G   10.1G 9.8G   53.8M
   hl:m_topology=SCCCCCCSCCCCCC
   hl:m_topology_inuse=SCCCCCCSCCCCCC
   hl:m_socket=2.000000
   hl:m_core=12.000000
compute-0-1             lx26-amd64     12  7.21   94.6G   14.9G 9.8G   86.6M
   hl:m_topology=SCCCCCCSCCCCCC
   hl:m_topology_inuse=ScCCCCCSCCCCCC
   hl:m_socket=2.000000
   hl:m_core=12.000000
...


But the inuse topology is blatantly wrong.
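A quick way to tally the in-use markers across all hosts, assuming the usual convention that lowercase letters in m_topology_inuse denote cores currently occupied by bound jobs:

$ qhost -F m_topology_inuse | awk '
    /^[^ ]/ { host = $1 }                    # host lines are not indented
    /m_topology_inuse=/ {
        split($0, a, "=")
        n = gsub(/c/, "c", a[2])             # count lowercase "c" = bound cores
        printf "%-15s %d core(s) marked in use\n", host, n
    }'

If the jobs running on compute-0-1 (load 7.21) all requested binding, this should report considerably more than the single bound core shown above.
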
What version of SGE are you using? Maybe "PLPA", which was used in former versions, 
doesn't support this particular CPU's topology. It was replaced by "hwloc" later on.

-- Reuti

Originally it was SGE 6.2u5, but later on I replaced the sge_qmaster binary 
with the one from OGS/GE 2011.11p1 (due to a problem with parallel jobs and -hold_jid).
Well, the latter should use "hwloc" AFAIK. But I have no clue which version was 
used and whether these types of CPUs are supported.

Are the machines in question brand new?

-- Reuti
No, they are all IBM HS22 blades running 2x 6-core Intel Xeon E5645 CPUs. This cluster has been running for 1.5-2 years with no hardware modifications.
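As a cross-check of what the operating system itself sees on such a blade, independent of the PLPA/hwloc code inside SGE, the values in /proc/cpuinfo should match the 2-socket / 6-core-per-socket layout that m_topology reports:

$ grep -c "^processor" /proc/cpuinfo                   # logical CPUs: 12 with HT off, 24 with HT on
$ grep "physical id" /proc/cpuinfo | sort -u | wc -l   # sockets: expect 2
$ grep "cpu cores" /proc/cpuinfo | sort -u             # cores per socket: expect "cpu cores : 6"

If these match while m_topology_inuse is still wrong, that would point at SGE's binding bookkeeping rather than at topology detection.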

I always thought that core binding was working fine, but this proved me wrong. I am not really sure whether it never worked or whether something changed recently. We never did a comprehensive test of it.
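One straightforward test is to submit a trivial binding job and then check, on the execution host, which cores its process is actually allowed to run on (the sleep command and the <pid> below are just placeholders):

$ qsub -b y -binding linear:1 sleep 600
# ...then on the execution host, once the job is running:
$ grep Cpus_allowed_list /proc/<pid>/status    # e.g. "0" if bound to one core, "0-11" if unbound
$ taskset -cp <pid>                            # same information via taskset

If every such job reports the full 0-11 range, the binding request is apparently never being applied on these nodes.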
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
