On 27/06/14 13:12, Reuti wrote:
On 27.06.2014 at 12:58, Txema Heredia wrote:
On 27/06/14 12:32, Reuti wrote:
On 27.06.2014 at 12:24, Txema Heredia wrote:
On 27/06/14 11:31, Reuti wrote:
Hi,
On 26.06.2014 at 17:56, Txema Heredia wrote:
<snip>
# qstat -j 4561291 -cb | grep "job_name\|binding\|queue_list"
job_name: c0-1
hard_queue_list: *@compute-0-1.local
binding: set linear:1:0,0
binding 1: NONE
What am I missing here? What could be different on my nodes?
Does `qhost -F` output the fields:
$ qhost -F
...
hl:m_topology=SC
hl:m_topology_inuse=SC
hl:m_socket=1.000000
hl:m_core=1.000000
for this machine?
-- Reuti
Yes, qhost -F reports that for all nodes:
# qhost -F | grep "compute\|hl:m_"
compute-0-0 lx26-amd64 12 0.60 94.6G 10.1G 9.8G 53.8M
hl:m_topology=SCCCCCCSCCCCCC
hl:m_topology_inuse=SCCCCCCSCCCCCC
hl:m_socket=2.000000
hl:m_core=12.000000
compute-0-1 lx26-amd64 12 7.21 94.6G 14.9G 9.8G 86.6M
hl:m_topology=SCCCCCCSCCCCCC
hl:m_topology_inuse=ScCCCCCSCCCCCC
hl:m_socket=2.000000
hl:m_core=12.000000
...
But the inuse topology is blatantly wrong.
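For reference, here is a minimal sketch of how the in-use string can be read. It assumes the usual convention that `S` starts a socket, `C` is a core, and a lowercase letter in `m_topology_inuse` marks a core that is currently bound. The function name `cores_in_use` is made up for illustration and is not part of any Grid Engine tool.

```python
def cores_in_use(topology_inuse):
    """Return (socket, core) pairs marked as bound in a topology string.

    Assumed convention: 'S' starts a new socket, 'C'/'c' is a core,
    'T'/'t' is a hardware thread; a lowercase 'c' means the core is in use.
    """
    bound = []
    socket = -1
    core = -1
    for ch in topology_inuse:
        if ch in "Ss":
            # New socket: core numbering restarts within it.
            socket += 1
            core = -1
        elif ch in "Cc":
            core += 1
            if ch == "c":
                bound.append((socket, core))
        # Hardware threads ('T'/'t') are ignored in this sketch.
    return bound


if __name__ == "__main__":
    # String reported for compute-0-1 above.
    print(cores_in_use("ScCCCCCSCCCCCC"))
```

For compute-0-1 this yields only socket 0, core 0, although the node shows a load of 7.21.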
What version of SGE are you using? Maybe the "PLPA" which was used in former versions
doesn't support this particular CPU's topology. It was replaced by "hwloc" later on.
-- Reuti
Originally it was SGE 6.2u5, but later on I replaced the sge_qmaster binary
with the one from OGS/GE 2011.11p1 (due to a problem with parallel jobs and -hold_jid).
Well, the latter should use "hwloc" AFAIK. But I have no clue which version was
used or whether these types of CPUs are supported.
Are the machines in question brand new?
-- Reuti
No, they are all IBM HS22 blades running 2x 6-core Intel Xeon E5645.
This cluster has been running for 1.5 to 2 years with no hardware
modifications.
I always thought that the core binding was working fine, but this proved
me wrong. I am not sure whether it never worked or whether something changed
recently. We never did a comprehensive test of it.
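One way to test this would be to have the job itself report which CPUs it is allowed to run on. A minimal sketch, assuming a Linux node with Python 3.3 or later; `allowed_cpus` is a hypothetical helper name, and `os.sched_getaffinity` is only available on some platforms (Linux among them), so the function returns None elsewhere.

```python
import os


def allowed_cpus(pid=0):
    """Return the sorted CPU ids the process may run on.

    pid=0 means the calling process. Returns None when the platform
    does not provide os.sched_getaffinity.
    """
    if not hasattr(os, "sched_getaffinity"):
        return None
    return sorted(os.sched_getaffinity(pid))


if __name__ == "__main__":
    # Run this as the job script (or from within it) and compare the
    # output against the binding that was requested at submission.
    print(allowed_cpus())
```

If a job submitted with `-binding linear:1` reports every CPU of the node rather than a single one (or the hardware threads of one core), the binding is probably not being applied. Note that the CPU ids reported by the OS need not match the socket,core numbering shown by qstat.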