For those who had issues with the earlier versions, please try the
latest loadcheck v4:

http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html

I compiled the binary on Oracle Linux, which is compatible with RHEL
5.x, Scientific Linux, and CentOS 5.x. I tested the binary on the
standard Red Hat kernel, on Oracle's enhanced "Unbreakable Enterprise
Kernel", and on Fedora 13 and Ubuntu 10.04 LTS.

Rayson



On Thu, Apr 14, 2011 at 8:28 AM, Rayson Ho <[email protected]> wrote:
> Hi Chansup,
>
> I think I fixed it last night, and I uploaded the loadcheck binary and
> updated the page:
> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>
> Or you can download it directly from:
> http://gridscheduler.sourceforge.net/projects/hwloc/loadcheckv2.tar.gz
>
> Again, thanks for the help guys!!
>
> Rayson
>
>
> On Wed, Apr 13, 2011 at 11:38 AM, Rayson Ho <[email protected]> wrote:
>> On Wed, Apr 13, 2011 at 9:14 AM, CB <[email protected]> wrote:
>>> The number of sockets (two in total) and cores (24 in total) on the
>>> two-socket, 12-core Magny-Cours processor node is correct
>>
>> First of all, thanks Chansup, Ansgar, and Alex (who contacted me
>> offline) for testing the code!
>>
>> This is good: it means the get_topology() code is correct, and hwloc
>> is able to handle the Magny-Cours topology.
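>>
>> For illustration only (a rough sketch assuming hwloc 1.x, not the
>> actual get_topology() code), the socket and core counts that loadcheck
>> prints can be obtained from hwloc roughly like this:
>>
>> #include <stdio.h>
>> #include <hwloc.h>
>>
>> int main(void)
>> {
>>     hwloc_topology_t topo;
>>
>>     /* Build hwloc's view of the machine */
>>     if (hwloc_topology_init(&topo) < 0 || hwloc_topology_load(topo) < 0)
>>         return 1;
>>
>>     /* Count sockets and cores across the whole machine, i.e. the
>>        m_socket and m_core values in Chansup's loadcheck output */
>>     printf("m_socket %d\n", hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET));
>>     printf("m_core   %d\n", hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE));
>>
>>     hwloc_topology_destroy(topo);
>>     return 0;
>> }
>>
>> On the Magny-Cours node above that should report 2 sockets and 24
>> cores, which is exactly what Chansup saw.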
>>
>>
>>> but the internal processor ids are reported in a redundant and misleading way.
>>
>> This is in fact my bad, but I think I know how to fix it :-D
>>
>> I will let you guys know when I have the fix, and I will post the new
>> version on the Open Grid Scheduler project page.
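>>
>> A rough sketch of one way to get that mapping right with hwloc 1.x
>> (illustration only, not the actual Open Grid Scheduler patch): report
>> the OS index of the PU sitting under each core, rather than a
>> per-socket core index that restarts at 0 on every socket.
>>
>> #include <stdio.h>
>> #include <hwloc.h>
>>
>> int main(void)
>> {
>>     hwloc_topology_t topo;
>>
>>     if (hwloc_topology_init(&topo) < 0 || hwloc_topology_load(topo) < 0)
>>         return 1;
>>
>>     int nsockets = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET);
>>     for (int s = 0; s < nsockets; s++) {
>>         hwloc_obj_t sock = hwloc_get_obj_by_type(topo, HWLOC_OBJ_SOCKET, s);
>>         int ncores = hwloc_get_nbobjs_inside_cpuset_by_type(topo,
>>                          sock->cpuset, HWLOC_OBJ_CORE);
>>         for (int c = 0; c < ncores; c++) {
>>             hwloc_obj_t core = hwloc_get_obj_inside_cpuset_by_type(topo,
>>                                    sock->cpuset, HWLOC_OBJ_CORE, c);
>>             /* The PU under this core carries the id the kernel uses
>>                for binding, so it does not restart at 0 per socket. */
>>             hwloc_obj_t pu = hwloc_get_obj_inside_cpuset_by_type(topo,
>>                                  core->cpuset, HWLOC_OBJ_PU, 0);
>>             printf("socket %d core %d -> %u\n", s, c,
>>                    pu ? pu->os_index : 0);
>>         }
>>     }
>>
>>     hwloc_topology_destroy(topo);
>>     return 0;
>> }
>>
>> On Chansup's box that should print 0-11 under socket 0 and 12-23 under
>> socket 1, matching the mapping he expected below.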
>>
>> Again, many thanks!!
>>
>> Rayson
>>
>>
>>
>>>
>>> # ./loadcheck
>>> arch            lx26-amd64
>>> num_proc        24
>>> m_socket        2
>>> m_core          24
>>> m_topology      SCCCCCCCCCCCCSCCCCCCCCCCCC
>>> load_short      24.14
>>> load_medium     24.00
>>> load_long       22.36
>>> mem_free        31241.601562M
>>> swap_free       2047.992188M
>>> virtual_free    33289.593750M
>>> mem_total       64562.503906M
>>> swap_total      2047.992188M
>>> virtual_total   66610.496094M
>>> mem_used        33320.902344M
>>> swap_used       0.000000M
>>> virtual_used    33320.902344M
>>> cpu             100.0%
>>>
>>> # ./loadcheck -cb
>>> Your SGE Linux version has built-in core binding functionality!
>>> Your Linux kernel version is: 2.6.27.10-grsec
>>> Amount of sockets:              2
>>> Amount of cores:                24
>>> Topology:                       SCCCCCCCCCCCCSCCCCCCCCCCCC
>>> Mapping of logical socket and core numbers to internal
>>> Internal processor ids for socket     0 core     0:      0
>>> Internal processor ids for socket     0 core     1:      1
>>> Internal processor ids for socket     0 core     2:      2
>>> Internal processor ids for socket     0 core     3:      3
>>> Internal processor ids for socket     0 core     4:      4
>>> Internal processor ids for socket     0 core     5:      5
>>> Internal processor ids for socket     0 core     6:      6
>>> Internal processor ids for socket     0 core     7:      7
>>> Internal processor ids for socket     0 core     8:      8
>>> Internal processor ids for socket     0 core     9:      9
>>> Internal processor ids for socket     0 core    10:     10
>>> Internal processor ids for socket     0 core    11:     11
>>> Internal processor ids for socket     0 core    12:     12
>>> Internal processor ids for socket     0 core    13:     13
>>> Internal processor ids for socket     0 core    14:     14
>>> Internal processor ids for socket     0 core    15:     15
>>> Internal processor ids for socket     0 core    16:     16
>>> Internal processor ids for socket     0 core    17:     17
>>> Internal processor ids for socket     0 core    18:     18
>>> Internal processor ids for socket     0 core    19:     19
>>> Internal processor ids for socket     0 core    20:     20
>>> Internal processor ids for socket     0 core    21:     21
>>> Internal processor ids for socket     0 core    22:     22
>>> Internal processor ids for socket     0 core    23:     23
>>> Internal processor ids for socket     1 core     0:      0
>>> Internal processor ids for socket     1 core     1:      1
>>> Internal processor ids for socket     1 core     2:      2
>>> Internal processor ids for socket     1 core     3:      3
>>> Internal processor ids for socket     1 core     4:      4
>>> Internal processor ids for socket     1 core     5:      5
>>> Internal processor ids for socket     1 core     6:      6
>>> Internal processor ids for socket     1 core     7:      7
>>> Internal processor ids for socket     1 core     8:      8
>>> Internal processor ids for socket     1 core     9:      9
>>> Internal processor ids for socket     1 core    10:     10
>>> Internal processor ids for socket     1 core    11:     11
>>> Internal processor ids for socket     1 core    12:     12
>>> Internal processor ids for socket     1 core    13:     13
>>> Internal processor ids for socket     1 core    14:     14
>>> Internal processor ids for socket     1 core    15:     15
>>> Internal processor ids for socket     1 core    16:     16
>>> Internal processor ids for socket     1 core    17:     17
>>> Internal processor ids for socket     1 core    18:     18
>>> Internal processor ids for socket     1 core    19:     19
>>> Internal processor ids for socket     1 core    20:     20
>>> Internal processor ids for socket     1 core    21:     21
>>> Internal processor ids for socket     1 core    22:     22
>>> Internal processor ids for socket     1 core    23:     23
>>>
>>> I would expect the following:
>>> Mapping of logical socket and core numbers to internal
>>> Internal processor ids for socket     0 core     0:      0
>>> Internal processor ids for socket     0 core     1:      1
>>> Internal processor ids for socket     0 core     2:      2
>>> Internal processor ids for socket     0 core     3:      3
>>> Internal processor ids for socket     0 core     4:      4
>>> Internal processor ids for socket     0 core     5:      5
>>> Internal processor ids for socket     0 core     6:      6
>>> Internal processor ids for socket     0 core     7:      7
>>> Internal processor ids for socket     0 core     8:      8
>>> Internal processor ids for socket     0 core     9:      9
>>> Internal processor ids for socket     0 core    10:     10
>>> Internal processor ids for socket     0 core    11:     11
>>> Internal processor ids for socket     1 core     0:     12
>>> Internal processor ids for socket     1 core     1:     13
>>> Internal processor ids for socket     1 core     2:     14
>>> Internal processor ids for socket     1 core     3:     15
>>> Internal processor ids for socket     1 core     4:     16
>>> Internal processor ids for socket     1 core     5:     17
>>> Internal processor ids for socket     1 core     6:     18
>>> Internal processor ids for socket     1 core     7:     19
>>> Internal processor ids for socket     1 core     8:     20
>>> Internal processor ids for socket     1 core     9:     21
>>> Internal processor ids for socket     1 core    10:     22
>>> Internal processor ids for socket     1 core    11:     23
>>>
>>> Any comments?
>>>
>>> thanks,
>>> - Chansup
>>>
>>> On Tue, Apr 12, 2011 at 4:13 PM, Rayson Ho <[email protected]> wrote:
>>>> Ansgar,
>>>>
>>>> We are in the final stages of hwloc migration, please give our new
>>>> hwloc enabled loadcheck a try:
>>>>
>>>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>>>
>>>> Rayson
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Mar 14, 2011 at 11:11 AM, Esztermann, Ansgar
>>>> <[email protected]> wrote:
>>>>>
>>>>> On Mar 12, 2011, at 1:04 , Dave Love wrote:
>>>>>
>>>>>> "Esztermann, Ansgar" <[email protected]> writes:
>>>>>>
>>>>>>> Well, core IDs are unique only within the same socket ID (for older 
>>>>>>> CPUs, say Harpertown), so I would assume the same holds for node IDs -- 
>>>>>>> it's just that node IDs aren't displayed for Magny-Cours.
>>>>>>
>>>>>> What exactly would you expect?  hwloc's lstopo(1) gives the following
>>>>>> under current RedHat 5 (Linux 2.6.18-238.5.1.el5) on a Supermicro H8DGT
>>>>>> (Opteron 6134).  It seems to have the information exposed, but I'm not
>>>>>> sure how it should be.  (I guess GE should move to hwloc rather than
>>>>>> PLPA, which is now deprecated and not maintained.)
>>>>>>
>>>>>> Machine (63GB)
>>>>>>  Socket #0 (32GB)
>>>>>>    NUMANode #0 (phys=0 16GB) + L3 #0 (5118KB)
>>>>>>      L2 #0 (512KB) + L1 #0 (64KB) + Core #0 + PU #0 (phys=0)
>>>>>>      L2 #1 (512KB) + L1 #1 (64KB) + Core #1 + PU #1 (phys=1)
>>>>>>      L2 #2 (512KB) + L1 #2 (64KB) + Core #2 + PU #2 (phys=2)
>>>>>>      L2 #3 (512KB) + L1 #3 (64KB) + Core #3 + PU #3 (phys=3)
>>>>>>    NUMANode #1 (phys=1 16GB) + L3 #1 (5118KB)
>>>>>>      L2 #4 (512KB) + L1 #4 (64KB) + Core #4 + PU #4 (phys=4)
>>>>>>      L2 #5 (512KB) + L1 #5 (64KB) + Core #5 + PU #5 (phys=5)
>>>>>>      L2 #6 (512KB) + L1 #6 (64KB) + Core #6 + PU #6 (phys=6)
>>>>>>      L2 #7 (512KB) + L1 #7 (64KB) + Core #7 + PU #7 (phys=7)
>>>>> ...
>>>>>
>>>>> That's exactly what I'd expect...
>>>>> The interface at /sys/devices/system/cpu/cpuN/topology/ doesn't know 
>>>>> about NUMANodes, only about Sockets and cores. Thus, cores #0 and #4 in 
>>>>> the output above have the same core ID, and SGE interprets that as being 
>>>>> one core with two threads.
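>>>>>
>>>>> On a node like this, cpu0 and cpu4 report the same
>>>>> (physical_package_id, core_id) pair, which is why a NUMA-unaware
>>>>> consumer sees one core with two threads. For illustration, reading
>>>>> that sysfs interface directly looks roughly like this (a sketch,
>>>>> not SGE or PLPA code):
>>>>>
>>>>> #include <stdio.h>
>>>>>
>>>>> /* Read one integer from /sys/devices/system/cpu/cpuN/topology/ */
>>>>> static int read_topo_value(int cpu, const char *name)
>>>>> {
>>>>>     char path[128];
>>>>>     int value = -1;
>>>>>     FILE *f;
>>>>>
>>>>>     snprintf(path, sizeof(path),
>>>>>              "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, name);
>>>>>     f = fopen(path, "r");
>>>>>     if (f) {
>>>>>         if (fscanf(f, "%d", &value) != 1)
>>>>>             value = -1;
>>>>>         fclose(f);
>>>>>     }
>>>>>     return value;
>>>>> }
>>>>>
>>>>> int main(void)
>>>>> {
>>>>>     /* First 8 CPUs, i.e. the PUs shown in the lstopo output above */
>>>>>     for (int cpu = 0; cpu < 8; cpu++)
>>>>>         printf("cpu%d: physical_package_id %d core_id %d\n", cpu,
>>>>>                read_topo_value(cpu, "physical_package_id"),
>>>>>                read_topo_value(cpu, "core_id"));
>>>>>     return 0;
>>>>> }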
>>>>>
>>>>>
>>>>> A.
>>>>> --
>>>>> Ansgar Esztermann
>>>>> DV-Systemadministration
>>>>> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>>>>>
>>>>>
>>>>
>>>
>>
>

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
