For those who had issues with the earlier version, please try the latest loadcheck v4:
http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html

I compiled the binary on Oracle Linux, which is compatible with RHEL 5.x, Scientific Linux, and CentOS 5.x. I tested the binary on the standard Red Hat kernel, on the Oracle-enhanced "Unbreakable Enterprise Kernel", on Fedora 13, and on Ubuntu 10.04 LTS.

Rayson

On Thu, Apr 14, 2011 at 8:28 AM, Rayson Ho <[email protected]> wrote:
> Hi Chansup,
>
> I think I fixed it last night, and I uploaded the loadcheck binary and
> updated the page:
> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>
> Or you can download it directly from:
> http://gridscheduler.sourceforge.net/projects/hwloc/loadcheckv2.tar.gz
>
> Again, thanks for the help, guys!!
>
> Rayson
>
>
> On Wed, Apr 13, 2011 at 11:38 AM, Rayson Ho <[email protected]> wrote:
>> On Wed, Apr 13, 2011 at 9:14 AM, CB <[email protected]> wrote:
>>> The number of sockets (two in total) and cores (24 in total) reported for
>>> a node with two 12-core Magny-Cours processors is correct,
>>
>> First of all, thanks Chansup, Ansgar, and Alex (who contacted me
>> offline) for testing the code!
>>
>> This is good, as the get_topology() code is correct, and hwloc is able
>> to handle the Magny-Cours topology.
>>
>>
>>> but the listing of the internal processor ids is redundant and misleading.
>>
>> This is in fact my bad, but I think I know how to fix it :-D
>>
>> I will let you guys know when I have the fix, and I will post the new
>> version on the Open Grid Scheduler project page.
>>
>> Again, many thanks!!
>>
>> Rayson
>>
>>
>>
>>>
>>> # ./loadcheck
>>> arch            lx26-amd64
>>> num_proc        24
>>> m_socket        2
>>> m_core          24
>>> m_topology      SCCCCCCCCCCCCSCCCCCCCCCCCC
>>> load_short      24.14
>>> load_medium     24.00
>>> load_long       22.36
>>> mem_free        31241.601562M
>>> swap_free       2047.992188M
>>> virtual_free    33289.593750M
>>> mem_total       64562.503906M
>>> swap_total      2047.992188M
>>> virtual_total   66610.496094M
>>> mem_used        33320.902344M
>>> swap_used       0.000000M
>>> virtual_used    33320.902344M
>>> cpu             100.0%
>>>
>>> # ./loadcheck -cb
>>> Your SGE Linux version has built-in core binding functionality!
>>> Your Linux kernel version is: 2.6.27.10-grsec
>>> Amount of sockets: 2
>>> Amount of cores: 24
>>> Topology: SCCCCCCCCCCCCSCCCCCCCCCCCC
>>> Mapping of logical socket and core numbers to internal
>>> Internal processor ids for socket 0 core 0: 0
>>> Internal processor ids for socket 0 core 1: 1
>>> Internal processor ids for socket 0 core 2: 2
>>> Internal processor ids for socket 0 core 3: 3
>>> Internal processor ids for socket 0 core 4: 4
>>> Internal processor ids for socket 0 core 5: 5
>>> Internal processor ids for socket 0 core 6: 6
>>> Internal processor ids for socket 0 core 7: 7
>>> Internal processor ids for socket 0 core 8: 8
>>> Internal processor ids for socket 0 core 9: 9
>>> Internal processor ids for socket 0 core 10: 10
>>> Internal processor ids for socket 0 core 11: 11
>>> Internal processor ids for socket 0 core 12: 12
>>> Internal processor ids for socket 0 core 13: 13
>>> Internal processor ids for socket 0 core 14: 14
>>> Internal processor ids for socket 0 core 15: 15
>>> Internal processor ids for socket 0 core 16: 16
>>> Internal processor ids for socket 0 core 17: 17
>>> Internal processor ids for socket 0 core 18: 18
>>> Internal processor ids for socket 0 core 19: 19
>>> Internal processor ids for socket 0 core 20: 20
>>> Internal processor ids for socket 0 core 21: 21
>>> Internal processor ids for socket 0 core 22: 22
>>> Internal processor ids for socket 0 core 23: 23
>>> Internal processor ids for socket 1 core 0: 0
>>> Internal processor ids for socket 1 core 1: 1
>>> Internal processor ids for socket 1 core 2: 2
>>> Internal processor ids for socket 1 core 3: 3
>>> Internal processor ids for socket 1 core 4: 4
>>> Internal processor ids for socket 1 core 5: 5
>>> Internal processor ids for socket 1 core 6: 6
>>> Internal processor ids for socket 1 core 7: 7
>>> Internal processor ids for socket 1 core 8: 8
>>> Internal processor ids for socket 1 core 9: 9
>>> Internal processor ids for socket 1 core 10: 10
>>> Internal processor ids for socket 1 core 11: 11
>>> Internal processor ids for socket 1 core 12: 12
>>> Internal processor ids for socket 1 core 13: 13
>>> Internal processor ids for socket 1 core 14: 14
>>> Internal processor ids for socket 1 core 15: 15
>>> Internal processor ids for socket 1 core 16: 16
>>> Internal processor ids for socket 1 core 17: 17
>>> Internal processor ids for socket 1 core 18: 18
>>> Internal processor ids for socket 1 core 19: 19
>>> Internal processor ids for socket 1 core 20: 20
>>> Internal processor ids for socket 1 core 21: 21
>>> Internal processor ids for socket 1 core 22: 22
>>> Internal processor ids for socket 1 core 23: 23
>>>
>>> I would expect the following:
>>> Mapping of logical socket and core numbers to internal
>>> Internal processor ids for socket 0 core 0: 0
>>> Internal processor ids for socket 0 core 1: 1
>>> Internal processor ids for socket 0 core 2: 2
>>> Internal processor ids for socket 0 core 3: 3
>>> Internal processor ids for socket 0 core 4: 4
>>> Internal processor ids for socket 0 core 5: 5
>>> Internal processor ids for socket 0 core 6: 6
>>> Internal processor ids for socket 0 core 7: 7
>>> Internal processor ids for socket 0 core 8: 8
>>> Internal processor ids for socket 0 core 9: 9
>>> Internal processor ids for socket 0 core 10: 10
>>> Internal processor ids for socket 0 core 11: 11
>>> Internal processor ids for socket 1 core 0: 12
>>> Internal processor ids for socket 1 core 1: 13
>>> Internal processor ids for socket 1 core 2: 14
>>> Internal processor ids for socket 1 core 3: 15
>>> Internal processor ids for socket 1 core 4: 16
>>> Internal processor ids for socket 1 core 5: 17
>>> Internal processor ids for socket 1 core 6: 18
>>> Internal processor ids for socket 1 core 7: 19
>>> Internal processor ids for socket 1 core 8: 20
>>> Internal processor ids for socket 1 core 9: 21
>>> Internal processor ids for socket 1 core 10: 22
>>> Internal processor ids for socket 1 core 11: 23
>>>
>>> Any comments?
>>>
>>> thanks,
>>> - Chansup
>>>
>>> On Tue, Apr 12, 2011 at 4:13 PM, Rayson Ho <[email protected]> wrote:
>>>> Ansgar,
>>>>
>>>> We are in the final stages of the hwloc migration, please give our new
>>>> hwloc-enabled loadcheck a try:
>>>>
>>>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>>>
>>>> Rayson
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Mar 14, 2011 at 11:11 AM, Esztermann, Ansgar
>>>> <[email protected]> wrote:
>>>>>
>>>>> On Mar 12, 2011, at 1:04 , Dave Love wrote:
>>>>>
>>>>>> "Esztermann, Ansgar" <[email protected]> writes:
>>>>>>
>>>>>>> Well, core IDs are unique only within the same socket ID (for older
>>>>>>> CPUs, say Harpertown), so I would assume the same holds for node IDs --
>>>>>>> it's just that node IDs aren't displayed for Magny-Cours.
>>>>>>
>>>>>> What exactly would you expect? hwloc's lstopo(1) gives the following
>>>>>> under current RedHat 5 (Linux 2.6.18-238.5.1.el5) on a Supermicro H8DGT
>>>>>> (Opteron 6134). It seems to have the information exposed, but I'm not
>>>>>> sure how it should be. (I guess GE should move to hwloc rather than
>>>>>> PLPA, which is now deprecated and not maintained.)
>>>>>>
>>>>>> Machine (63GB)
>>>>>>   Socket #0 (32GB)
>>>>>>     NUMANode #0 (phys=0 16GB) + L3 #0 (5118KB)
>>>>>>       L2 #0 (512KB) + L1 #0 (64KB) + Core #0 + PU #0 (phys=0)
>>>>>>       L2 #1 (512KB) + L1 #1 (64KB) + Core #1 + PU #1 (phys=1)
>>>>>>       L2 #2 (512KB) + L1 #2 (64KB) + Core #2 + PU #2 (phys=2)
>>>>>>       L2 #3 (512KB) + L1 #3 (64KB) + Core #3 + PU #3 (phys=3)
>>>>>>     NUMANode #1 (phys=1 16GB) + L3 #1 (5118KB)
>>>>>>       L2 #4 (512KB) + L1 #4 (64KB) + Core #4 + PU #4 (phys=4)
>>>>>>       L2 #5 (512KB) + L1 #5 (64KB) + Core #5 + PU #5 (phys=5)
>>>>>>       L2 #6 (512KB) + L1 #6 (64KB) + Core #6 + PU #6 (phys=6)
>>>>>>       L2 #7 (512KB) + L1 #7 (64KB) + Core #7 + PU #7 (phys=7)
>>>>> ...
>>>>>
>>>>> That's exactly what I'd expect...
>>>>> The interface at /sys/devices/system/cpu/cpuN/topology/ doesn't know
>>>>> about NUMANodes, only about Sockets and cores. Thus, cores #0 and #4 in
>>>>> the output above have the same core ID, and SGE interprets that as being
>>>>> one core with two threads.
>>>>>
>>>>>
>>>>> A.
>>>>> --
>>>>> Ansgar Esztermann
>>>>> DV-Systemadministration
>>>>> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>>>>>

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
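
[Editor's note: for readers wondering how the socket/core-to-processor-id mapping Chansup expected can be produced, here is a minimal, self-contained hwloc sketch using the hwloc 1.x API that was current at the time. It is illustrative only, not the actual loadcheck source, and the file name topo_map.c is made up for the example. It walks every socket, then every core below that socket, and prints the OS index of the core's first processing unit; on a node numbered like the one above this yields the continuous 0-23 ids rather than ids that restart at 0 on the second socket.]

/* topo_map.c -- illustrative sketch only, not the loadcheck source.
 * Walks sockets and cores with hwloc and prints each core's OS
 * processor id (the os_index of the core's first PU). */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    int s, c, nsockets, ncores;

    /* Discover the topology of the machine we are running on. */
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    nsockets = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_SOCKET);

    for (s = 0; s < nsockets; s++) {
        hwloc_obj_t socket = hwloc_get_obj_by_type(topology, HWLOC_OBJ_SOCKET, s);

        /* Only count and visit the cores that sit below this socket. */
        ncores = hwloc_get_nbobjs_inside_cpuset_by_type(topology,
                     socket->cpuset, HWLOC_OBJ_CORE);

        for (c = 0; c < ncores; c++) {
            hwloc_obj_t core = hwloc_get_obj_inside_cpuset_by_type(topology,
                                   socket->cpuset, HWLOC_OBJ_CORE, c);
            /* The first PU under the core carries the OS processor id. */
            hwloc_obj_t pu = hwloc_get_obj_inside_cpuset_by_type(topology,
                                 core->cpuset, HWLOC_OBJ_PU, 0);

            printf("Internal processor ids for socket %d core %d: %u\n",
                   s, c, pu ? pu->os_index : core->os_index);
        }
    }

    hwloc_topology_destroy(topology);
    return 0;
}

Compile with something like "gcc topo_map.c -o topo_map -lhwloc" and compare its output against "loadcheck -cb" on the same node.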
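[Editor's note: Ansgar's point about /sys/devices/system/cpu/cpuN/topology/ can be seen directly with an equally small sketch (again illustrative, not SGE or PLPA code; the file name sysfs_topo.c is made up). It prints the physical_package_id and core_id the kernel exposes for each logical CPU; per the thread above, on Magny-Cours that pair repeats across the two dies of one package, which is why code that identifies a core by those two values alone mistakes two distinct cores for one core with two threads.]

/* sysfs_topo.c -- illustrative sketch only.  Dumps the socket/core ids
 * the kernel exposes per logical CPU so the duplicate core_id values
 * described above can be seen directly. */
#include <stdio.h>

/* Read a single integer from a cpuN topology file; -1 if unreadable. */
static int read_topology_id(int cpu, const char *file)
{
    char path[128];
    int val = -1;
    FILE *fp;

    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/topology/%s", cpu, file);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &val) != 1)
        val = -1;
    fclose(fp);
    return val;
}

int main(void)
{
    int cpu;

    for (cpu = 0; ; cpu++) {
        int pkg  = read_topology_id(cpu, "physical_package_id");
        int core = read_topology_id(cpu, "core_id");

        if (pkg < 0)
            break;   /* no more CPUs with a topology directory */

        printf("cpu%d: physical_package_id=%d core_id=%d\n", cpu, pkg, core);
    }
    return 0;
}

Comparing this output with the lstopo listing above shows what the raw topology files are missing: the NUMANode level is exactly what disambiguates the repeated core ids, and it is the level hwloc adds on top of them.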
