Hi Chansup,

I think I fixed it last night, and I uploaded the loadcheck binary and
updated the page:

http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html

Or you can download it directly from:

http://gridscheduler.sourceforge.net/projects/hwloc/loadcheckv2.tar.gz

Again, thanks for the help guys!!

Rayson


On Wed, Apr 13, 2011 at 11:38 AM, Rayson Ho <[email protected]> wrote:
> On Wed, Apr 13, 2011 at 9:14 AM, CB <[email protected]> wrote:
>> The amount of sockets (total two) and cores (total 24) of two 12-core
>> magny-cour processor node is correct
>
> First of all, thanks Chansup, Ansgar, and Alex (who contacted me
> offline) for testing the code!
>
> This is good, as the get_topology() code is correct, and hwloc is able
> to handle the Magny-Cours topology.
>
>
>> but there is redundant and misleading description for interprocessor ids.
>
> This is in fact my bad, but I think I know how to fix it :-D
>
> I will let you guys know when I have the fix, and I will post the new
> version on the Open Grid Scheduler project page.
>
> Again, many thanks!!
>
> Rayson
>
>
>
>>
>> # ./loadcheck
>> arch            lx26-amd64
>> num_proc        24
>> m_socket        2
>> m_core          24
>> m_topology      SCCCCCCCCCCCCSCCCCCCCCCCCC
>> load_short      24.14
>> load_medium     24.00
>> load_long       22.36
>> mem_free        31241.601562M
>> swap_free       2047.992188M
>> virtual_free    33289.593750M
>> mem_total       64562.503906M
>> swap_total      2047.992188M
>> virtual_total   66610.496094M
>> mem_used        33320.902344M
>> swap_used       0.000000M
>> virtual_used    33320.902344M
>> cpu             100.0%
>>
>> # ./loadcheck -cb
>> Your SGE Linux version has built-in core binding functionality!
>> Your Linux kernel version is: 2.6.27.10-grsec
>> Amount of sockets:            2
>> Amount of cores:              24
>> Topology:                     SCCCCCCCCCCCCSCCCCCCCCCCCC
>> Mapping of logical socket and core numbers to internal
>> Internal processor ids for socket 0 core 0: 0
>> Internal processor ids for socket 0 core 1: 1
>> Internal processor ids for socket 0 core 2: 2
>> Internal processor ids for socket 0 core 3: 3
>> Internal processor ids for socket 0 core 4: 4
>> Internal processor ids for socket 0 core 5: 5
>> Internal processor ids for socket 0 core 6: 6
>> Internal processor ids for socket 0 core 7: 7
>> Internal processor ids for socket 0 core 8: 8
>> Internal processor ids for socket 0 core 9: 9
>> Internal processor ids for socket 0 core 10: 10
>> Internal processor ids for socket 0 core 11: 11
>> Internal processor ids for socket 0 core 12: 12
>> Internal processor ids for socket 0 core 13: 13
>> Internal processor ids for socket 0 core 14: 14
>> Internal processor ids for socket 0 core 15: 15
>> Internal processor ids for socket 0 core 16: 16
>> Internal processor ids for socket 0 core 17: 17
>> Internal processor ids for socket 0 core 18: 18
>> Internal processor ids for socket 0 core 19: 19
>> Internal processor ids for socket 0 core 20: 20
>> Internal processor ids for socket 0 core 21: 21
>> Internal processor ids for socket 0 core 22: 22
>> Internal processor ids for socket 0 core 23: 23
>> Internal processor ids for socket 1 core 0: 0
>> Internal processor ids for socket 1 core 1: 1
>> Internal processor ids for socket 1 core 2: 2
>> Internal processor ids for socket 1 core 3: 3
>> Internal processor ids for socket 1 core 4: 4
>> Internal processor ids for socket 1 core 5: 5
>> Internal processor ids for socket 1 core 6: 6
>> Internal processor ids for socket 1 core 7: 7
>> Internal processor ids for socket 1 core 8: 8
>> Internal processor ids for socket 1 core 9: 9
>> Internal processor ids for socket 1 core 10: 10
>> Internal processor ids for socket 1 core 11: 11
>> Internal processor ids for socket 1 core 12: 12
>> Internal processor ids for socket 1 core 13: 13
>> Internal processor ids for socket 1 core 14: 14
>> Internal processor ids for socket 1 core 15: 15
>> Internal processor ids for socket 1 core 16: 16
>> Internal processor ids for socket 1 core 17: 17
>> Internal processor ids for socket 1 core 18: 18
>> Internal processor ids for socket 1 core 19: 19
>> Internal processor ids for socket 1 core 20: 20
>> Internal processor ids for socket 1 core 21: 21
>> Internal processor ids for socket 1 core 22: 22
>> Internal processor ids for socket 1 core 23: 23
>>
>> I would expect the following:
>> Mapping of logical socket and core numbers to internal
>> Internal processor ids for socket 0 core 0: 0
>> Internal processor ids for socket 0 core 1: 1
>> Internal processor ids for socket 0 core 2: 2
>> Internal processor ids for socket 0 core 3: 3
>> Internal processor ids for socket 0 core 4: 4
>> Internal processor ids for socket 0 core 5: 5
>> Internal processor ids for socket 0 core 6: 6
>> Internal processor ids for socket 0 core 7: 7
>> Internal processor ids for socket 0 core 8: 8
>> Internal processor ids for socket 0 core 9: 9
>> Internal processor ids for socket 0 core 10: 10
>> Internal processor ids for socket 0 core 11: 11
>> Internal processor ids for socket 1 core 0: 12
>> Internal processor ids for socket 1 core 1: 13
>> Internal processor ids for socket 1 core 2: 14
>> Internal processor ids for socket 1 core 3: 15
>> Internal processor ids for socket 1 core 4: 16
>> Internal processor ids for socket 1 core 5: 17
>> Internal processor ids for socket 1 core 6: 18
>> Internal processor ids for socket 1 core 7: 19
>> Internal processor ids for socket 1 core 8: 20
>> Internal processor ids for socket 1 core 9: 21
>> Internal processor ids for socket 1 core 10: 22
>> Internal processor ids for socket 1 core 11: 23
>>
>> Any comments?
>>
>> thanks,
>> - Chansup
>>
>> On Tue, Apr 12, 2011 at 4:13 PM, Rayson Ho <[email protected]> wrote:
>>> Ansgar,
>>>
>>> We are in the final stages of hwloc migration, please give our new
>>> hwloc enabled loadcheck a try:
>>>
>>> http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
>>>
>>> Rayson
>>>
>>>
>>>
>>>
>>> On Mon, Mar 14, 2011 at 11:11 AM, Esztermann, Ansgar
>>> <[email protected]> wrote:
>>>>
>>>> On Mar 12, 2011, at 1:04 , Dave Love wrote:
>>>>
>>>>> "Esztermann, Ansgar" <[email protected]> writes:
>>>>>
>>>>>> Well, core IDs are unique only within the same socket ID (for older
>>>>>> CPUs, say Harpertown), so I would assume the same holds for node IDs --
>>>>>> it's just that node IDs aren't displayed for Magny-Cours.
>>>>>
>>>>> What exactly would you expect?  hwloc's lstopo(1) gives the following
>>>>> under current RedHat 5 (Linux 2.6.18-238.5.1.el5) on a Supermicro H8DGT
>>>>> (Opteron 6134).  It seems to have the information exposed, but I'm not
>>>>> sure how it should be.  (I guess GE should move to hwloc rather than
>>>>> PLPA, which is now deprecated and not maintained.)
>>>>>
>>>>> Machine (63GB)
>>>>>   Socket #0 (32GB)
>>>>>     NUMANode #0 (phys=0 16GB) + L3 #0 (5118KB)
>>>>>       L2 #0 (512KB) + L1 #0 (64KB) + Core #0 + PU #0 (phys=0)
>>>>>       L2 #1 (512KB) + L1 #1 (64KB) + Core #1 + PU #1 (phys=1)
>>>>>       L2 #2 (512KB) + L1 #2 (64KB) + Core #2 + PU #2 (phys=2)
>>>>>       L2 #3 (512KB) + L1 #3 (64KB) + Core #3 + PU #3 (phys=3)
>>>>>     NUMANode #1 (phys=1 16GB) + L3 #1 (5118KB)
>>>>>       L2 #4 (512KB) + L1 #4 (64KB) + Core #4 + PU #4 (phys=4)
>>>>>       L2 #5 (512KB) + L1 #5 (64KB) + Core #5 + PU #5 (phys=5)
>>>>>       L2 #6 (512KB) + L1 #6 (64KB) + Core #6 + PU #6 (phys=6)
>>>>>       L2 #7 (512KB) + L1 #7 (64KB) + Core #7 + PU #7 (phys=7)
>>>> ...
>>>>
>>>> That's exactly what I'd expect...
>>>> The interface at /sys/devices/system/cpu/cpuN/topology/ doesn't know about
>>>> NUMANodes, only about Sockets and cores. Thus, cores #0 and #4 in the
>>>> output above have the same core ID, and SGE interprets that as being one
>>>> core with two threads.
>>>>
>>>>
>>>> A.
>>>> --
>>>> Ansgar Esztermann
>>>> DV-Systemadministration
>>>> Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> [email protected]
>>>> https://gridengine.org/mailman/listinfo/users
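
The two points discussed in this thread can be sketched in a few lines of
Python. This is only an illustration, not the loadcheck or hwloc code.
The helper names expected_mapping and group_by_sysfs_ids are hypothetical.
The first helper assumes that processor ids simply follow the order of the
cores in the topology string, as in Chansup's expected output; real ids
come from the OS and may be numbered differently. The second helper is fed
made-up (cpu, package id, core id) readings: only the collision between
cores #0 and #4 comes from Ansgar's description, the other core ids are
assumed for the example.

```python
from collections import defaultdict


def expected_mapping(topology):
    """Map (socket, core) to a processor id, numbering cores in string order.

    Assumption: ids are handed out sequentially, as in Chansup's expected
    output. Only the characters 'S' and 'C' seen in this thread are handled.
    """
    mapping = {}
    socket, core, next_id = -1, -1, 0
    for ch in topology:
        if ch == "S":
            # A new socket starts, so logical core numbering restarts at 0.
            socket += 1
            core = -1
        elif ch == "C":
            if socket < 0:
                raise ValueError("core listed before any socket")
            core += 1
            mapping[(socket, core)] = next_id
            next_id += 1
        else:
            raise ValueError(f"unsupported topology character: {ch!r}")
    return mapping


def group_by_sysfs_ids(cpus):
    """Group processor ids by (package id, core id), ignoring NUMA nodes.

    Processors that end up in the same group look like threads of one core.
    """
    groups = defaultdict(list)
    for cpu, package_id, core_id in cpus:
        groups[(package_id, core_id)].append(cpu)
    return dict(groups)


if __name__ == "__main__":
    mapping = expected_mapping("SCCCCCCCCCCCCSCCCCCCCCCCCC")
    for (socket, core), proc in sorted(mapping.items()):
        print(f"Internal processor ids for socket {socket} core {core}: {proc}")

    # Hypothetical readings for one socket with two NUMA nodes of four
    # cores each, where the core id restarts in the second node.
    cpus = [(cpu, 0, cpu % 4) for cpu in range(8)]
    print(group_by_sysfs_ids(cpus))
```

With the topology string from the loadcheck output, the first helper
yields 12 cores per socket and ids 12 to 23 on socket 1. With the
hypothetical readings, processors 0 and 4 land in the same
(package id, core id) group, which is the "one core with two threads"
misreading Ansgar describes.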
