On 28/07/2010 18:09, Bernd Kallies wrote:
> Is attached. I also checked for cpusets. I ran lstopo and
> gather_topology from the root cpuset, which is the only cpuset and
> contains cpus 0-767 and mems 0-47, that is - the whole machine.
>
> Background info: The UltraViolet architecture is new. There exists a
> white paper about this at http://www.sgi.com/pdfs/4192.pdf
> We have one UV rack, which is filled with 3/4 of the max. number of
> blades. According to the specs, two NUMA nodes form one "blade". This
> level corresponds to "Group0" in the hwloc topology. Two blades are
> cross-linked via the NUMAlink, forming "paired nodes" = "Group1". What
> "Group2" might correspond to - I don't know.

We group by distance, so it looks like something is tagging these nodes
as closer, and hwloc makes them a new group level.

> "Group3" corresponds to one
> "chassis" or IRU. "Group4" may be an "enclosure", and "Machine" is the
> "rack".
>
> From my opinion the hwloc topology for our machine should contain 2x
> Group4. The 1st should contain 2x Group3, the 2nd one 1x Group3. lstopo
> shows 1x Group4 containing 3x Group3, instead.
>

Actually no, but it's very hard to see :)
  lstopo - | egrep "(NUMA|Group)"
shows that Group4#0 only contains Group3#0 and #1. Group3#2 is directly
a child of the machine (the indentation is smaller).

Open a *big* terminal window and look at the distance matrix:
  $ cat /sys/devices/system/node/node{?,??}/distance
(I am not copy/pasting it here, it's too big :))

hwloc groups objects that have smaller distances and then computes the
distances between groups (the average of the distances between the
objects in each group). We get:

Distance matrix between Group0 objects:
13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66
22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64
24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62
26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60
28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58
30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56
32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52
36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50
38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46 48
40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44 46
42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42 44
44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40 42
46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38 40
48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36 38
50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34 36
52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32 34
54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30 32
56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28 30
58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26 28
60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24 26
62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22 24
64 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13 22
66 64 62 60 58 56 54 52 50 48 46 44 42 40 38 36 34 32 30 28 26 24 22 13

Between Group1:
17 24 28 32 36 40 44 48 52 56 60 64
24 17 24 28 32 36 40 44 48 52 56 60
28 24 17 24 28 32 36 40 44 48 52 56
32 28 24 17 24 28 32 36 40 44 48 52
36 32 28 24 17 24 28 32 36 40 44 48
40 36 32 28 24 17 24 28 32 36 40 44
44 40 36 32 28 24 17 24 28 32 36 40
48 44 40 36 32 28 24 17 24 28 32 36
52 48 44 40 36 32 28 24 17 24 28 32
56 52 48 44 40 36 32 28 24 17 24 28
60 56 52 48 44 40 36 32 28 24 17 24
64 60 56 52 48 44 40 36 32 28 24 17

Group2:
20 28 36 44 52 60
28 20 28 36 44 52
36 28 20 28 36 44
44 36 28 20 28 36
52 44 36 28 20 28
60 52 44 36 28 20

Group3:
24 36 52
36 24 36
52 36 24

The way I am reading this is: IRU#1 is close to IRU#0 and #2, but #0 and
#2 are far away from each other. So I don't think we can group 2 IRUs
and keep a third one on the side as you said. How would you group these?

That said, something is going wrong with the grouping code. Right now,
it should keep 3 Group3 under the machine. I am looking at it.

Brice
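
To illustrate the averaging step described above, here is a minimal Python sketch. It is not hwloc's actual grouping code. It assumes that each new group is a pair of consecutive objects and that averages are truncated to integers; both are assumptions, chosen because they happen to reproduce the Group1, Group2 and Group3 matrices posted above. The helper names (`toeplitz`, `pair_neighbours`, `average_groups`) are hypothetical.

```python
def toeplitz(first_row):
    """Build a symmetric matrix whose entry (i, j) depends only on |i - j|."""
    n = len(first_row)
    return [[first_row[abs(i - j)] for j in range(n)] for i in range(n)]


def pair_neighbours(n):
    """Assumed grouping: objects (0,1), (2,3), ... form one group each."""
    return [[i, i + 1] for i in range(0, n, 2)]


def average_groups(dist, groups):
    """Distance between two groups = truncated mean of member distances."""
    out = []
    for a in groups:
        row = []
        for b in groups:
            total = sum(dist[i][j] for i in a for j in b)
            row.append(total // (len(a) * len(b)))  # integer truncation (assumed)
        out.append(row)
    return out


# First row of the Group0 matrix quoted in the mail: 13 22 24 ... 66
group0 = toeplitz([13] + list(range(22, 68, 2)))

group1 = average_groups(group0, pair_neighbours(len(group0)))
group2 = average_groups(group1, pair_neighbours(len(group1)))
group3 = average_groups(group2, pair_neighbours(len(group2)))

for row in group3:
    print(" ".join(str(v) for v in row))
```

In the resulting 3x3 matrix, the middle object is equally close to both of its neighbours while the two outer objects are further apart, which is the situation that makes a "2 + 1" split of the three IRUs hard to justify from distances alone.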