Re: [hwloc-devel] hwloc-bind syntax
On Dec 4, 2009, at 5:36 AM, Brice Goglin wrote:

> > It might be good to safely ignore 0x if it's present, but that's a small
> > feature enhancement that can be done at any time (I filed a future ticket).
>
> It seems to work actually :)

Hmm -- I don't think so...? "0x1" can't pass this test in hwloc_mask_process_arg():

    } else if (strlen(arg) == strspn(arg, "0123456789abcdefABCDEF,")) {

In my tests, it's falling through to the "err = -1" case, but just not printing out an error. Even more fun -- note the lack of error shown, and the lack of "ls" output, except for when we specify -v:

    [8:33] rtp-jsquyres-8711:~/svn/hwloc % ./utils/hwloc-bind 0x1 ls
    [8:33] rtp-jsquyres-8711:~/svn/hwloc % ./utils/hwloc-bind -v 0x1 ls
    assuming the command starts at 0x1
    execvp: No such file or directory

I think that if execvp() fails, we should *always* print an error, not just if -v was specified. I'll fix.

> > Linux is likely to be among the most popular target for hwloc -- so can you
> > explain in good words definitions for the following:

[snipped]

Thanks.

> > Additionally -- the word "father" is used in the docs. Should we use the
> > gender-neutral "parent" instead?
>
> I am not sure. The object structure contains a father pointer. We use
> parent in the API, but it might refer to different things, like father,
> grandfather, ...

FWIW, the English word "parent" definitely refers to the immediate ancestor. It does *not* refer to grandparents or great-grandparents, etc.

> > What I meant by my question was -- aren't the 3 diagrams above equivalent
> > to "core:6"? If so, what's the value of the foo.bar.baz notation?
>
> If you have a 96 core machine like we do, the hierarchical notation
> (foo.bar.baz) is really nice. If I want to bind on
> node:2.socket:3.core:4, it's much easier than looking at the topology
> and finding that it's core:70.

Ah, ok. Fair enough.

> Using physical or logical indexes doesn't
> change anything here. I agree that we don't do that often in real
> applications, but I actually use that quite a lot for my own debugging :)

Another good reason. :-)

> I actually don't see why people would like to use physical numbers in
> such a hierarchical notation since physical socket/core numbers are
> often strange/illogical and nobody remembers them. However, I agree that
> the physical indexes are useful when *not* using a hierarchical
> notation, ie I want to bind on thread OS index #46.

As a server vendor, using physical/OS indexes is actually quite useful to me (e.g., to ensure that the hardware and OS are playing nicely).

My point is that everyone has a different view here -- we should just support both. IMHO, the common case is logical indexes -- so let's make those the default. But there are definitely cases where physical indexes are useful as well.

--
Jeff Squyres
jsquy...@cisco.com
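A minimal sketch of why "0x1" fails the character-set test quoted above, and of one way a leading "0x" could be skipped before the test runs. Only the strlen()/strspn() comparison comes from the thread; looks_like_cpuset() and skip_hex_prefix() are hypothetical names for illustration and are not hwloc functions.

```c
#include <assert.h>
#include <string.h>

/* The test quoted from hwloc_mask_process_arg(): nonzero when every
 * character of arg is a hex digit or a comma. */
static int looks_like_cpuset(const char *arg)
{
    assert(arg != NULL);
    return strlen(arg) == strspn(arg, "0123456789abcdefABCDEF,");
}

/* Hypothetical helper (not hwloc code): skip a single leading "0x" or
 * "0X" so that the remainder can go through the test above. */
static const char *skip_hex_prefix(const char *arg)
{
    assert(arg != NULL);
    if (arg[0] == '0' && (arg[1] == 'x' || arg[1] == 'X'))
        return arg + 2;
    return arg;
}
```

With these helpers, looks_like_cpuset("0x1") is false because strspn() stops at the 'x', while looks_like_cpuset(skip_hex_prefix("0x1")) is true. The sketch only strips a prefix at the very start of the argument; a prefix on a later comma-separated word would still be rejected.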
Re: [hwloc-devel] hwloc-bind syntax
On Dec 4, 2009, at 5:32 AM, Ashley Pittman wrote:

> > It might be good to safely ignore 0x if it's present, but that's a small
> > feature enhancement that can be done at any time (I filed a future ticket).
>
> Maybe not relevant but it bit me so I'll say it here, using "%x" with
> sscanf on a string of "0x1" will match the whole thing and give a value
> of 1 on Linux but on Solaris it'll match the "0" as a hex value of 0 and
> not match the "x1" at all leading to further errors in subsequent
> matches as well. The most annoying thing is that sscanf() thinks it's
> matched and its return code will be set accordingly.

Yuck!

Thankfully, we don't appear to be using sscanf() to convert the cpuset strings.

--
Jeff Squyres
jsquy...@cisco.com
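One way to avoid the silent partial match described above is to convert with strtoul() and check where parsing stopped. This is a sketch only: parse_hex_word() is a hypothetical helper, not something hwloc provides, and it relies on standard C behaviour, where base 16 accepts an optional "0x"/"0X" prefix.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not part of hwloc): parse one hexadecimal word.
 * Returns 0 and stores the value in *out on success. Returns -1 if
 * nothing was converted, if characters are left over after the number,
 * or if the value is out of range for unsigned long. */
static int parse_hex_word(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtoul(s, &end, 16);   /* base 16 allows an optional 0x prefix */
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;              /* unlike sscanf, leftovers are visible */
    *out = v;
    return 0;
}
```

Because the end pointer is checked, an input such as "0x1zz" is reported as an error instead of being accepted as a shorter number. Note that strtoul() also tolerates leading whitespace and a sign, so a stricter parser would need to reject those separately.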
Re: [hwloc-devel] hwloc-bind syntax
Jeff Squyres wrote:

> It might be good to safely ignore 0x if it's present, but that's a small
> feature enhancement that can be done at any time (I filed a future ticket).
>

It seems to work actually :)

>> We might want to drop the Linux "cpuset" word and use "cgroup" instead.
>> Both are supported by Linux, but the latter now contains the former and
>> more, so people are supposed to use cgroup now. hwloc supports both.
>>
>
> Linux is likely to be among the most popular target for hwloc -- so can you
> explain in good words definitions for the following:
>
> - hwloc cpuset
>

Opaque structure describing a set of logical processors. Each hwloc object structure contains a cpuset field that describes which logical processors are contained in the corresponding physical object. hwloc cpusets are used by hwloc binding routines.

> - Linux cpuset
> - Linux cgroup
>

See http://www.mjmwired.net/kernel/Documentation/cgroups.txt, and look for cpusets in there:

    Control Groups provide a mechanism for aggregating/partitioning sets of
    tasks, and all their future children, into hierarchical groups with
    specialized behaviour.
    [...]
    On their own, the only use for cgroups is for simple job tracking. The
    intention is that other subsystems hook into the generic cgroup support
    to provide new attributes for cgroups, such as accounting/limiting the
    resources which processes in a cgroup can access. For example, cpusets
    allows you to associate a set of CPUs and a set of memory nodes with
    the tasks in each cgroup.

> Additionally -- the word "father" is used in the docs. Should we use the
> gender-neutral "parent" instead?
>

I am not sure. The object structure contains a father pointer. We use parent in the API, but it might refer to different things, like father, grandfather, ...

>> You don't care about starting with system or something else. You can
>> ignore the system level as you could ignore the socket level between
>> nodes and cores.
>>
>> If you have 1 system with 2 nodes with 2 sockets each with 2 cores each,
>> you get:
>> node:1 core:2 is equivalent to system:0 node:1 socket:2 core:0 and
>> equivalent to system:0 core:6
>>
>
> Did you mean:
>
> node:1.core:2 == system:0.node:1.socket:2.core:0 == system:0.core:6
>
> ?
>

Yes.

> What I meant by my question was -- aren't the 3 diagrams above equivalent to
> "core:6"? If so, what's the value of the foo.bar.baz notation?

If you have a 96 core machine like we do, the hierarchical notation (foo.bar.baz) is really nice. If I want to bind on node:2.socket:3.core:4, it's much easier than looking at the topology and finding that it's core:70. Using physical or logical indexes doesn't change anything here. I agree that we don't do that often in real applications, but I actually use that quite a lot for my own debugging :)

I actually don't see why people would like to use physical numbers in such a hierarchical notation since physical socket/core numbers are often strange/illogical and nobody remembers them. However, I agree that the physical indexes are useful when *not* using a hierarchical notation, ie I want to bind on thread OS index #46.

Brice
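The equivalences above (node:1.core:2 == core:6, and node:2.socket:3.core:4 == core:70) can be checked with a little arithmetic. The sketch below assumes a perfectly symmetric topology in which every object of a given type contains the same number of cores. The 4 nodes x 4 sockets x 6 cores layout used for the 96-core machine is an assumption that happens to be consistent with core:70; the thread does not state the actual layout. struct step and flat_core_index() are illustrative names, not hwloc API.

```c
#include <assert.h>
#include <stddef.h>

/* One component of a hierarchical location such as node:1.core:2. */
struct step {
    unsigned index;        /* logical index relative to the enclosing object */
    unsigned cores_below;  /* cores contained in one object of this type */
};

/* Flatten a hierarchical path into a machine-wide logical core index.
 * Only valid for symmetric topologies (an assumption of this sketch). */
static unsigned flat_core_index(const struct step *path, size_t n)
{
    unsigned flat = 0;
    size_t i;

    assert(path != NULL);
    for (i = 0; i < n; i++)
        flat += path[i].index * path[i].cores_below;
    return flat;
}
```

For the 2 x 2 x 2 machine, node:1.core:2 is 1*4 + 2*1 = 6. For the assumed 4 x 4 x 6 machine, node:2.socket:3.core:4 is 2*24 + 3*6 + 4*1 = 70. On a real machine the topology may be asymmetric, which is exactly when looking the object up in the tree beats doing the arithmetic by hand.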
Re: [hwloc-devel] hwloc-bind syntax
On Thu, 2009-12-03 at 20:32 -0500, Jeff Squyres wrote:

> > > Ah, ok. To be clear, is it accurate to say that it is one of the
> > > following forms:
> > >
> > > - a hex number (without leading "0x" -- would "0x" be ignored if it is
> > > supplied?)
> >
> > We never used 0x there.
>
> Ok.
>
> It might be good to safely ignore 0x if it's present, but that's a small
> feature enhancement that can be done at any time (I filed a future ticket).

Maybe not relevant but it bit me so I'll say it here: using "%x" with sscanf on a string of "0x1" will match the whole thing and give a value of 1 on Linux, but on Solaris it'll match the "0" as a hex value of 0 and not match the "x1" at all, leading to further errors in subsequent matches as well. The most annoying thing is that sscanf() thinks it's matched and its return code will be set accordingly.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Re: [hwloc-devel] hwloc-bind syntax
On Dec 3, 2009, at 12:26 PM, Brice Goglin wrote:

> > (shouldn't that say hwloc-bind, not topobind?)
>
> Right :)

Easily fixed -- just done. :-)

> > That would seem useful (slightly shorter than "proc:0.proc:1.proc:4"). I
> > can file a feature request if it's not already supported.
>
> Actually, it would be proc:0 proc:1 proc:4 (space separated).
> hwloc-bind/mask do a logical/cpuset OR of all objects/masks given on the
> command-line.

Ah -- I see from your explanation below that foo.bar.baz is different than foo bar baz.

I haven't looked at the argv parsing -- does it just strcmp each of the argv's and look for a recognized prefix, and if so, assume that it is a specification? If it doesn't find a recognized prefix, it assumes that it's the first argv of the tokens to exec (and therefore stops examining argv)? FWIW, this is pretty much what mpirun does. Is "--" recognized, too?

(I'm now asking for more detail because I intend to document this stuff properly ;-) )

> > 2. What does it mean to "hwloc-bind core:0 ..."? (I asked Samuel this in
> > IM as well, but I didn't understand his answer). *Which* "core 0" does
> > that refer to? For example, an abbreviated version of my lstopo output is
> > as follows (it's a pre-production EX machine -- I can't share all the
> > details -- I 'x'ed out some of the numerical values):
> >
> > -
> > System(xxxGB)
> >   Node#0(xxxGB) + Socket#0 + L3(xxxMB)
> >     L2(xxxKB) + L1(xxxKB) + Core#0 + P#0
> >     ...
> >   Node#1(xxxGB) + Socket#2 + L3(xxxMB)
> >     L2(xxxKB) + L1(xxxKB) + Core#0 + P#1
> >     ...
> > -
> >
> > The processors have unique numbers, but the cores do not. Is that a bug?
>
> These are physical/OS indexes, not logical indexes.
>
> hwloc-bind/mask takes logical indexes, so it has nothing to do with the
> above #N. core:1 means "the second Core object" when you read the above
> output from top to bottom.

Hmm. That's very confusing.

FWIW: we went round and round (and round and round and round and ...) in deciding whether to use physical/OS indexing or logical indexing in Open MPI. We finally decided that users only care about logical indexing -- we hid all physical/OS indexing values under the covers.

Hwloc, obviously, is a bit different. More below.

> > 3. What is the difference between "system" and "machine"?
>
> Machine is a physical machine. System may be different in case of
> Single System Image like Kerrighed, vSMP, ... (only Kerrighed is
> supported so far).

Do we have good descriptions for each of the scope names that can be put in the docs? hwloc-mask shows the following names:

    system, machine, node, socket, core, proc[essor]

Has anyone contacted Penguin and/or XHPC (and/or any other SSI projects) to see if they care about being supported by hwloc?

--> This is a good point to support my dynamic SSO plugin idea. ;-)

> > 4. What exactly does "index" refer to -- is it a virtual index (e.g.,
> > hwloc's numbering of 0-N) or is it the OS's index? I thought we used OS
> > index numbering, but #2 confuses me -- if #2 is just a bug, then perhaps
> > this question is moot. :-)
>
> We use virtual/logical/OS index everywhere, except in the lstopo output
> and in the functions that contain os_index in their prototype.

Hmm - I can't parse that. You seem to be equating logical == virtual == OS indexing in that statement, but you distinctly called OS and logical indexing different in text higher up in this reply...

Regardless, I find this confusing -- I'm quite sure that newbies will also find it confusing. All of hwloc should default to one form of indexing (regardless of whether it's physical/OS or some form of logical/hwloc-imposed indexing) -- and/or be explicit about which kind of indexing is used in every case.

To be clear: it's strange to me that you can't use the numbers in the output from lstopo as arguments to hwloc-bind. I think that this will be quite a common / useful usage pattern: look up your machine's topology with lstopo and then hwloc-bind a command to something that you see in the lstopo output.

At a minimum, I would think that all the CLI commands should default to the same kind of indexing to prevent confusion. Perhaps hwloc CLI tools should be able to show/accept *both* kinds of indexing...? E.g.:

    lstopo --physical
    lstopo --logical
    hwloc-bind --physical ...
    hwloc-bind --logical ...

> > 5. What exactly is a "cpuset string"? Can some examples be provided?
>
> It's 0 for nothing, for 32procs, 11 for the first
> and the 257th processors. It's a comma separated list of 32-bit bitmasks.

Ah, ok. To be clear, is it accurate to say that it is one of the following forms:

- a hex number (without leading "0x" -- would "0x" be ignored if it is supplied?)
- a comma-delimited set of 32-bit bitmasks where MSB 0's do not have to be listed

> > --> Sidenote: I actually find hwloc's use of the word "cpuset" to be quite
> > confusing
Re: [hwloc-devel] hwloc-bind syntax
Jeff Squyres wrote:

> I was trying to use hwloc-bind this morning, and I was a bit confused by the
> syntax. I see that the help message says:
>
> -
> Usage: topobind [options] -- command ...
> may be a space-separated list of cpusets or objects
> as supported by the hwloc-mask utility.
> -
>
> (shouldn't that say hwloc-bind, not topobind?)
>

Right :)

> I assume the here in hwloc-mask is the same as the in
> hwloc-bind.
>

Yes.

> 1. Is the index syntax "X,Y[,Z[...]]" supported? I don't see it on the list,
> but was curious if it is supported anyway. E.g., "proc:0,1,4".

No, I don't think it's supported right now.

> That would seem useful (slightly shorter than "proc:0.proc:1.proc:4"). I
> can file a feature request if it's not already supported.
>

Actually, it would be proc:0 proc:1 proc:4 (space separated). hwloc-bind/mask do a logical/cpuset OR of all objects/masks given on the command-line.

> 2. What does it mean to "hwloc-bind core:0 ..."? (I asked Samuel this in IM
> as well, but I didn't understand his answer). *Which* "core 0" does that
> refer to? For example, an abbreviated version of my lstopo output is as
> follows (it's a pre-production EX machine -- I can't share all the details --
> I 'x'ed out some of the numerical values):
>
> -
> System(xxxGB)
>   Node#0(xxxGB) + Socket#0 + L3(xxxMB)
>     L2(xxxKB) + L1(xxxKB) + Core#0 + P#0
>     ...
>   Node#1(xxxGB) + Socket#2 + L3(xxxMB)
>     L2(xxxKB) + L1(xxxKB) + Core#0 + P#1
>     ...
> -
>
> The processors have unique numbers, but the cores do not. Is that a bug?
>

These are physical/OS indexes, not logical indexes.

hwloc-bind/mask takes logical indexes, so it has nothing to do with the above #N. core:1 means "the second Core object" when you read the above output from top to bottom.

> 3. What is the difference between "system" and "machine"?
>

Machine is a physical machine. System may be different in case of Single System Image like Kerrighed, vSMP, ... (only Kerrighed is supported so far).

> 4. What exactly does "index" refer to -- is it a virtual index (e.g., hwloc's
> numbering of 0-N) or is it the OS's index? I thought we used OS index
> numbering, but #2 confuses me -- if #2 is just a bug, then perhaps this
> question is moot. :-)
>

We use virtual/logical/OS index everywhere, except in the lstopo output and in the functions that contain os_index in their prototype.

> 5. What exactly is a "cpuset string"? Can some examples be provided?
>

It's 0 for nothing, for 32procs, 11 for the first and the 257th processors. It's a comma separated list of 32-bit bitmasks.

> --> Sidenote: I actually find hwloc's use of the word "cpuset" to be quite
> confusing because it is *NOT* the same as an OS cpuset.

The structure might be a bit different, but it is conceptually the same as the OS cpuset. When bit N is set in a hwloc cpuset, it means we are talking about the processor whose *OS-index* is N.

> 6. "several may be concatenated with `.'..." Does that mean
> that this is legal:
>
> core:0.node:2.system:4
>
> If so, what exactly does it mean when they overlap? Is it simply the union
> of those 3 specifications?

It means 5th logical system below 3rd logical node below first core. So it means nothing when there are no node objects below cores or no systems below nodes.

> Also, I'm curious -- why was a period chosen as the delimiter instead of a
> comma? Is this a Europe-vs-US thing? (i.e., in the US, we typically use
> commas for lists -- is it different in Europe?)
>

We use commas for lists in Europe too. But the above is not a list, it's an inclusion. See it as core[0].node[2].system[4] in C language.

Brice
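To make the cpuset string description concrete, here is a sketch that tests whether a given OS index is set in a comma-separated list of 32-bit hex masks. Two things are assumptions, not statements from the thread: the leftmost word is taken to be the most significant one (as in Linux's cpumask strings), and every word is expected to be written out (empty words are rejected). mask_string_isset() is a hypothetical helper and not the hwloc API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not the hwloc API). Returns 1 if the bit for
 * os_index is set in the mask string, 0 if it is clear, and -1 if a
 * word is empty, has stray characters, or does not fit in 32 bits. */
static int mask_string_isset(const char *str, unsigned os_index)
{
    size_t nwords = 1, word = 0;
    const char *p;
    int result = 0;

    assert(str != NULL);
    for (p = str; *p != '\0'; p++)      /* count comma-separated words */
        if (*p == ',')
            nwords++;

    for (p = str; ; word++) {
        char *end;
        unsigned long v = strtoul(p, &end, 16);

        if (end == p || v > 0xffffffffUL || (*end != ',' && *end != '\0'))
            return -1;
        /* Assumed ordering: leftmost word holds the highest bits. */
        if (nwords - 1 - word == os_index / 32)
            result = (int)((v >> (os_index % 32)) & 1UL);
        if (*end == '\0')
            break;
        p = end + 1;
    }
    return result;  /* stays 0 when os_index is above the listed words */
}
```

Under these assumptions, "ffffffff" covers OS indexes 0 through 31, and a nine-word string whose first and last words are 00000001 (with zeros in between) covers the first and the 257th processors. Since the helper leans on strtoul(), it is more lenient than the strspn() test in hwloc_mask_process_arg() and would, for example, tolerate a "0x" prefix on each word.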