Re: [hwloc-devel] tarball growing
On 28 Sep 2010, at 07:27, Brice Goglin wrote:
> The bz2 tarball of hwloc 1.0.2 was 2.1MB. hwloc 1.1 will be at least
> 2.7MB. I know that bandwidth is free, but I am still not comfortable
> with the size increasing that much.
>
> Any other idea?

There is probably some mileage in simply unzipping the tarballs in the tests directory. This will increase the size of the source tree but probably decrease the size of the download. Other ideas would be ensuring that the download is compressed with --best or using bzip2.

As an aside, having file names like *.tar.gz.output violates the principle of least surprise; it's not clear to a casual user what these are.

Ashley.

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
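To illustrate why unpacking the inner tarballs might shrink the download, here is a rough Python sketch. It compares bzip2 applied to data that was already gzipped against bzip2 applied to the raw data. The `make_dump` helper and its contents are hypothetical stand-ins for the test data, not the real hwloc test files, so the absolute numbers mean nothing; only the direction of the difference is of interest.

```python
import bz2
import gzip


def make_dump(machine: int) -> bytes:
    """Build repetitive text standing in for one machine's test data (hypothetical)."""
    return b"".join(
        b"machine%d cpu%d: online=1 core_id=%d\n" % (machine, cpu, cpu % 4)
        for cpu in range(500)
    )


dumps = [make_dump(m) for m in range(8)]

# Case 1: each dump is gzipped first, then the lot is bzip2-compressed,
# which mirrors shipping .tar.gz files inside a .tar.bz2 download.
nested_size = len(bz2.compress(b"".join(gzip.compress(d) for d in dumps), 9))

# Case 2: the dumps are stored uncompressed and bzip2 sees the raw text.
flat_size = len(bz2.compress(b"".join(dumps), 9))

print("nested:", nested_size, "bytes; flat:", flat_size, "bytes")
```

The outer compressor can only exploit redundancy it can see. Once each file has been gzipped, the similarity between files is hidden, so the outer bzip2 pass gains very little.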
Re: [hwloc-devel] [hwloc] #12: support user-defined processor restriction
On 15 Feb 2010, at 21:46, Samuel Thibault wrote:
>> I'm not sure I follow ticket #12 but I suspect it's the same thing as #21.
>
> No, it's not, really :)
>
> #21 was never meant to limit the discovery to the current binding
> (understand sched_setaffinity) of some process, it was just meant
> to discover according to what another process would see, e.g. with
> administrative constraints (understand Linux cpuset).
>
>> I saw the commit r1726 which closed #21 and am working on testing this now,
>> it certainly appears to be what I requested.
>
> Maybe, depending on whether you want to discover according to the other
> process' binding (sched_setaffinity) or according to the other process'
> restricted view of the machine (Linux cpuset).

I don't understand the difference; I thought they were two ways of achieving the same thing?

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
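One way to picture the distinction Samuel draws: on Linux the cpuset is an administrative upper bound on the CPUs a process may use, while the sched_setaffinity binding is a mask chosen within that bound. The Python sketch below only models the relationship between the two sets. The CPU lists are made-up values and `parse_cpu_list` is a hypothetical helper for the kernel's "0-3,8" list notation; nothing here queries a real process.

```python
def parse_cpu_list(text: str) -> set:
    """Parse a kernel-style CPU list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Hypothetical values: the cpuset allows eight CPUs, the process bound itself to two.
allowed = parse_cpu_list("0-7")   # restricted view of the machine (cpuset)
binding = parse_cpu_list("2-3")   # current binding (sched_setaffinity)

# CPUs the process could still rebind itself to without administrative help.
print(sorted(allowed - binding))
```

With these values the two views differ: discovery according to the binding would report two CPUs, while discovery according to the cpuset would report eight.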
Re: [hwloc-devel] [hwloc] #12: support user-defined processor restriction
On 15 Feb 2010, at 21:29, Samuel Thibault wrote:
> Brice Goglin, le Mon 15 Feb 2010 22:19:33 +0100, a écrit :
>> hwloc wrote:
>>> - Add a configuration flag to limit the discovery to the current binding
>>> of the process
>>> Do we still want this for 1.0?
>>
>> I think Ashley wanted the first item when he requested lstopo --pid
>> but I may be wrong.
>
> You mean in addition to the resolution of #21? Maybe indeed.

I'm not sure I follow ticket #12 but I suspect it's the same thing as #21. I saw the commit r1726 which closed #21 and am working on testing this now; it certainly appears to be what I requested.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Re: [hwloc-devel] processor restriction + lookup of pid for 1.0
On 30 Jan 2010, at 14:57, Samuel Thibault wrote:
> Samuel Thibault, le Sat 30 Jan 2010 15:55:00 +0100, a écrit :
>> #21 implicitly does: "what cpuset they're bound to" is just an example.
>> A configuration function hwloc_topology_set_pid(topology, pid) would
>> mean that the discovery has to be done from the view of the given pid,
>> and thus the allowed_cpuset should be according to that view, thus
>> administrative restrictions.
>
> Just to give an example: lstopo --pid 1234 would not only show where the
> process is currently bound to, but also its allowed cpuset, which can be
> useful when monitoring applications run by a batch scheduler or such.

Hi,

It was my request that caused Jeff to file that enhancement request.

My take on this would be that #21 should be interpreted as 'report system state from the point of view of <pid> rather than self'. I.e. I don't care which cpuset is shown, the current or the allowed; all I care about is changing the frame of reference so the view is what you would see if the same code was being called from <pid>.

The reason for this is that it's currently possible to do "mpirun lstopo" to see where processes will be bound, but it's not possible using lstopo to see the binding of already running jobs. As some of you will be aware, I maintain padb, a 'job inspection' tool, and I believe lstopo and padb could work together to present a job-wide view of process binding across a parallel job.

I've already added the code to padb to wrap around lstopo; it's available from SVN and has been for some time. It currently runs lstopo for every process within a job on the correct node with the --whole-system option, which means the output is not particularly relevant, hence the change request.

If you are experimenting with this then the following padb command will allow you to play with the command line options provided; %p will be expanded to the pid.
I'm curious to see how this pans out in actual use but I believe it's got potential to be very useful indeed.

$ padb --lstopo -Olstopo-show-warning=no -Olstopo-command="lstopo --pid=%p -" -c [ -a | ]

I'm aiming to make a padb release in the next month, with an RC as soon as two weeks away. If I can change the default "lstopo-command" to one that takes a pid before then, that would be great; if not, padb is future-proof as users can override the default in a configuration file, but this raises the barrier somewhat as people would need to be aware that this was an option.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
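For readers unfamiliar with the %p convention used in the command above, the expansion can be sketched in a few lines of Python. This is not padb's own code, and `expand_lstopo_command` is a hypothetical name; it simply shows a command template being turned into an argument list for one target process.

```python
import shlex


def expand_lstopo_command(template: str, pid: int) -> list:
    """Replace every %p in the template with the pid, then split into argv."""
    return shlex.split(template.replace("%p", str(pid)))


# The template is the one from the padb command line; 1234 is an example pid.
argv = expand_lstopo_command("lstopo --pid=%p -", 1234)
print(argv)  # ['lstopo', '--pid=1234', '-']
```

Keeping the command as a template means the default can be overridden from a configuration file without any change to the tool itself.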
Re: [hwloc-devel] Disabling X component
On Fri, 2009-12-04 at 13:04 +0100, Samuel Thibault wrote:
> Ashley Pittman, le Fri 04 Dec 2009 11:06:12 +, a écrit :
> > The debian version of -.txt (lstopo 0.9.3rc1) leaves my terminal with
> > the colours inverted after I call it, I have to do a reset to get back
> > to black on grey background.
>
> Uh, odd. Which terminal are you using?

gnome-terminal with $TERM set to xterm. I've not done anything special with this, it's just a debian unstable install.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
[hwloc-devel] Disabling X component
I installed the Debian package of hwloc yesterday and discovered that the default action of lstopo is to display a window with a picture in it. I guess I don't have the right development packages installed for this to be enabled in my local build.

In my tool I want to ensure the text version is displayed; padb popping up a number of windows isn't what people will expect or want. Obviously I can unset DISPLAY before calling lstopo, but a --no-x or --text-based option would be a nice thing to have as well.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
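The workaround mentioned above, unsetting DISPLAY before calling lstopo, might look like the following in Python. The helper names are hypothetical, and the sketch assumes an lstopo binary on the PATH that writes text to standard output when given "-"; that default command is an assumption and can be replaced.

```python
import os
import subprocess


def env_without_display(base=None) -> dict:
    """Return a copy of the environment with DISPLAY removed."""
    env = dict(os.environ if base is None else base)
    env.pop("DISPLAY", None)
    return env


def run_text_lstopo(command=("lstopo", "-")) -> str:
    """Run lstopo with no DISPLAY set so it cannot open a window (assumed behaviour)."""
    result = subprocess.run(
        list(command),
        env=env_without_display(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Only the child process's environment is changed, so the calling tool keeps its own DISPLAY for anything else it needs to do.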
Re: [hwloc-devel] hwloc-bind syntax
On Thu, 2009-12-03 at 20:32 -0500, Jeff Squyres wrote:
> > > Ah, ok. To be clear, is it accurate to say that it is one of the
> > > following forms:
> > >
> > > - a hex number (without leading "0x" -- would "0x" be ignored if it is
> > > supplied?)
> >
> > We never used 0x there.
>
> Ok.
>
> It might be good to safely ignore 0x if it's present, but that's a small
> feature enhancement that can be done at any time (I filed a future ticket).

Maybe not relevant, but it bit me so I'll say it here: using "%x" with sscanf on a string of "0x1" will match the whole thing and give a value of 1 on Linux, but on Solaris it'll match the "0" as a hex value of 0 and not match the "x1" at all, leading to further errors in subsequent matches as well. The most annoying thing is that sscanf() thinks it has matched, and its return value will be set accordingly.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
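One way to sidestep the platform difference is to strip an optional prefix explicitly and validate the remaining digits before converting, rather than relying on the conversion routine to do it. The sketch below shows that normalisation step in Python. `parse_hex_mask` is a hypothetical helper, not hwloc-bind's parser; in C the same idea would mean skipping a leading "0x" by hand before calling sscanf or strtoul.

```python
import string


def parse_hex_mask(text: str) -> int:
    """Parse a hex mask, accepting at most one optional leading 0x or 0X."""
    digits = text.strip()
    if digits[:2].lower() == "0x":
        digits = digits[2:]
    # Reject empty input and anything that is not a plain hex digit, so a
    # partial match can never be reported as success.
    if not digits or any(c not in string.hexdigits for c in digits):
        raise ValueError("not a hex mask: %r" % text)
    return int(digits, 16)


print(parse_hex_mask("0x1"), parse_hex_mask("ff"))
```

Failing loudly on malformed input avoids the situation described above, where a partial match is reported as success and corrupts later parsing.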
Re: [hwloc-devel] hwloc at SC09
On Thu, 2009-11-12 at 20:30 +0100, Samuel Thibault wrote:
> Ashley Pittman, le Thu 12 Nov 2009 19:11:11 +, a écrit :
> > I just tried to run it on my arm but failed as I only have autoconf
> > 2.61,
>
> You can run ./configure && make dist on another machine to get an
> autoconfied tarball ready to unpack/conf/make.

That didn't go so well actually. This is on an NSLU2 ARM machine with 32MB of RAM running Debian etch, so hardly your target market! I'm leaving for the airport in an hour or two so don't have time to do any investigation into the cause today.

Ashley,

cam:~/hw> /tmp/bin/lstopo
Segmentation fault
cam:~/hw> gdb /tmp/bin/lstopo
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "arm-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) run
Starting program: /tmp/bin/lstopo

Program received signal SIGSEGV, Segmentation fault.
hwloc_obj_snprintf (string=0xbedd4b48 "\001", size=256, topology=, l=0x0, _indexprefix=0x4009bdd8 "#", verbose=1) at traversal.c:177
177       hwloc_obj_type_t type = l->type;
(gdb) bt
#0  hwloc_obj_snprintf (string=0xbedd4b48 "\001", size=256, topology=, l=0x0, _indexprefix=0x4009bdd8 "#", verbose=1) at traversal.c:177
#1  0x40095924 in print_objects (topology=0x19050, indent=0, obj=0x0) at topology.c:373
#2  0x40097490 in hwloc_topology_load (topology=0x4001ce14) at topology.c:1009
#3  0x93e0 in main (argc=1, argv=0x1) at lstopo.c:206
(gdb) p l
$1 = (struct hwloc_obj *) 0x0
(gdb)

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Re: [hwloc-devel] hwloc at SC09
On Thu, 2009-11-12 at 19:47 +0100, Brice Goglin wrote:
> FWIW, I'll be at SC09 next week, mostly on INRIA Booth #1405. I'll
> present hwloc (among other software we develop here).

I'll swing round and say hi.

> I'll have a USB key with lstopo static binaries (from the libpci branch)
> so that people can run it on their machine immediately. I hope somebody
> will show up with a 192-core laptop, otherwise the lstopo output will
> often be very simple :)

I just tried to run it on my arm but failed as I only have autoconf 2.61; in fact it fails in the same way on my cloud cluster as well, which must be a new thing as I'm sure I ran it there last week.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Re: [hwloc-devel] [hwloc] #21: Allow lookup of specific PIDs
On Thu, 2009-10-22 at 16:15 +0200, Brice Goglin wrote:
> > I've added the code to padb to run this against jobs, you can now do
> > "padb -a --lstopo -c" to see information about hosts where your jobs are
> > running.
> >
> > http://code.google.com/p/padb/source/detail?r=297
>
> I thought you wanted the topology of the whole machine, not only the
> current cpuset? If so, you want to add --whole-system to the lstopo
> command-line.

That's exactly what the code I've just committed does. Padb targets existing jobs but is also a parallel job itself; what would be best would be if the padb job could report the topology for the existing job by supplying either the pid or the cpuset to lstopo.

Ashley,

--
Ashley Pittman, Bath, UK.
Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk