The machine is back in working order. I tried the patch and it works: hwloc now finds the CPUs and my whole program runs as expected. I'm now looking into what fails in look_sysfscpu.
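
For anyone following along, here is a minimal sketch of the fallback order the patch sets up, written as a standalone C snippet rather than hwloc code. The helper names (probe_cpu_dir, first_usable_dir, fallback_nbprocessors) are hypothetical. probe_cpu_dir only checks that a directory is accessible, whereas the real look_sysfscpu also parses the topology files under it, so it can still fail when this check passes, which is what seems to happen here with MPI.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for look_sysfscpu(): it only checks that the
 * directory is readable and searchable. The real function goes on to
 * parse the per-CPU topology files. Returns 0 on success, -1 on failure. */
static int probe_cpu_dir(const char *path)
{
    return access(path, R_OK | X_OK) == 0 ? 0 : -1;
}

/* Try each candidate directory in order and return the index of the
 * first usable one, or -1 if every candidate fails. */
static int first_usable_dir(const char *const *paths, int npaths)
{
    assert(paths != NULL);
    for (int i = 0; i < npaths; i++)
        if (probe_cpu_dir(paths[i]) == 0)
            return i;
    return -1;
}

/* PU count to use when no sysfs directory is usable: the number of
 * online processors when describing the local machine, otherwise 1.
 * sysconf(_SC_NPROCESSORS_ONLN) is an assumption of this sketch, not
 * necessarily what hwloc_fallback_nbprocessors() does internally. */
static long fallback_nbprocessors(int is_thissystem)
{
    long n = is_thissystem ? sysconf(_SC_NPROCESSORS_ONLN) : 1;
    return n > 0 ? n : 1;
}
```

Calling first_usable_dir with the two paths "/sys/bus/cpu/devices" and "/sys/devices/system/cpu" mirrors the order used in the patch. A result of -1 corresponds to the new branch, where the patch calls hwloc_setup_pu_level with the fallback count.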

On Sun, Mar 25, 2012 at 2:43 AM, Brice Goglin <brice.gog...@inria.fr> wrote:
> On 24/03/2012 23:04, Daniel Ibanez wrote:
> > The fundamental difference is in
> >
> > src/topology-linux.c:3251
> >
> > when this if statement is true, hwloc_setup_pu_level
> > finds the PU objects.
> > When it is false, it fails with empty topology.
> >
> > I checked HWLOC_LINUX_USE_CPUINFO,
> > and it is not detected even when I set it from the front end.
> >
> > That means the difference is whether hwloc can access
> > the various /sys/devices and /sys/bus files.
> >
> > Additional printfs confirm that with MPI in the code,
> > hwloc_accessat succeeds on the various /sys/ directories,
> > but the overall procedure for getting PUs from these fails.
> > Without MPI, access to /sys/ directories fails but
> > the fallback hwloc_setup_pu_level works.
>
> If I understand correctly, in the MPI case, look_sysfscpu() ends up
> being called. There are two instances of it because of a possible
> renaming of /sys/devices/system/cpu in the future, so it's expected that
> one succeeds and the other fails. Can you check whether both fail?
> Or just try the attached patch which adds a fallback for this case.
>
> But it'd be good to understand what's going on in /sys on this machine.
> And I still don't understand why MPI changes things here.
>
> Brice
>
> --- src/topology-linux.c	(revision 4420)
> +++ src/topology-linux.c	(working copy)
> @@ -3270,7 +3270,15 @@
>        if (numprocs <= 0)
>          Lprocs = NULL;
>        if (look_sysfscpu(topology, "/sys/bus/cpu/devices", Lprocs, numprocs) < 0)
> -        look_sysfscpu(topology, "/sys/devices/system/cpu", Lprocs, numprocs);
> +        if (look_sysfscpu(topology, "/sys/devices/system/cpu", Lprocs, numprocs) < 0) {
> +          /* sysfs but we failed to read cpu topology, fallback */
> +          if (topology->is_thissystem)
> +            hwloc_setup_pu_level(topology, hwloc_fallback_nbprocessors(topology));
> +          else
> +            /* fsys-root but not this system, no way, assume there's just 1
> +             * processor :/ */
> +            hwloc_setup_pu_level(topology, 1);
> +        }
>        if (Lprocs)
>          hwloc_linux_free_cpuinfo(Lprocs, numprocs);
>      }
>
>

--
Dan Ibanez