The machine is back in working order.
I tried this patch and it works: hwloc now finds the CPUs and my whole
program runs as expected.
I'm now looking into what failed in look_sysfscpu().
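
A minimal standalone sketch of the kind of check that could help narrow this down. It is not hwloc code: the helper names (dir_usable, file_has_line, check_root) are hypothetical, and the four files probed are the usual Linux sysfs topology files for cpu0. Whether look_sysfscpu() reads exactly these files is an assumption here.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the directory can be read and traversed by this process. */
static int dir_usable(const char *path)
{
    return access(path, R_OK | X_OK) == 0;
}

/* Returns 1 if the file can be opened and yields at least one line. */
static int file_has_line(const char *path)
{
    char buf[256];
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return 0;
    ok = (fgets(buf, sizeof(buf), f) != NULL);
    fclose(f);
    return ok;
}

/* Probes <root>/cpu0/topology/ and reports which files cannot be read.
 * Returns the number of files that were readable (0 to 4).
 * The file list is an assumption, not taken from hwloc. */
static int check_root(const char *root)
{
    static const char *files[] = {
        "physical_package_id", "core_id", "core_siblings", "thread_siblings"
    };
    char path[512];
    int readable = 0;
    size_t i;

    printf("%s: %s\n", root,
           dir_usable(root) ? "accessible" : "NOT accessible");
    for (i = 0; i < sizeof(files) / sizeof(files[0]); i++) {
        snprintf(path, sizeof(path), "%s/cpu0/topology/%s", root, files[i]);
        if (file_has_line(path))
            readable++;
        else
            printf("  cannot read %s\n", path);
    }
    return readable;
}
```

Calling check_root("/sys/bus/cpu/devices") and check_root("/sys/devices/system/cpu") from main(), once with and once without MPI initialized, would show whether the directories are reachable while the per-CPU topology files are not.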

On Sun, Mar 25, 2012 at 2:43 AM, Brice Goglin <brice.gog...@inria.fr> wrote:

> On 24/03/2012 23:04, Daniel Ibanez wrote:
> > The fundamental difference is in
> >
> > src/topology-linux.c:3251
> >
> > when this if statement is true, hwloc_setup_pu_level
> > finds the PU objects.
> > When it is false, it fails with empty topology.
> >
> > I checked HWLOC_LINUX_USE_CPUINFO,
> > and it is not detected even when I set it from the front end.
> >
> > That means the difference is whether hwloc can access
> > the various /sys/devices and /sys/bus files.
> >
> > Additional printfs confirm that with MPI in the code,
> > hwloc_accessat succeeds on the various /sys/ directories,
> > but the overall procedure for getting PUs from these fails.
> > Without MPI, access to /sys/ directories fails but
> > the fallback hwloc_setup_pu_level works.
>
> If I understand correctly, in the MPI case, look_sysfscpu() ends up
> being called. It is called on two paths because /sys/devices/system/cpu
> may be renamed in the future, so it's expected that one call succeeds
> and the other fails. Can you check whether both fail?
> Or just try the attached patch, which adds a fallback for this case.
>
> But it'd be good to understand what's going on in /sys on this machine.
> And I still don't understand why MPI changes things here.
>
> Brice
>
> --- src/topology-linux.c        (revision 4420)
> +++ src/topology-linux.c        (working copy)
> @@ -3270,7 +3270,15 @@
>       if (numprocs <= 0)
>        Lprocs = NULL;
>       if (look_sysfscpu(topology, "/sys/bus/cpu/devices", Lprocs, numprocs) < 0)
> -        look_sysfscpu(topology, "/sys/devices/system/cpu", Lprocs, numprocs);
> +        if (look_sysfscpu(topology, "/sys/devices/system/cpu", Lprocs, numprocs) < 0) {
> +         /* sysfs but we failed to read cpu topology, fallback */
> +          if (topology->is_thissystem)
> +            hwloc_setup_pu_level(topology, hwloc_fallback_nbprocessors(topology));
> +          else
> +            /* fsys-root but not this system, no way, assume there's just 1
> +             * processor :/ */
> +            hwloc_setup_pu_level(topology, 1);
> +        }
>       if (Lprocs)
>        hwloc_linux_free_cpuinfo(Lprocs, numprocs);
>     }
>
>
>


-- 

Dan Ibanez
