On 06/09/2012 12:19, Gabriele Fatigati wrote:
> I didn't find any strange numbers in /proc/meminfo.
>
> I've noticed that the program fails exactly every 65479
> hwloc_set_area_membind calls, so it sounds like some kernel limit.
> You can reproduce this with just one thread as well.
>
> Maybe nobody has noticed this before because we usually bind a large
> amount of contiguous memory a few times, instead of small,
> non-contiguous pieces of memory many, many times. :(

If you have root access, try (as root)
    watch -n 1 grep numa_policy /proc/slabinfo
Put a sleep(10) in your program when set_area_membind() fails, and don't
let your program exit before you can read the content of /proc/slabinfo.
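
A minimal sketch of the sleep-on-failure step, under stated assumptions:
`pause_on_failure` is a hypothetical helper (not an hwloc function), and it
relies on hwloc_set_area_membind() returning a negative value with errno set
on error. The loop shown in the trailing comment is the one from the quoted
program.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of hwloc): if `res` signals failure,
 * report errno and keep the process alive for `seconds` so that
 * /proc/slabinfo can be inspected from another terminal.
 * Returns 1 if a failure was reported, 0 otherwise. */
static int pause_on_failure(const char *what, int res, unsigned int seconds)
{
    int saved_errno = errno;  /* capture before any other call can change it */

    assert(what != NULL);
    if (res >= 0)
        return 0;

    fprintf(stderr, "%s failed: %s (pid %ld), sleeping %u s\n",
            what, strerror(saved_errno), (long)getpid(), seconds);
    sleep(seconds);
    return 1;
}

/* Intended use inside the binding loop from the quoted program:
 *
 *     res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, cpuset,
 *                                  HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
 *     if (pause_on_failure("hwloc_set_area_membind", res, 10))
 *         break;
 */
```

Printing the pid is only a convenience for finding the process while it
sleeps; the 10-second delay matches the sleep(10) suggested above and can be
lengthened if that is too short to read the watch output.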

Brice



>
> 2012/9/6 Brice Goglin <brice.gog...@inria.fr
> <mailto:brice.gog...@inria.fr>>
>
>     On 06/09/2012 10:44, Samuel Thibault wrote:
>     > Gabriele Fatigati, on Thu 06 Sep 2012 10:12:38 +0200, wrote:
>     >> mbind in hwloc_linux_set_area_membind() fails:
>     >>
>     >> Error from HWLOC mbind: Cannot allocate memory
>     > Ok. mbind is not really supposed to allocate much memory, but it
>     > still does allocate some, to record the policy.
>     >
>     >> //        hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, tid);
>     >>         hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
>     >>         hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
>     >>         hwloc_bitmap_singlify(cpuset);
>     >>         hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
>     >>
>     >>         for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
>     >> //           res = hwloc_set_area_membind_nodeset(topology, &array[i], PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>     >>              res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>     > and I'm afraid that calling set_area_membind for each page might
>     > be too dense: the kernel is probably allocating a memory policy
>     > record for each page, not being able to merge adjacent equal
>     > policies.
>     >
>
>     It's supposed to merge VMAs with the same policies (from what I
>     understand in the code), but I don't know if that actually works.
>     Maybe Gabriele found a kernel bug :)
>
>     Brice
>
>     _______________________________________________
>     hwloc-users mailing list
>     hwloc-us...@open-mpi.org <mailto:hwloc-us...@open-mpi.org>
>     http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it <http://www.cineca.it>    Tel: +39 051 6171722
>
> g.fatigati [AT] cineca.it <http://cineca.it>          
>
>
