I've reproduced the problem in a small MPI + OpenMP code.

The error is the same: after a number of memory binds, it gives "Cannot
allocate memory".

Thanks.

2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>

> Downscaling the matrix size, binding works well, but the available memory
> is enough even for the bigger matrix, so I'm a bit confused.
>
> Using the same big matrix size without binding, the code works well, so
> how can I explain this behaviour?
>
> Maybe hwloc_set_area_membind_nodeset introduces some extra allocation
> that persists after the call?
>
>
>
> 2012/9/5 Brice Goglin <brice.gog...@inria.fr>
>
>>  An internal malloc failed then. That would explain why your malloc
>> failed too.
>> It looks like you malloc'ed too much memory in your program?
>>
>> Brice
>>
>>
>>
>>
>> On 05/09/2012 15:56, Gabriele Fatigati wrote:
>>
>> An update:
>>
>>  Placing strerror(errno) after hwloc_set_area_membind_nodeset gives:
>> "Cannot allocate memory"
>>
>> 2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>
>>
>>> Hi,
>>>
>>>  I've noticed that hwloc_set_area_membind_nodeset returns -1, but errno
>>> is equal to neither EXDEV nor ENOSYS. I assumed these two cases were the
>>> only possibilities.
>>>
>>>  From the hwloc documentation:
>>>
>>>  -1 with errno set to ENOSYS if the action is not supported
>>> -1 with errno set to EXDEV if the binding cannot be enforced
>>>
>>>
>>>  Is there any other reason the binding could fail? The available memory is enough.
>>>
>>> 2012/9/5 Brice Goglin <brice.gog...@inria.fr>
>>>
>>>>  Hello Gabriele,
>>>>
>>>> The only limit that I would think of is the available physical memory
>>>> on each NUMA node (numactl -H will tell you how much of each NUMA node
>>>> memory is still available).
>>>> malloc usually only fails (it returns NULL?) when there is no *virtual*
>>>> memory left, which is different. Unless you allocate tons of terabytes of
>>>> virtual memory, this shouldn't happen easily.
>>>>
>>>> Brice
>>>>
>>>>
>>>>
>>>>
>>>> On 05/09/2012 14:27, Gabriele Fatigati wrote:
>>>>
>>>>  Dear Hwloc users and developers,
>>>>
>>>>
>>>>  I'm using hwloc 1.4.1 in a multithreaded program on a Linux platform,
>>>> where each thread binds many non-contiguous pieces of a big matrix,
>>>> calling the hwloc_set_area_membind_nodeset function very intensively:
>>>>
>>>>  hwloc_set_area_membind_nodeset(topology, punt+offset, len, nodeset,
>>>> HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD | HWLOC_MEMBIND_MIGRATE);
>>>>
>>>>  Binding seems to work well, since the function's return code is 0
>>>> for every call.
>>>>
>>>>  The problem is that after binding, a simple small malloc fails
>>>> without any apparent reason.
>>>>
>>>>  Disabling memory binding, the allocations work well. Is there any
>>>> known problem when hwloc_set_area_membind_nodeset is used intensively?
>>>>
>>>>  Is there some operating system limit for memory pages binding?
>>>>
>>>>  Thanks in advance.
>>>>
>>>>  --
>>>> Ing. Gabriele Fatigati
>>>>
>>>> HPC specialist
>>>>
>>>> SuperComputing Applications and Innovation Department
>>>>
>>>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>>
>>>> www.cineca.it                    Tel:   +39 051 6171722
>>>>
>>>> g.fatigati [AT] cineca.it
>>>>
>>>>
>>>>  _______________________________________________
>>>> hwloc-users mailing list
>>>> hwloc-users@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>
>
>



#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strerror */
#include <errno.h>
#include <omp.h>      /* omp_get_num_threads, omp_get_thread_num */
#include <numa.h>
#include <hwloc.h>


#define PAGE_SIZE 4096 /* assumed; the real value can be queried with sysconf(_SC_PAGESIZE) */

int main(int argc,char *argv[]){


	/* Memory binding example: each thread binds a piece of the allocated
	 * memory to its local NUMA node. */

	MPI_Init (&argc, &argv);
	int rank;
	int result;

	MPI_Comm_rank (MPI_COMM_WORLD, &rank);

	hwloc_topology_t topology;
	hwloc_topology_init(&topology);
	hwloc_topology_load(topology);

	// allocate 8 GB
	size_t len=8192000000;

	long free_mem = 0;

	numa_node_size(0,&free_mem);
	printf("free memory node 0: %li \n", free_mem);
	numa_node_size(1,&free_mem);
	printf("free memory node 1: %li \n", free_mem);

	char* array;
	array=(char*)malloc(len);

        if(array==NULL) {
		printf( " Error allocating memory \n");
                return -1;
	}

#pragma omp parallel num_threads(2)
   {
          
        size_t chunk = len/omp_get_num_threads();
        int tid = omp_get_thread_num();
        int res;
        size_t i;

        /* Pin this thread to its own PU. */
        hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
        hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
        hwloc_bitmap_singlify(cpuset);
        hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);

        /* hwloc_set_area_membind_nodeset() expects a nodeset, not a cpuset:
         * convert the thread's cpuset to the matching NUMA nodeset. */
        hwloc_nodeset_t nodeset = hwloc_bitmap_alloc();
        hwloc_cpuset_to_nodeset(topology, cpuset, nodeset);

        /* Bind this thread's own chunk of the array, page by page. */
        for( i = chunk*tid; i < chunk*(tid+1); i+=PAGE_SIZE) {
           res = hwloc_set_area_membind_nodeset(topology, &array[i], PAGE_SIZE, nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
           if(res<0) {
              printf( " ERROR: %s \n", strerror(errno));
              break;
           }
        }

        hwloc_bitmap_free(nodeset);
        hwloc_bitmap_free(cpuset);

  }

	numa_node_size(0,&free_mem);
	printf("free memory node 0: %li \n", free_mem);
	numa_node_size(1,&free_mem);
	printf("free memory node 1: %li \n", free_mem);


	free(array);

	hwloc_topology_destroy(topology);

	MPI_Finalize();


	return 0;

}
