Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Gabriele Fatigati
Hi Brice, hi Jeff,

>Can you add some printf inside hwloc_linux_set_area_membind() in
src/topology-linux.c to see if ENOMEM comes from the mbind syscall or not?

I added printf inside that function, but ENOMEM does not come from there.

>Have you run your application through valgrind or another memory-checking
debugger?

I tried with valgrind:

valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full
--tool=memcheck  --show-reachable=yes ./main_hybrid_bind_mem

==25687== Warning: set address range perms: large range [0x39454040,
0x2218d4040) (undefined)
==25687==
==25687== Valgrind's memory management: out of memory:
==25687==    newSuperblock's request for 4194304 bytes failed.
==25687==    34253180928 bytes have already been allocated.
==25687== Valgrind cannot continue.  Sorry.


I attach the full output.


The code also dies with pure OpenMP code. Very mysterious.


2012/9/5 Jeff Squyres 

> On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:
>
> > I don't think is a simply out of memory since NUMA node has 48 GB, and
> I'm allocating just 8 GB.
>
> Mmm.  Probably right.
>
> Have you run your application through valgrind or another memory-checking
> debugger?
>
> I've seen cases of heap corruption lead to malloc incorrectly failing with
> ENOMEM.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


output_valgrind
Description: Binary data


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
On 06/09/2012 09:56, Gabriele Fatigati wrote:
> Hi Brice, hi Jeff,
>
> >Can you add some printf inside hwloc_linux_set_area_membind() in
> src/topology-linux.c to see if ENOMEM comes from the mbind >syscall or
> not?
>
> I added printf inside that function, but ENOMEM does not come from there.

Not from hwloc_linux_set_area_membind() at all? Or not from mbind?

> >Have you run your application through valgrind or another
> memory-checking debugger?
>
> I tried with valgrind :
>
> valgrind --track-origins=yes --log-file=output_valgrind
> --leak-check=full --tool=memcheck  --show-reachable=yes
> ./main_hybrid_bind_mem
>
> ==25687== Warning: set address range perms: large range [0x39454040,
> 0x2218d4040) (undefined)
> ==25687== 
> ==25687== Valgrind's memory management: out of memory:
> ==25687==newSuperblock's request for 4194304 bytes failed.
> ==25687==34253180928 bytes have already been allocated.
> ==25687== Valgrind cannot continue.  Sorry.

There's really somebody allocating way too much memory here.

You should reduce your array size so that it doesn't fail, and then run
valgrind again to check whether somebody is allocating a lot of memory
without ever freeing it.

Brice



>
>
> I attach the full output. 
>
>
> The code dies also using OpenMP pure code. Very misteriously.
>
>
> 2012/9/5 Jeff Squyres <jsquy...@cisco.com>
>
> On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:
>
> > I don't think is a simply out of memory since NUMA node has 48
> GB, and I'm allocating just 8 GB.
>
> Mmm.  Probably right.
>
> Have you run your application through valgrind or another
> memory-checking debugger?
>
> I've seen cases of heap corruption lead to malloc incorrectly
> failing with ENOMEM.
>
> --
> Jeff Squyres
> jsquy...@cisco.com 
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org 
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it Tel:   +39 051
> 6171722
>
> g.fatigati [AT] cineca.it   
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users



Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Gabriele Fatigati
Sorry,

I used the wrong hwloc installation. With the hwloc build that has the
printf checks, mbind inside hwloc_linux_set_area_membind() fails:

Error from HWLOC mbind: Cannot allocate memory

so this is the origin of the bad allocation.
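By the way, the same information can also be printed from the caller side
without patching hwloc, since hwloc should leave errno set when the binding
call fails. A sketch (variable names as in my test code; needs <errno.h>,
<string.h> and <stdio.h>):

res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, cpuset,
                             HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
if (res < 0)
    /* on Linux errno should reflect the underlying mbind() error */
    fprintf(stderr, "hwloc_set_area_membind failed: %s\n", strerror(errno));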

I attach the right valgrind output

valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full
--tool=memcheck  --show-reachable=yes ./main_hybrid_bind_mem





2012/9/6 Gabriele Fatigati 

> Hi Brice, hi Jeff,
>
> >Can you add some printf inside hwloc_linux_set_area_membind() in
> src/topology-linux.c to see if ENOMEM comes from the mbind >syscall or not?
>
> I added printf inside that function, but ENOMEM does not come from there.
>
> >Have you run your application through valgrind or another memory-checking
> debugger?
>
> I tried with valgrind :
>
> valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full
> --tool=memcheck  --show-reachable=yes ./main_hybrid_bind_mem
>
> ==25687== Warning: set address range perms: large range [0x39454040,
> 0x2218d4040) (undefined)
> ==25687==
> ==25687== Valgrind's memory management: out of memory:
> ==25687==newSuperblock's request for 4194304 bytes failed.
> ==25687==34253180928 bytes have already been allocated.
> ==25687== Valgrind cannot continue.  Sorry.
>
>
> I attach the full output.
>
>
> The code dies also using OpenMP pure code. Very misteriously.
>
>
>
> 2012/9/5 Jeff Squyres 
>
>> On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:
>>
>> > I don't think is a simply out of memory since NUMA node has 48 GB, and
>> I'm allocating just 8 GB.
>>
>> Mmm.  Probably right.
>>
>> Have you run your application through valgrind or another memory-checking
>> debugger?
>>
>> I've seen cases of heap corruption lead to malloc incorrectly failing
>> with ENOMEM.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>
>
>
>
> --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.itTel:   +39 051 6171722
>
> g.fatigati [AT] cineca.it
>



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


output_valgrind
Description: Binary data


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Gabriele Fatigati
Downsizing the array to 4 GB, valgrind gives many warnings, reported in
the attached file.









2012/9/6 Gabriele Fatigati 

> Sorry,
>
> I used a wrong hwloc installation. Using the hwloc with the printf
> controls:
>
> mbind hwloc_linux_set_area_membind()  fails:
>
> Error from HWLOC mbind: Cannot allocate memory
>
> so this is the origin of bad allocation.
>
> I attach the right valgrind output
>
> valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full
> --tool=memcheck  --show-reachable=yes ./main_hybrid_bind_mem
>
>
>
>
>
> 2012/9/6 Gabriele Fatigati 
>
>> Hi Brice, hi Jeff,
>>
>> >Can you add some printf inside hwloc_linux_set_area_membind() in
>> src/topology-linux.c to see if ENOMEM comes from the mbind >syscall or not?
>>
>> I added printf inside that function, but ENOMEM does not come from there.
>>
>> >Have you run your application through valgrind or another
>> memory-checking debugger?
>>
>> I tried with valgrind :
>>
>> valgrind --track-origins=yes --log-file=output_valgrind --leak-check=full
>> --tool=memcheck  --show-reachable=yes ./main_hybrid_bind_mem
>>
>> ==25687== Warning: set address range perms: large range [0x39454040,
>> 0x2218d4040) (undefined)
>> ==25687==
>> ==25687== Valgrind's memory management: out of memory:
>> ==25687==newSuperblock's request for 4194304 bytes failed.
>> ==25687==34253180928 bytes have already been allocated.
>> ==25687== Valgrind cannot continue.  Sorry.
>>
>>
>> I attach the full output.
>>
>>
>> The code dies also using OpenMP pure code. Very misteriously.
>>
>>
>>
>> 2012/9/5 Jeff Squyres 
>>
>>> On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:
>>>
>>> > I don't think is a simply out of memory since NUMA node has 48 GB, and
>>> I'm allocating just 8 GB.
>>>
>>> Mmm.  Probably right.
>>>
>>> Have you run your application through valgrind or another
>>> memory-checking debugger?
>>>
>>> I've seen cases of heap corruption lead to malloc incorrectly failing
>>> with ENOMEM.
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>>
>>
>>
>>
>> --
>> Ing. Gabriele Fatigati
>>
>> HPC specialist
>>
>> SuperComputing Applications and Innovation Department
>>
>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>
>> www.cineca.itTel:   +39 051 6171722
>>
>> g.fatigati [AT] cineca.it
>>
>
>
>
> --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.itTel:   +39 051 6171722
>
> g.fatigati [AT] cineca.it
>



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


output_valgrind
Description: Binary data


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
On 06/09/2012 10:13, Gabriele Fatigati wrote:
> Downsizing the array, up to 4GB, 
>
> valgrind gives many warnings reported in the attached file.

Adding hwloc_topology_destroy() at the end of the file would likely
remove most of them.

But that won't fix the problem since the leaks are small.
==28082== LEAK SUMMARY:
==28082==    definitely lost: 4,080 bytes in 3 blocks
==28082==    indirectly lost: 51,708 bytes in 973 blocks
==28082==      possibly lost: 304 bytes in 1 blocks
==28082==    still reachable: 1,786 bytes in 4 blocks
==28082==         suppressed: 0 bytes in 0 blocks

I don't know where to look, sorry.

Brice



>
>
>
>
>
>
>
>
>
> 2012/9/6 Gabriele Fatigati  >
>
> Sorry,
>
> I used a wrong hwloc installation. Using the hwloc with the printf
> controls:
>
> mbind hwloc_linux_set_area_membind()  fails:
>
> Error from HWLOC mbind: Cannot allocate memory 
>
> so this is the origin of bad allocation.
>
> I attach the right valgrind output
>
> valgrind --track-origins=yes --log-file=output_valgrind
> --leak-check=full --tool=memcheck  --show-reachable=yes
> ./main_hybrid_bind_mem
>
>
>
>
>
> 2012/9/6 Gabriele Fatigati  >
>
> Hi Brice, hi Jeff,
>
> >Can you add some printf inside hwloc_linux_set_area_membind()
> in src/topology-linux.c to see if ENOMEM comes from the mbind
> >syscall or not?
>
> I added printf inside that function, but ENOMEM does not come
> from there.
>
> >Have you run your application through valgrind or another
> memory-checking debugger?
>
> I tried with valgrind :
>
> valgrind --track-origins=yes --log-file=output_valgrind
> --leak-check=full --tool=memcheck  --show-reachable=yes
> ./main_hybrid_bind_mem
>
> ==25687== Warning: set address range perms: large range
> [0x39454040, 0x2218d4040) (undefined)
> ==25687== 
> ==25687== Valgrind's memory management: out of memory:
> ==25687==newSuperblock's request for 4194304 bytes failed.
> ==25687==34253180928 bytes have already been allocated.
> ==25687== Valgrind cannot continue.  Sorry.
>
>
> I attach the full output. 
>
>
> The code dies also using OpenMP pure code. Very misteriously.
>
>
>
> 2012/9/5 Jeff Squyres  >
>
> On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:
>
> > I don't think is a simply out of memory since NUMA node
> has 48 GB, and I'm allocating just 8 GB.
>
> Mmm.  Probably right.
>
> Have you run your application through valgrind or another
> memory-checking debugger?
>
> I've seen cases of heap corruption lead to malloc
> incorrectly failing with ENOMEM.
>
> --
> Jeff Squyres
> jsquy...@cisco.com 
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org 
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it Tel:  
> +39 051 6171722 
>
> g.fatigati [AT] cineca.it   
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it Tel:   +39
> 051 6171722 
>
> g.fatigati [AT] cineca.it   
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it Tel:   +39 051
> 6171722
>
> g.fatigati [AT] cineca.it   
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users



Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Gabriele Fatigati
Oops,

I forgot hwloc_topology_destroy() and also hwloc_bitmap_free(cpuset).

I've added them; I attach new code that uses the hwloc_set_area_membind()
function directly, and the new Valgrind output.
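For reference, the cleanup added at the end of the program is simply (sketch):

hwloc_bitmap_free(cpuset);
hwloc_topology_destroy(topology);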

2012/9/6 Brice Goglin 

>  Le 06/09/2012 10:13, Gabriele Fatigati a écrit :
>
> Downsizing the array, up to 4GB,
>
>  valgrind gives many warnings reported in the attached file.
>
>
> Adding hwloc_topology_destroy() at the end of the file would likely remove
> most of them.
>
> But that won't fix the problem since the leaks are small.
> ==28082== LEAK SUMMARY:
> ==28082==definitely lost: 4,080 bytes in 3 blocks
> ==28082==indirectly lost: 51,708 bytes in 973 blocks
> ==28082==  possibly lost: 304 bytes in 1 blocks
> ==28082==still reachable: 1,786 bytes in 4 blocks
> ==28082== suppressed: 0 bytes in 0 blocks
>
> I don't know where to look, sorry.
>
> Brice
>
>
>
>
>
>
>
>
>
>
>
>
>
> 2012/9/6 Gabriele Fatigati 
>
>> Sorry,
>>
>>  I used a wrong hwloc installation. Using the hwloc with the printf
>> controls:
>>
>>  mbind hwloc_linux_set_area_membind()  fails:
>>
>>  Error from HWLOC mbind: Cannot allocate memory
>>
>>  so this is the origin of bad allocation.
>>
>>  I attach the right valgrind output
>>
>>  valgrind --track-origins=yes --log-file=output_valgrind
>> --leak-check=full --tool=memcheck  --show-reachable=yes
>> ./main_hybrid_bind_mem
>>
>>
>>
>>
>>
>>   2012/9/6 Gabriele Fatigati 
>>
>>> Hi Brice, hi Jeff,
>>>
>>>  >Can you add some printf inside hwloc_linux_set_area_membind() in
>>> src/topology-linux.c to see if ENOMEM comes from the mbind >syscall or not?
>>>
>>>  I added printf inside that function, but ENOMEM does not come from
>>> there.
>>>
>>>  >Have you run your application through valgrind or another
>>> memory-checking debugger?
>>>
>>>  I tried with valgrind :
>>>
>>>  valgrind --track-origins=yes --log-file=output_valgrind
>>> --leak-check=full --tool=memcheck  --show-reachable=yes
>>> ./main_hybrid_bind_mem
>>>
>>>  ==25687== Warning: set address range perms: large range [0x39454040,
>>> 0x2218d4040) (undefined)
>>> ==25687==
>>> ==25687== Valgrind's memory management: out of memory:
>>>  ==25687==newSuperblock's request for 4194304 bytes failed.
>>> ==25687==34253180928 bytes have already been allocated.
>>> ==25687== Valgrind cannot continue.  Sorry.
>>>
>>>
>>>  I attach the full output.
>>>
>>>
>>>  The code dies also using OpenMP pure code. Very misteriously.
>>>
>>>
>>>
>>> 2012/9/5 Jeff Squyres 
>>>
 On Sep 5, 2012, at 2:36 PM, Gabriele Fatigati wrote:

 > I don't think is a simply out of memory since NUMA node has 48 GB,
 and I'm allocating just 8 GB.

  Mmm.  Probably right.

 Have you run your application through valgrind or another
 memory-checking debugger?

 I've seen cases of heap corruption lead to malloc incorrectly failing
 with ENOMEM.

 --
 Jeff Squyres
 jsquy...@cisco.com
 For corporate legal information go to:
 http://www.cisco.com/web/about/doing_business/legal/cri/


 ___
 hwloc-users mailing list
 hwloc-us...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users

>>>
>>>
>>>
>>>   --
>>> Ing. Gabriele Fatigati
>>>
>>> HPC specialist
>>>
>>> SuperComputing Applications and Innovation Department
>>>
>>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>>
>>> www.cineca.it    Tel: +39 051 6171722
>>>
>>> g.fatigati [AT] cineca.it
>>>
>>
>>
>>
>>  --
>> Ing. Gabriele Fatigati
>>
>> HPC specialist
>>
>> SuperComputing Applications and Innovation Department
>>
>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>
>> www.cineca.it    Tel: +39 051 6171722
>>
>> g.fatigati [AT] cineca.it
>>
>
>
>
>  --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.itTel:   +39 051 6171722
>
> g.fatigati [AT] cineca.it
>
>
> ___
> hwloc-users mailing 
> listhwloc-users@open-mpi.orghttp://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


output_valgrind
Description: Binary data
//#include 
#include 
#include 
#include 


#define PAGE_SIZE 4096

int main(int argc,char *argv[]){


	/* Bind memory example: each thread bind a piece of allocated memory in local node
	 */

	//MPI_Init (&argc, &argv);
	int rank;
	int resu

Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Samuel Thibault
Gabriele Fatigati, on Thu 06 Sep 2012 10:12:38 +0200, wrote:
> mbind hwloc_linux_set_area_membind()  fails:
> 
> Error from HWLOC mbind: Cannot allocate memory 

Ok. mbind is not really supposed to allocate much memory, but it still
does allocate some, to record the policy.

> //hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 
> tid);
> hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
> hwloc_bitmap_singlify(cpuset);
> hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
> 
> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
> //   res = hwloc_set_area_membind_nodeset(topology, &array[i], 
> PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>  res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, 
> cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);

and I'm afraid that calling set_area_membind for each page might be too
dense: the kernel is probably allocating a memory-policy record for each
page, without being able to merge adjacent equal policies.
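If that is what happens, binding each thread's whole chunk with a single
call should avoid creating one policy record per page. A rough, untested
sketch, reusing the variable names from your code and assuming each chunk
is page-aligned:

#include <hwloc.h>

/* Bind one thread's whole contiguous chunk in a single call, instead of
 * one hwloc_set_area_membind() call per page. The last chunk may be
 * shorter if len is not a multiple of chunk. */
static int bind_chunk(hwloc_topology_t topology, char *array,
                      size_t chunk, size_t len, int tid,
                      hwloc_const_cpuset_t cpuset)
{
    size_t start = chunk * (size_t) tid;
    size_t size = (start + chunk <= len) ? chunk : len - start;
    return hwloc_set_area_membind(topology, array + start, size, cpuset,
                                  HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
}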

You could check in /proc/meminfo which number goes high, it's probably
in-kernel data, such as the Slab.

Samuel


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Samuel Thibault
Samuel Thibault, on Thu 06 Sep 2012 10:45:45 +0200, wrote:
> Gabriele Fatigati, le Thu 06 Sep 2012 10:12:38 +0200, a écrit :
> > mbind hwloc_linux_set_area_membind()  fails:
> > 
> > Error from HWLOC mbind: Cannot allocate memory 
> 
> Ok. mbind is not really supposed to allocate much memory, but it still
> does allocate some, to record the policy
> 
> > //hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 
> > tid);
> > hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, 
> > tid);
> > hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
> > hwloc_bitmap_singlify(cpuset);
> > hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
> > 
> > for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
> > //   res = hwloc_set_area_membind_nodeset(topology, &array[i], 
> > PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
> >  res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, 
> > cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
> 
> and I'm afraid that calling set_area_membind for each page might be too
> dense: the kernel is probably allocating a memory policy record for each
> page, not being able to merge adjacent equal policies.

I forgot to mention: the amount of memory allocated by each mbind call
can be controlled through the configured maximum number of nodes in the
kernel (CONFIG_NODES_SHIFT).

Samuel


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
On 06/09/2012 10:44, Samuel Thibault wrote:
> Gabriele Fatigati, le Thu 06 Sep 2012 10:12:38 +0200, a écrit :
>> mbind hwloc_linux_set_area_membind()  fails:
>>
>> Error from HWLOC mbind: Cannot allocate memory 
> Ok. mbind is not really supposed to allocate much memory, but it still
> does allocate some, to record the policy
>
>> //hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, 
>> tid);
>> hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
>> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
>> hwloc_bitmap_singlify(cpuset);
>> hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
>> 
>> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
>> //   res = hwloc_set_area_membind_nodeset(topology, &array[i], 
>> PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>>  res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, 
>> cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
> and I'm afraid that calling set_area_membind for each page might be too
> dense: the kernel is probably allocating a memory policy record for each
> page, not being able to merge adjacent equal policies.
>

It's supposed to merge VMAs with the same policy (from what I understand
of the code), but I don't know if that actually works.
Maybe Gabriele found a kernel bug :)

Brice



Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Gabriele Fatigati
I didn't find any strange numbers in /proc/meminfo.

I've noticed that the program fails after exactly 65479
hwloc_set_area_membind() calls, so it sounds like some kernel limit. You
can check that with just one thread, too.

Maybe nobody has noticed this before, because usually we bind a large
amount of contiguous memory a few times, instead of many small,
non-contiguous pieces of memory over and over.. :(
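For what it's worth, the limit can be reproduced with a single thread and a
plain loop over pages. A minimal sketch (array size, page size and PU index
are arbitrary choices, not my real test code):

#include <hwloc.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#define PAGE_SIZE 4096
#define NPAGES    (100*1000)   /* enough pages to reach ~65000 calls */

int main(void)
{
    hwloc_topology_t topology;
    hwloc_obj_t pu;
    char *array;
    long i, ok = 0;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    pu = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, 0);
    array = malloc((size_t) NPAGES * PAGE_SIZE);
    if (!pu || !array)
        return 1;

    /* bind one page at a time until the call fails, counting successes */
    for (i = 0; i < NPAGES; i++) {
        int res = hwloc_set_area_membind(topology, array + i * PAGE_SIZE,
                                         PAGE_SIZE, pu->cpuset,
                                         HWLOC_MEMBIND_BIND,
                                         HWLOC_MEMBIND_THREAD);
        if (res < 0) {
            printf("failed after %ld successful calls: %s\n",
                   ok, strerror(errno));
            break;
        }
        ok++;
    }

    free(array);
    hwloc_topology_destroy(topology);
    return 0;
}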

2012/9/6 Brice Goglin 

> Le 06/09/2012 10:44, Samuel Thibault a écrit :
> > Gabriele Fatigati, le Thu 06 Sep 2012 10:12:38 +0200, a écrit :
> >> mbind hwloc_linux_set_area_membind()  fails:
> >>
> >> Error from HWLOC mbind: Cannot allocate memory
> > Ok. mbind is not really supposed to allocate much memory, but it still
> > does allocate some, to record the policy
> >
> >> //hwloc_obj_t obj = hwloc_get_obj_by_type(topology,
> HWLOC_OBJ_NODE, tid);
> >> hwloc_obj_t obj = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU,
> tid);
> >> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
> >> hwloc_bitmap_singlify(cpuset);
> >> hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
> >>
> >> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
> >> //   res = hwloc_set_area_membind_nodeset(topology, &array[i],
> PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
> >>  res = hwloc_set_area_membind(topology, &array[i],
> PAGE_SIZE, cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
> > and I'm afraid that calling set_area_membind for each page might be too
> > dense: the kernel is probably allocating a memory policy record for each
> > page, not being able to merge adjacent equal policies.
> >
>
> It's supposed to merge VMA with same policies (from what I understand in
> the code), but I don't know if that actually works.
> Maybe Gabriele found a kernel bug :)
>
> Brice
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
On 06/09/2012 12:19, Gabriele Fatigati wrote:
> I didn't find any strange numbers in /proc/meminfo.
>
> I've noticed that the program fails after exactly 65479
> hwloc_set_area_membind() calls, so it sounds like some kernel
> limit. You can check that with just one thread, too.
>
> Maybe nobody has noticed this before, because usually we bind a large
> amount of contiguous memory a few times, instead of many small,
> non-contiguous pieces of memory over and over.. :(

If you have root access, try (as root)
watch -n 1 grep numa_policy /proc/slabinfo
Put a sleep(10) in your program when set_area_membind() fails, and don't
let your program exit before you can read the content of /proc/slabinfo.
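Something like this in the binding loop should do it (sketch; needs
<unistd.h>, variable names as in your code):

res = hwloc_set_area_membind(topology, &array[i], PAGE_SIZE, cpuset,
                             HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
if (res < 0) {
    perror("hwloc_set_area_membind");
    sleep(10);  /* time to read /proc/slabinfo before the program exits */
}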

Brice



>
> 2012/9/6 Brice Goglin  >
>
> Le 06/09/2012 10:44, Samuel Thibault a écrit :
> > Gabriele Fatigati, le Thu 06 Sep 2012 10:12:38 +0200, a écrit :
> >> mbind hwloc_linux_set_area_membind()  fails:
> >>
> >> Error from HWLOC mbind: Cannot allocate memory
> > Ok. mbind is not really supposed to allocate much memory, but it
> still
> > does allocate some, to record the policy
> >
> >> //hwloc_obj_t obj = hwloc_get_obj_by_type(topology,
> HWLOC_OBJ_NODE, tid);
> >> hwloc_obj_t obj = hwloc_get_obj_by_type(topology,
> HWLOC_OBJ_PU, tid);
> >> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
> >> hwloc_bitmap_singlify(cpuset);
> >> hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
> >>
> >> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
> >> //   res = hwloc_set_area_membind_nodeset(topology,
> &array[i], PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND,
> HWLOC_MEMBIND_THREAD);
> >>  res = hwloc_set_area_membind(topology, &array[i],
> PAGE_SIZE, cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
> > and I'm afraid that calling set_area_membind for each page might
> be too
> > dense: the kernel is probably allocating a memory policy record
> for each
> > page, not being able to merge adjacent equal policies.
> >
>
> It's supposed to merge VMA with same policies (from what I
> understand in
> the code), but I don't know if that actually works.
> Maybe Gabriele found a kernel bug :)
>
> Brice
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org 
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it Tel:   +39 051
> 6171722
>
> g.fatigati [AT] cineca.it   
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users



Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Gabriele Fatigati
Hi Brice,

the initial grep is:

numa_policy        65671  65952     24  144    1 : tunables  120   60    8 : slabdata    458    458      0

When set_membind fails, it is:

numa_policy          482   1152     24  144    1 : tunables  120   60    8 : slabdata      8      8    288

What does it mean?



2012/9/6 Brice Goglin 

>  Le 06/09/2012 12:19, Gabriele Fatigati a écrit :
>
> I did't find any strange number in /proc/meminfo.
>
>  I've noted that the program fails exactly
> every 65479 hwloc_set_area_membind. So It sounds like some kernel limit.
> You can check that also just one thread.
>
>  Maybe never has not noted them  because usually we bind a large amount
> of contiguos memory few times, instead of small and non contiguos pieces of
> memory many and many times.. :(
>
>
> If you have root access, try (as root)
> watch -n 1 grep numa_policy /proc/slabinfo
> Put a sleep(10) in your program when set_area_membind() fails, and don't
> let your program exit before you can read the content of /proc/slabinfo.
>
> Brice
>
>
>
>
>
>  2012/9/6 Brice Goglin 
>
>> Le 06/09/2012 10:44, Samuel Thibault a écrit :
>> > Gabriele Fatigati, le Thu 06 Sep 2012 10:12:38 +0200, a écrit :
>> >> mbind hwloc_linux_set_area_membind()  fails:
>> >>
>> >> Error from HWLOC mbind: Cannot allocate memory
>> > Ok. mbind is not really supposed to allocate much memory, but it still
>> > does allocate some, to record the policy
>> >
>> >> //hwloc_obj_t obj = hwloc_get_obj_by_type(topology,
>> HWLOC_OBJ_NODE, tid);
>> >> hwloc_obj_t obj = hwloc_get_obj_by_type(topology,
>> HWLOC_OBJ_PU, tid);
>> >> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
>> >> hwloc_bitmap_singlify(cpuset);
>> >> hwloc_set_cpubind(topology, cpuset, HWLOC_CPUBIND_THREAD);
>> >>
>> >> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
>> >> //   res = hwloc_set_area_membind_nodeset(topology, &array[i],
>> PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>> >>  res = hwloc_set_area_membind(topology, &array[i],
>> PAGE_SIZE, cpuset, HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD);
>> > and I'm afraid that calling set_area_membind for each page might be too
>> > dense: the kernel is probably allocating a memory policy record for each
>> > page, not being able to merge adjacent equal policies.
>> >
>>
>>  It's supposed to merge VMA with same policies (from what I understand in
>> the code), but I don't know if that actually works.
>> Maybe Gabriele found a kernel bug :)
>>
>> Brice
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>
>
>
>
>  --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.itTel:   +39 051 6171722
>
> g.fatigati [AT] cineca.it
>
>
> ___
> hwloc-users mailing 
> listhwloc-users@open-mpi.orghttp://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread binding problem

2012-09-06 Thread Brice Goglin
On 06/09/2012 14:51, Gabriele Fatigati wrote:
> Hi Brice,
>
> the initial grep is:
>
> numa_policy        65671  65952     24  144    1 : tunables  120   60    8 : slabdata    458    458      0
>
> When set_membind fails, it is:
>
> numa_policy          482   1152     24  144    1 : tunables  120   60    8 : slabdata      8      8    288
>
> What does it means?

The first number is the number of active objects. That means 65000
mempolicy objects were in use on the first line.
(I wonder if you swapped the lines, I expected higher numbers at the end
of the run)

Anyway, having 65000 mempolicies in use is a lot. And that would roughly
correspond to the number of set_area_membind calls that succeed before one
fails. So the kernel might indeed fail to merge those.

That said, these objects are small (24 bytes here, if I am reading things
correctly), so we're only talking about roughly 65000 x 24 bytes = 1.6 MB
here. So there's still something else eating all the memory. /proc/meminfo
(MemFree) and numactl -H should again help.

Brice


>
>
>
> 2012/9/6 Brice Goglin  >
>
> Le 06/09/2012 12:19, Gabriele Fatigati a écrit :
>> I did't find any strange number in /proc/meminfo.
>>
>> I've noted that the program fails exactly
>> every 65479 hwloc_set_area_membind. So It sounds like some kernel
>> limit. You can check that also just one thread.
>>
>> Maybe never has not noted them  because usually we bind a large
>> amount of contiguos memory few times, instead of small and non
>> contiguos pieces of memory many and many times.. :(
>
> If you have root access, try (as root)
> watch -n 1 grep numa_policy /proc/slabinfo
> Put a sleep(10) in your program when set_area_membind() fails, and
> don't let your program exit before you can read the content of
> /proc/slabinfo.
>
> Brice
>
>
>
>
>>
>> 2012/9/6 Brice Goglin > >
>>
>> Le 06/09/2012 10:44, Samuel Thibault a écrit :
>> > Gabriele Fatigati, le Thu 06 Sep 2012 10:12:38 +0200, a écrit :
>> >> mbind hwloc_linux_set_area_membind()  fails:
>> >>
>> >> Error from HWLOC mbind: Cannot allocate memory
>> > Ok. mbind is not really supposed to allocate much memory,
>> but it still
>> > does allocate some, to record the policy
>> >
>> >> //hwloc_obj_t obj =
>> hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, tid);
>> >> hwloc_obj_t obj = hwloc_get_obj_by_type(topology,
>> HWLOC_OBJ_PU, tid);
>> >> hwloc_cpuset_t cpuset = hwloc_bitmap_dup(obj->cpuset);
>> >> hwloc_bitmap_singlify(cpuset);
>> >> hwloc_set_cpubind(topology, cpuset,
>> HWLOC_CPUBIND_THREAD);
>> >>
>> >> for( i = chunk*tid; i < len; i+=PAGE_SIZE) {
>> >> //   res =
>> hwloc_set_area_membind_nodeset(topology, &array[i],
>> PAGE_SIZE, obj->nodeset, HWLOC_MEMBIND_BIND,
>> HWLOC_MEMBIND_THREAD);
>> >>  res = hwloc_set_area_membind(topology,
>> &array[i], PAGE_SIZE, cpuset, HWLOC_MEMBIND_BIND,
>> HWLOC_MEMBIND_THREAD);
>> > and I'm afraid that calling set_area_membind for each page
>> might be too
>> > dense: the kernel is probably allocating a memory policy
>> record for each
>> > page, not being able to merge adjacent equal policies.
>> >
>>
>> It's supposed to merge VMA with same policies (from what I
>> understand in
>> the code), but I don't know if that actually works.
>> Maybe Gabriele found a kernel bug :)
>>
>> Brice
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org 
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>>
>>
>>
>>
>> -- 
>> Ing. Gabriele Fatigati
>>
>> HPC specialist
>>
>> SuperComputing Applications and Innovation Department
>>
>> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>>
>> www.cineca.it Tel:  
>> +39 051 6171722 
>>
>> g.fatigati [AT] cineca.it   
>>
>>
>> ___
>> hwloc-users mailing list
>> hwloc-us...@open-mpi.org 
>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
> ___
> hwloc-users mailing list
> hwloc-us...@open-mpi.org 
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
>
>
>
>
> -- 
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.it