Dear Jeff,

I don't think it is simply an out-of-memory condition, since the NUMA node has 48 GB and I'm allocating just 8 GB.
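For reference, here is a minimal sketch (an illustration, not code taken from this thread) of how one could double-check each NUMA node's capacity from inside the program with the hwloc 1.x API used here. Note that hwloc reports a node's total local memory, not what is currently free, so numactl --hardware / numastat remain the right tools for the latter.

/* Sketch: print each NUMA node's total local memory (hwloc 1.x API).
 * hwloc reports capacity, not current free memory. */
#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    int nbnodes = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE);
    for (int i = 0; i < nbnodes; i++) {
        hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, i);
        printf("node %u: %llu MB of local memory\n", node->os_index,
               (unsigned long long)(node->memory.local_memory >> 20));
    }

    hwloc_topology_destroy(topology);
    return 0;
}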
2012/9/5 Jeff Squyres <jsquy...@cisco.com>

> Perhaps you have simply run out of memory on that NUMA node, and therefore
> the malloc failed. Check "numactl --hardware", for example.
>
> You might want to check the output of numastat to see whether one or more of
> your NUMA nodes has run out of memory.
>
>
> On Sep 5, 2012, at 12:58 PM, Gabriele Fatigati wrote:
>
> > I've reproduced the problem in a small MPI + OpenMP code.
> >
> > The error is the same: after some memory binding, it gives "Cannot allocate memory".
> >
> > Thanks.
> >
> > 2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>
> > Downscaling the matrix size, binding works well, but the available memory
> > is enough even with the bigger matrix, so I'm a bit confused.
> >
> > Using the same big matrix size without binding, the code works well, so
> > how can I explain this behaviour?
> >
> > Maybe hwloc_set_area_membind_nodeset introduces extra allocations
> > that persist after the call?
> >
> >
> > 2012/9/5 Brice Goglin <brice.gog...@inria.fr>
> > An internal malloc failed then. That would explain why your malloc
> > failed too.
> > It looks like you malloc'ed too much memory in your program?
> >
> > Brice
> >
> >
> > On 05/09/2012 15:56, Gabriele Fatigati wrote:
> >> An update:
> >>
> >> Placing strerror(errno) after hwloc_set_area_membind_nodeset gives:
> >> "Cannot allocate memory"
> >>
> >> 2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>
> >> Hi,
> >>
> >> I've noted that hwloc_set_area_membind_nodeset returns -1, but errno is
> >> not equal to EXDEV or ENOSYS. I supposed those two cases were the only
> >> possibilities.
> >>
> >> From the hwloc documentation:
> >>
> >> -1 with errno set to ENOSYS if the action is not supported
> >> -1 with errno set to EXDEV if the binding cannot be enforced
> >>
> >> Is there any other reason the binding could fail? The available memory is enough.
> >>
> >> 2012/9/5 Brice Goglin <brice.gog...@inria.fr>
> >> Hello Gabriele,
> >>
> >> The only limit that I would think of is the available physical memory
> >> on each NUMA node (numactl -H will tell you how much of each NUMA node's
> >> memory is still available).
> >> malloc usually only fails (it returns NULL?) when there is no *virtual*
> >> memory left; that's different. If you don't allocate tons of terabytes
> >> of virtual memory, this shouldn't happen easily.
> >>
> >> Brice
> >>
> >>
> >> On 05/09/2012 14:27, Gabriele Fatigati wrote:
> >>> Dear hwloc users and developers,
> >>>
> >>> I'm using hwloc 1.4.1 in a multithreaded program on a Linux platform,
> >>> where each thread binds many non-contiguous pieces of a big matrix, making
> >>> very intensive use of the hwloc_set_area_membind_nodeset function:
> >>>
> >>> hwloc_set_area_membind_nodeset(topology, punt+offset, len, nodeset,
> >>> HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_THREAD | HWLOC_MEMBIND_MIGRATE);
> >>>
> >>> Binding seems to work well, since the function returns 0 for every call.
> >>>
> >>> The problem is that after binding, a simple small new malloc fails,
> >>> without any apparent reason.
> >>>
> >>> With memory binding disabled, the allocations work well. Is there any
> >>> known problem when hwloc_set_area_membind_nodeset is used intensively?
> >>>
> >>> Is there some operating system limit on memory page binding?
> >>>
> >>> Thanks in advance.
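For readers without the attachment, here is a minimal, single-threaded approximation of the pattern described in the quoted report (an illustrative sketch, not the MPI + OpenMP reproducer mentioned above): it binds many non-contiguous, page-aligned pieces of one large buffer to the machine's NUMA nodes in round-robin fashion, reports the first failure with strerror(errno), and then attempts a small malloc. The 1 GB buffer size and the page-sized pieces are arbitrary choices for illustration.

/* Illustrative sketch only (hwloc 1.x, as in this thread): bind many small,
 * non-contiguous areas of one buffer, then see whether a small malloc still
 * succeeds afterwards. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    int nbnodes = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE);
    if (nbnodes <= 0) {
        fprintf(stderr, "no NUMA nodes reported\n");
        return 1;
    }

    long page = sysconf(_SC_PAGESIZE);
    size_t total = (size_t)1 << 30;              /* 1 GB buffer (arbitrary) */
    char *buf;
    if (posix_memalign((void **)&buf, (size_t)page, total) != 0) {
        perror("posix_memalign");
        return 1;
    }
    memset(buf, 0, total);                       /* touch the pages */

    /* Bind every other page, alternating between the NUMA nodes, so the
     * buffer ends up split into many small, separately bound areas. */
    unsigned i = 0;
    for (size_t off = 0; off < total; off += 2 * (size_t)page, i++) {
        hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE,
                                                 i % (unsigned)nbnodes);
        int rc = hwloc_set_area_membind_nodeset(topology, buf + off,
                                                (size_t)page, node->nodeset,
                                                HWLOC_MEMBIND_BIND,
                                                HWLOC_MEMBIND_THREAD |
                                                HWLOC_MEMBIND_MIGRATE);
        if (rc < 0) {
            fprintf(stderr, "binding failed at offset %zu: %s\n",
                    off, strerror(errno));
            break;
        }
    }

    /* After the intensive area binding, try a small allocation. */
    void *small = malloc(4096);
    printf("small malloc after binding: %s\n", small ? "ok" : "FAILED");

    free(small);
    free(buf);
    hwloc_topology_destroy(topology);
    return 0;
}

Built with something like: cc sketch.c -lhwloc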
--
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it