Downsizing the array, up to 4GB, valgrind gives many warnings; they are
reported in the attached file.
2012/9/6 Gabriele Fatigati <g.fatig...@cineca.it>
> Sorry,
>
> I used the wrong hwloc installation. Using the hwloc build with the printf
> checks:
>
> mbind hwloc_linux_s
valgrind --log-file=output_valgrind --leak-check=full
--tool=memcheck --show-reachable=yes ./main_hybrid_bind_mem
2012/9/6 Gabriele Fatigati <g.fatig...@cineca.it>
> Hi Brice, hi Jeff,
>
> >Can you add some printf inside hwloc_linux_set_area_membind() in
> src/topology-linux.c to see
are", for example.
>
> You might want to check the output of numastat to see if one or more of
> your NUMA nodes have run out of memory.
>
>
> On Sep 5, 2012, at 12:58 PM, Gabriele Fatigati wrote:
>
> > I've reproduced the problem in a small MPI + OpenMP code.
> >
I've reproduced the problem in a small MPI + OpenMP code.
The error is the same: after some memory binds, it gives "Cannot allocate
memory".
Thanks.
2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>
> Downscaling the matrix size, binding works well, but the memory available
>
> On 05/09/2012 15:56, Gabriele Fatigati wrote:
>
> An update:
>
> placing strerror(errno) after hwloc_set_area_membind_nodeset gives:
> "Cannot allocate memory"
>
> 2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>
>
>> Hi,
>>
>>
An update:
placing strerror(errno) after hwloc_set_area_membind_nodeset gives:
"Cannot allocate memory"
2012/9/5 Gabriele Fatigati <g.fatig...@cineca.it>
> Hi,
>
> I've noticed that hwloc_set_area_membind_nodeset returns -1, but errno is not
> equal to EXDEV or ENOSYS
> how much of each NUMA node memory
> is still available).
> malloc usually only fails (returning NULL) when there is no *virtual*
> memory anymore, which is different. If you don't allocate tons of terabytes
> of virtual memory, this shouldn't happen easily.
>
> Brice
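For completeness, hwloc itself can report each node's total memory, though not
how much of it is free; a sketch against the hwloc 1.x API (numastat or
/sys/devices/system/node/node*/meminfo give the free part):

    #include <stdio.h>
    #include <hwloc.h>

    /* Print each NUMA node's total local memory (hwloc 1.x field names). */
    static void print_node_memory(hwloc_topology_t topology)
    {
        int i, n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NODE);
        for (i = 0; i < n; i++) {
            hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NODE, i);
            printf("node P#%u: %llu MB\n", node->os_index,
                   (unsigned long long)(node->memory.local_memory >> 20));
        }
    }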
Just one more thing:
Is the id shown in the GPU box by lstopo the same device id that the CUDA
runtime uses in functions like cudaSetDevice()?
In other words:
does GPU 1 from lstopo = GPU 1 for the CUDA runtime?
Thanks.
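For what it's worth, lstopo's PCI ordering is not guaranteed to match the CUDA
runtime's device numbering. hwloc ships helpers for exactly this mapping; a
sketch, assuming hwloc was built with CUDA runtime support:

    #include <stdio.h>
    #include <hwloc.h>
    #include <hwloc/cudart.h>

    /* Print the cpuset near CUDA device idx (cudaSetDevice numbering). */
    static void print_cuda_locality(hwloc_topology_t topology, int idx)
    {
        char buf[128];
        hwloc_cpuset_t set = hwloc_bitmap_alloc();
        if (hwloc_cudart_get_device_cpuset(topology, idx, set) == 0) {
            hwloc_bitmap_snprintf(buf, sizeof buf, set);
            printf("CUDA device %d is close to cpuset %s\n", idx, buf);
        }
        hwloc_bitmap_free(set);
    }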
2012/8/29 Gabriele Fatigati <g.fatig...@cineca.it>
> Good
Good!
Now it works well.
Many thanks!
2012/8/28 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Tue 28 Aug 2012 18:10:41 +0200:
> > How can the cuda branch help me? The lstopo output of that branch is the
> > same as the trunk's.
>
> You need
> > Even if you have a dual Xeon X56xx Westmere machine, there
> > are plenty of such platforms where the GPU is indeed connected to both
> > sockets. Or it could be a buggy BIOS.
>
> Agreed.
>
> Samuel
Dear hwloc users,
I'm using hwloc 1.5. I would like to see how the GPUs are connected to the
processor sockets using the lstopo command.
I attach the figure. The system has two GPUs, but I don't understand how to
find that information from the PCI boxes.
Thanks in advance.
--
Ing. Gabriele Fatigati
HPC
On 25/09/2011 12:41, Gabriele Fatigati wrote:
>
>
>> * doing two set_area_membind on the same entire array is useless, the
> second one will overwrite the first one.
>
> But set_area_membind is for memory in general, not for a particular
> malloc. ( I
2011/9/25 Gabriele Fatigati <g.fatig...@cineca.it>
>
>> * doing two set_area_membind on the same entire array is useless, the
> second one will overwrite the first one.
>
> But set_area_membind is for memory in general, not for a particular malloc.
> ( Is it right?
locations, and set_area_membind for thread 2 for future
allocations.
The set_membind done by thread 2 has no relation to the malloc(array) done by
the first thread, so why does it influence the real allocation of this array?
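What the overwrite means in practice, as a sketch (topology, array and size as
above; nodes 0 and 1 assumed to exist):

    hwloc_nodeset_t nodeset = hwloc_bitmap_alloc();

    /* First binding: pages of array target node 0. */
    hwloc_bitmap_only(nodeset, 0);
    hwloc_set_area_membind_nodeset(topology, array, size, nodeset,
                                   HWLOC_MEMBIND_BIND, 0);

    /* Second binding on the same range replaces the first policy:
       untouched pages now go to node 1; already-touched pages move
       only if HWLOC_MEMBIND_MIGRATE is passed. */
    hwloc_bitmap_only(nodeset, 1);
    hwloc_set_area_membind_nodeset(topology, array, size, nodeset,
                                   HWLOC_MEMBIND_BIND, 0);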
2011/9/25 Brice Goglin <brice.gog...@inria.fr>
> On 25/09/2011 12:
because in the second
example only first touch appears to have some effect, independently of which
hwloc function I'm using.
Sorry, but it is quite difficult to understand.. :(
2011/9/25 Brice Goglin <brice.gog...@inria.fr>
> On 25/09/2011 11:14, Gabriele Fatigati wrote:
all future allocations without calling this function each time
I allocate some memory. Is it possible to do this?
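That is what hwloc_set_membind is for; a sketch binding the calling thread's
future allocations to node 0 (assumed to exist):

    hwloc_nodeset_t nodeset = hwloc_bitmap_alloc();
    hwloc_bitmap_only(nodeset, 0);
    /* Every later allocation by this thread is taken from node 0,
       effective when the pages are first touched. */
    hwloc_set_membind_nodeset(topology, nodeset, HWLOC_MEMBIND_BIND,
                              HWLOC_MEMBIND_THREAD);
    hwloc_bitmap_free(nodeset);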
2011/9/22 Brice Goglin <brice.gog...@inria.fr>
> On 22/09/2011 12:20, Gabriele Fatigati wrote:
> > NUMA node(s) near the specified cpuset.
> >
> > What do
free memory on the nodes decreases only on the node where the second
thread is. Is it right?
Does hwloc_set_membind affect all future allocations?
Thanks in advance.
--
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno
What you
should now see with get_cpubind is that process X is now bound to cores
A+B, thread Y to B, and all other threads to A.
2011/9/12 Brice Goglin <brice.gog...@inria.fr>
> On 12/09/2011 14:17, Gabriele Fatigati wrote:
> > Mm, and why? In a hybrid code ( MPI + Ope
In a hybrid code (MPI + OpenMP), my idea is to bind a single
MPI process to one core, and its threads to other cores. Otherwise I have
all threads running on a single core..
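A sketch of that scheme, one OpenMP thread per core (assumes at least as many
cores as threads; error checks omitted):

    #include <hwloc.h>
    #include <omp.h>

    static void bind_my_threads(hwloc_topology_t topology)
    {
    #pragma omp parallel
        {
            hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE,
                                                     omp_get_thread_num());
            if (core)
                hwloc_set_cpubind(topology, core->cpuset, HWLOC_CPUBIND_THREAD);
        }
    }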
2011/9/12 Brice Goglin <brice.gog...@inria.fr>
> On 12/09/2011 13:58, Gabriele Fatigati wrote:
>
>
When the process and thread are on the same NUMA node, it works well, even on
different cores.
If the NUMA node of the process is different from the NUMA node of the threads,
there is a problem.
2011/9/12 Brice Goglin <brice.gog...@inria.fr>
> On 12/09/2011 13:29, Gabriele Fatigati wrote:
>
> Hi
y that would cause a segfault when checking).
>
> If you really need something like this, put an integer value next to the
> topology variable, and set it to 0 or 1 depending on whether the topology
> was initialized or not.
>
> Brice
>
>
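A sketch of that suggestion, with illustrative names:

    static hwloc_topology_t topology;
    static int topology_initialized = 0;  /* the integer on the side */

    static void ensure_topology(void)
    {
        if (!topology_initialized) {
            hwloc_topology_init(&topology);
            hwloc_topology_load(topology);
            topology_initialized = 1;
        }
    }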
> - Reply message -
when declaring the variable. It will be changed
> into something else when init() is called.
>
> Brice
>
> - Reply message -
> De : "Gabriele Fatigati" <g.fatig...@cineca.it>
> Pour : "Hardware locality user list" <hwloc-us...@open-mpi.org>
>
--
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it  Tel: +39 051 6171722
g.fatigati [AT] cineca.it
> your
> application's fault.
> Brice
>
>
>
> On 13/08/2011 10:37, Gabriele Fatigati wrote:
>
>
>
> Dear hwloc users and developers,
>
> I'm using the hwloc 1.2 stable version, compiled with Intel 11, and checking my
> little application with valgrind
(bind.c:396)
==2904==    by 0x401CBB: bind_memory_tonode (main.c:97)
valgrind was run with the --tool=memcheck --leak-check=full flags.
It gives me the same warning even with just one byte of memory bound.
Is it a hwloc warning or my application's warning?
Thanks in advance.
--
Ing. Gabriele Fatigati
Of course,
with gettid() it works well.
Thanks so much!
2011/8/11 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Thu 11 Aug 2011 18:05:25 +0200:
> > char* bitmap_string=(char*)malloc(256);
> >
> > hwloc_bitmap
tring, tid);
--
2011/8/11 Gabriele Fatigati <g.fatig...@cineca.it>
> Hi Samuel,
>
> I'm using it in an OpenMP parallel region:
>
>
> -
>
> char* bitmap_string=(char*)malloc(256);
>
> hwloc_bitmap_t set = hwloc_bitmap_alloc
%d \n", bitmap_string[0], tid);
2011/8/11 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Thu 11 Aug 2011 10:32:23 +0200:
> > I'm using hwloc-1.3a1r3606. Now hwloc_get_last_cpu_location() works
> well:
> >
Ok,
thanks!
2011/8/10 Samuel Thibault <samuel.thiba...@inria.fr>
> Samuel Thibault wrote on Wed 10 Aug 2011 16:24:39 +0200:
> > Gabriele Fatigati wrote on Wed 10 Aug 2011 16:13:27 +0200:
> > > there is something wrong. I'm using two threads
CPU 2 and 10 working, so the binding has worked
well.
2011/8/10 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Wed 10 Aug 2011 15:41:19 +0200:
> > hwloc_cpuset_t set = hwloc_bitmap_alloc();
> >
> > int return_value = hwloc_get_last_cpu_location(top
So, CPU 0 I suppose, but it is not where I bound my thread.. :(
2011/8/10 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Wed 10 Aug 2011 15:29:43 +0200:
> > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_MACHINE, 0);
process/threads
run. Is it right?
2011/8/10 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Wed 10 Aug 2011 09:35:19 +0200:
> > these lines don't work:
> >
> > set = hwloc_bitmap_alloc();
> > hwloc_get_cpubind(topology, set, 0);
> >
, and hwloc_get_last_cpu_location()
gives me the CPU index where the process/thread actually runs, in the cpuset
passed. Is that right? The philosophy of these functions is
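The distinction, sketched: get_cpubind returns the binding (where the thread is
allowed to run), while get_last_cpu_location returns where it actually ran last:

    char a[128], b[128];
    hwloc_bitmap_t bound = hwloc_bitmap_alloc();
    hwloc_bitmap_t where = hwloc_bitmap_alloc();
    hwloc_get_cpubind(topology, bound, HWLOC_CPUBIND_THREAD);
    hwloc_get_last_cpu_location(topology, where, HWLOC_CPUBIND_THREAD);
    hwloc_bitmap_snprintf(a, sizeof a, bound);
    hwloc_bitmap_snprintf(b, sizeof b, where);
    printf("bound to %s, last ran on %s\n", a, b);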
2011/8/9 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Tue 09 Aug 2011 18:14:55 +0200:
> > hwloc_get_cpu
ple to use it?
--
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it  Tel: +39 051 6171722
g.fatigati [AT] cineca.it
> There is no difference concerning the cpuset.
Does it mean they have the same logical index?
2011/8/9 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Tue 09 Aug 2011 16:58:33 +0200:
> > in a non SMT machine, what's the difference betw
Dear hwloc users,
on a non-SMT machine, what is the difference between HWLOC_OBJ_CORE
and HWLOC_OBJ_PU?
Can I exchange one for the other?
Thanks.
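A quick way to see the relation on a given machine, as a sketch:

    int ncores = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
    int npus   = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
    /* Without SMT each core contains exactly one PU, so the counts match;
       with SMT (Hyper-Threading) there are several PUs per core. */
    printf("%d cores, %d PUs\n", ncores, npus);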
--
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
Well,
now it's clearer.
Thanks for the information!
Regards.
2011/8/4 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Thu 04 Aug 2011 16:56:22 +0200:
> > L#0 and L#1 are physically near because hwloc considers the shared cache map
> > when
Are L#0 and L#1 physically near because hwloc considers the shared cache map
when building the topology? Because if not, I don't know how hwloc understands
the physical proximity of cores :(
2011/8/4 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Thu 04 Aug 2011 16:35:36 +0200
2011/8/4 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Thu 04 Aug 2011 16:14:35 +0200:
> > Socket:
> >  ________________
> > |  ____    ____  |
> > | |core|  |core| |
> > |________________|
in a single socket are
physically near.
2011/8/4 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Thu 04 Aug 2011 15:52:09 +0200:
> > how is the topology given by lstopo built? In particular, how are the logical
> > indexes P# initialized
ind(*topology, set, HWLOC_CPUBIND_THREAD |
HWLOC_CPUBIND_NOMEMBIND);
}
2011/8/2 Gabriele Fatigati <g.fatig...@cineca.it>
> Mm, I'm not sure. Suppose this:
>
> #pragma omp parallel num_threads(1)
> {
> hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD
2011/8/2 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Tue 02 Aug 2011 16:23:12 +0200:
> > hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_THREAD |
> HWLOC_CPUBIND_STRICT
> > | HWLOC_CPUBIND_NOMEMBIND);
> >
> > is it possible to do multiple calls to
);
hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_STRICT);
hwloc_set_cpubind(*topology, set, HWLOC_CPUBIND_NOMEMBIND);
or does only the last one have effect?
Thanks in advance.
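The flags are a bitmask, so the usual pattern is a single call with the flags
OR'ed together; each call performs a full rebind, so of several successive
calls only the last binding stays in effect:

    hwloc_set_cpubind(*topology, set,
                      HWLOC_CPUBIND_THREAD | HWLOC_CPUBIND_STRICT |
                      HWLOC_CPUBIND_NOMEMBIND);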
--
Ing. Gabriele Fatigati
Parallel programmer
CINECA Systems & Technologies Department
Supercomputing Group
Via Magnan
2011/8/1 Brice Goglin <brice.gog...@inria.fr>
> "PU P#0" means "PU object with physical index 0".
> "P#" prefix means "physical index".
> "L#" prefix means "logical index" (the one you want to use in
> get_obj_by_
Hi Brice,
so, if I understand well, PU P# numbers are not the same as the ones
specified with the HWLOC_OBJ_PU type?
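The relation between the two numberings, sketched: get_obj_by_type takes the
logical index (L#), and the physical index (P#) is stored in the object itself:

    hwloc_obj_t pu = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, 0); /* L#0 */
    printf("PU L#%u is PU P#%u\n", pu->logical_index, pu->os_index);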
2011/8/1 Brice Goglin <brice.gog...@inria.fr>
> On 01/08/2011 12:16, Gabriele Fatigati wrote:
> > Hi,
> >
> > reading the hwloc-v1.2-a4 manual, on page 15, I saw an
consecutive and not
exclusive, I suppose it is better and safer to use the PU id. Or not?
2011/7/29 Samuel Thibault <samuel.thiba...@inria.fr>
> Gabriele Fatigati wrote on Fri 29 Jul 2011 13:24:17 +0200:
> > Thanks for your quick reply!
> >
> > But I have a little doubt. I
2011/7/29 Samuel Thibault <samuel.thiba...@inria.fr>
> Hello,
>
> Gabriele Fatigati wrote on Fri 29 Jul 2011 12:43:47 +0200:
> > I'm so confused. I see pairs of cores with the same core id! (Core#8,
> > for example.) How is it possible?
>
> Th
with:
hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD);
and it crashes with:
hwloc_set_thread_cpubind(topology, tid, set, HWLOC_CPUBIND_THREAD);
Thanks in advance.
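One thing worth checking (an assumption, since the crash itself is not shown
here): hwloc_set_thread_cpubind takes a hwloc_thread_t, which is pthread_t on
Unix, so passing an OpenMP thread number or a kernel tid in that slot can
crash. For the current thread these two calls should be equivalent:

    hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD);
    hwloc_set_thread_cpubind(topology, pthread_self(), set, 0);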
--
Ing. Gabriele Fatigati
Parallel programmer
CINECA Systems & Technologies Department
Supercomputing Group
Via Magnanelli 6/3, Casalecchi