Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Gabriele Fatigati
Well,

now it is clearer.

Thanks for the information!

Regards.

2011/8/4 Samuel Thibault 

> Gabriele Fatigati, on Thu 04 Aug 2011 16:56:22 +0200, wrote:
> > L#0 and L#1 are physically near because hwloc considers the shared-cache
> > map when building the topology?
>
> Yes. That's the whole point of sorting objects topologically first, and
> numbering them afterwards. See the glossary entry for "logical index":
>
> “The ordering is based on topology first, and then on OS CPU numbers”
>
> I.e. OS CPU numbers are only used when no topology information (shared
> cache etc.) provides any better sorting.
>
> > Because if not, I don't know how hwloc understands the physical
> > proximity of cores :(
>
> Physical proximity of cores does not mean logical proximity. Cores can
> be next to each other and still share no cache at all. Forget the
> expression "physical proximity"; it does not provide any interesting
> information. What matters is logical proximity, and that is *precisely*
> what logical indexes express.
>
> Samuel



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Samuel Thibault
Gabriele Fatigati, on Thu 04 Aug 2011 16:56:22 +0200, wrote:
> L#0 and L#1 are physically near because hwloc considers the shared-cache map
> when building the topology?

Yes. That's the whole point of sorting objects topologically first, and
numbering them afterwards. See the glossary entry for "logical index":

“The ordering is based on topology first, and then on OS CPU numbers”

I.e. OS CPU numbers are only used when no topology information (shared
cache etc.) provides any better sorting.

> Because if not, I don't know how hwloc understands the physical
> proximity of cores :(

Physical proximity of cores does not mean logical proximity. Cores can
be next to each other and still share no cache at all. Forget the
expression "physical proximity"; it does not provide any interesting
information. What matters is logical proximity, and that is *precisely*
what logical indexes express.
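
For instance, a minimal sketch (error checking omitted) that walks the PUs in
logical order and prints how each logical index L# maps to the OS/physical
index P#:

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* PUs come back in topological (logical) order; os_index is the P#
       that the operating system uses. */
    int n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
    for (int i = 0; i < n; i++) {
        hwloc_obj_t pu = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, i);
        printf("PU L#%u -> P#%u\n", pu->logical_index, pu->os_index);
    }

    hwloc_topology_destroy(topology);
    return 0;
}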

Samuel


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Gabriele Fatigati
L#0 and L#1 are physically near because hwloc considers the shared-cache map
when building the topology? Because if not, I don't know how hwloc understands
the physical proximity of cores :(

2011/8/4 Samuel Thibault 

> Gabriele Fatigati, on Thu 04 Aug 2011 16:35:36 +0200, wrote:
> > so it is not true that physical OS indexes 0 and 1 are physically near on
> > the die.
>
> They quite often aren't. See the updated glossary of the documentation:
>
> "The index that the operating system (OS) uses to identify the object.
> This may be completely arbitrary, non-unique, non-contiguous, not
> representative of proximity, and may depend on the BIOS configuration."
>
> > Considering that, how can I exploit cache locality and cache sharing
> > between cores if I don't know where my threads will physically be bound?
>
> By using logical indexes, not physical indexes. And almost all hwloc
> functions use logical indexes, not physical indexes.
>
> > If L#0 and L#1, where I bind my threads, are physically far apart, that
> > may give me bad performance.
>
> L#0 and L#1 are physically near, that's precisely the whole point of
> hwloc: it provides you with *logical* indexes which express proximity,
> instead of the P#0 and P#1 physical/OS indexes, which are quite often
> simply arbitrary.
>
> Samuel



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Samuel Thibault
Gabriele Fatigati, on Thu 04 Aug 2011 16:35:36 +0200, wrote:
> so it is not true that physical OS indexes 0 and 1 are physically near on the die.

They quite often aren't. See the updated glossary of the documentation:

"The index that the operating system (OS) uses to identify the object.
This may be completely arbitrary, non-unique, non-contiguous, not
representative of proximity, and may depend on the BIOS configuration."

> Considering that, how can I exploit cache locality and cache sharing between
> cores if I don't know where my threads will physically be bound?

By using logical indexes, not physical indexes. And almost all hwloc
functions use logical indexes, not physical indexes.
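
For instance, a minimal sketch of the difference (assuming a loaded topology;
hwloc_get_pu_obj_by_os_index is one of the standard hwloc helpers):

/* Logical lookup: the second PU in topological order. */
hwloc_obj_t by_logical = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, 1);
/* Physical lookup: whatever PU the OS happens to call "1". */
hwloc_obj_t by_os = hwloc_get_pu_obj_by_os_index(topology, 1);
/* On machines with arbitrary OS numbering these may be different objects. */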

> If L#0 and L#1, where I bind my threads, are physically far apart, that may
> give me bad performance.

L#0 and L#1 are physically near, that's precisely the whole point of
hwloc: it provides you with *logical* indexes which express proximity,
instead of the P#0 and P#1 physical/OS indexes, which are quite often
simply arbitrary.

Samuel


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Gabriele Fatigati
Ok,

so it is not true that physical OS indexes 0 and 1 are physically near on the die.

Considering that, how can I exploit cache locality and cache sharing between cores
if I don't know where my threads will physically be bound?

If L#0 and L#1, where I bind my threads, are physically far apart, that may give me
bad performance.

2011/8/4 Samuel Thibault 

> Gabriele Fatigati, on Thu 04 Aug 2011 16:14:35 +0200, wrote:
> > Socket:
> >  ______________________
> > |                      |
> > |  |core|      |core|  |
> > |                      |
> > |  |core|      |core|  |
> > |                      |
> > |  |core|      |core|  |
> > |______________________|
> >
> > How does lstopo create the numbering?
>
> It does not really matter since there is no topology consideration here
> (no shared cache or such).  In that case hwloc will simply follow the
> order as provided by the OS. If there were shared caches, they would
> come into play when sorting the topology.
>
> > Does it consider the physical OS index when listing cores and creating
> > the topology? If yes, maybe Core L#0 and Core L#1 in a single socket are
> > physically near.
>
> Mmm, maybe the confusion comes from the expression "physically near".
> What we call a physical index has nothing to do with physical proximity.
> It's just what the physical chip reports, which often has nothing to do
> with physical proximity.
>
> There is nothing very fancy in the topology creation, really: we simply
> nest objects one into the other according to logical inclusion, and
> finally sort by OS (i.e. physical) index after everything is topology-sorted.
>
> Samuel



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Gabriele Fatigati
Ok,

but I don't understand how lstopo works. Suppose the layout of my cores on the
physical die (no SMT) is like this:

Socket:
 ______________________
|                      |
|  |core|      |core|  |
|                      |
|  |core|      |core|  |
|                      |
|  |core|      |core|  |
|______________________|

How does lstopo create the numbering? (Sorry for the horrible figure.) Where
does the numbering start? Does it consider the physical OS index when listing
cores and creating the topology? If yes, maybe Core L#0 and Core L#1 in a
single socket are physically near.



2011/8/4 Samuel Thibault 

> Gabriele Fatigati, on Thu 04 Aug 2011 15:52:09 +0200, wrote:
> > How is the topology given by lstopo built? In particular, how are the
> > logical indexes P# initialized?
>
> P# are not logical indexes, they are physical indexes, as displayed in
> /proc/cpuinfo & such.
>
> The logical indexes, L#, displayed when passing the -l option to lstopo,
> are numbered simply linearly, after having sorted the PUs according to
> topology.
>
> Samuel



-- 
Ing. Gabriele Fatigati

HPC specialist

SuperComputing Applications and Innovation Department

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Samuel Thibault
Gabriele Fatigati, on Thu 04 Aug 2011 15:52:09 +0200, wrote:
> How is the topology given by lstopo built? In particular, how are the logical
> indexes P# initialized?

P# are not logical indexes, they are physical indexes, as displayed in
/proc/cpuinfo & such.

The logical indexes, L#, displayed when passing the -l option to lstopo,
are numbered simply linearly, after having sorted the PUs according to
topology.

Samuel


Re: [hwloc-users] Thread core affinity

2011-08-04 Thread Samuel Thibault
Hello,

Gabriele Fatigati, on Mon 01 Aug 2011 12:32:44 +0200, wrote:
> So they are not physically near. I expect that with Hyperthreading, and 2 hardware
> threads per core, PU P#0 and PU P#1 are on the same core.

Since these are P#0 and P#1 (physical indexes), they may indeed not be.
That's the whole problem with the indexes provided by operating systems.

Fortunately,

> If that is not true, using an OpenMP parallel region with 2 software threads:
> 
> #pragma omp parallel num_threads(2)
> 
> int tid = omp_get_thread_num();
> 
> hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
> hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
> hwloc_bitmap_singlify(set);
> 
> hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);
> 
> 
> 
> I would bind thread 0 to PU P#0 and thread 1 to PU P#1, supposing they are
> physically near.

No, because hwloc functions do not use physical indexes but logical indexes,
which hwloc computes according to the topology. Use lstopo --top to check
the actual binding being used.
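
For instance, while your bound program is running:

lstopo --top

displays the topology with the running tasks drawn on the PUs they are bound
to, so the binding can be checked visually.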

Samuel


Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
It's just a coincidence. Most modern machines (many of them NUMA)
have non-sequential numbering (to maximize memory bandwidth in the dumb
cases).

Brice




On 01/08/2011 15:29, Gabriele Fatigati wrote:
> Ok,
>
> now it is clearer. Just a little question: why are PU# sequential on a NUMA
> machine (page 17) and not sequential on a non-NUMA machine (page 16)?



Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
You're confusing object types with index types.

PU is an object type, like Core, Socket, ... "logical processor" is a
generic name for cores when there's no SMT, hardware threads when
there's SMT/Hyperthreading, ... PU is basically the smallest thing that
can run a software thread.

"P#" is just the way you're numbering object, it works for PU and for
other object types.

Any object of any type can be identified through a unique logical index,
and a possibly non-unique physical index.

We don't often use the name "logical processor" because it's indeed
confusing. "Processing Unit" is less confusing, that's why it's the
official name for the smallest objects in hwloc.
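
A small sketch that makes the distinction concrete (assuming a topology that
has already been loaded):

/* With SMT, each core contains several PUs; without SMT they match 1:1. */
int ncores = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);
int npus = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);
printf("%d cores, %d PUs%s\n", ncores, npus,
       npus > ncores ? " (SMT enabled)" : "");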

Brice







On 01/08/2011 15:04, Gabriele Fatigati wrote:
> Hi Brice,
>
> you said:
>
> "PU P#0" means "PU object with physical index 0".
> "P#" prefix means "physical index".
>
> But from the hwloc manual, page 58:
>
> HWLOC_OBJ_PU: Processing Unit, or (Logical) Processor.
>
> but that is in conflict with what you said :(



Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Gabriele Fatigati
Hi Brice,

you said:

"PU P#0" means "PU object with physical index 0".
"P#" prefix means "physical index".

But from the hwloc manual, page 58:

HWLOC_OBJ_PU: Processing Unit, or (Logical) Processor.

but that is in conflict with what you said :(


2011/8/1 Brice Goglin 

> "PU P#0" means "PU object with physical index 0".
> "P#" prefix means "physical index".
> "L#" prefix means "logical index" (the one you want to use in
> get_obj_by_type).
> Use -l or -p to switch from one to the other in lstopo.
>
> Brice


-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Tecnologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
"PU P#0" means "PU object with physical index 0".
"P#" prefix means "physical index".
"L#" prefix means "logical index" (the one you want to use in
get_obj_by_type).
Use -l or -p to switch from one to the other in lstopo.
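
For example:

lstopo -l    (print logical indexes, L#)
lstopo -p    (print physical/OS indexes, P#)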

Brice



On 01/08/2011 14:47, Gabriele Fatigati wrote:
> Hi Brice,
>
> so, if I understand well, PU P# numbers are not the same as the ones
> specified with the HWLOC_OBJ_PU flag?



Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Samuel Thibault
Gabriele Fatigati, on Mon 01 Aug 2011 14:48:11 +0200, wrote:
> so, if I understand well, PU P# numbers are not the same as the ones
> specified with the HWLOC_OBJ_PU flag?

They are, in the os_index (aka physical index) field.
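
For instance, a minimal sketch (given a loaded topology):

hwloc_obj_t pu = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, 2); /* PU L#2 */
printf("PU L#%u is P#%u\n", pu->logical_index, pu->os_index);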

Samuel


Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Gabriele Fatigati
Hi Brice,

so, if I understand well, PU P# numbers are not the same as the ones specified
with the HWLOC_OBJ_PU flag?

2011/8/1 Brice Goglin 

> On 01/08/2011 12:16, Gabriele Fatigati wrote:
> > Hi,
> >
> > reading the hwloc-v1.2-a4 manual, on page 15, I see an example
> > with a 4-socket 2-core machine with hyperthreading.
> >
> > Core ids are not unique, as said before. PU ids are unique but
> > not physically sequential (I suppose).
> >
> > PU P#0 is in socket P#0 on Core P#0. PU P#1 is in another socket!
>
> These indexes are "physical indexes" (that's the default in the
> graphical lstopo output). But we may want to make that clearer in the doc.
>
> Brice
>
>


-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Tecnologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-08-01 Thread Brice Goglin
On 01/08/2011 12:16, Gabriele Fatigati wrote:
> Hi,
>
> reading the hwloc-v1.2-a4 manual, on page 15, I see an example
> with a 4-socket 2-core machine with hyperthreading.
>
> Core ids are not unique, as said before. PU ids are unique but
> not physically sequential (I suppose).
>
> PU P#0 is in socket P#0 on Core P#0. PU P#1 is in another socket!

These indexes are "physical indexes" (that's the default in the
graphical lstopo output). But we may want to make that clearer in the doc.

Brice



Re: [hwloc-users] Thread core affinity

2011-07-29 Thread Samuel Thibault
Gabriele Fatigati, on Fri 29 Jul 2011 13:34:29 +0200, wrote:
> I forgot to tell you that this code block is inside a parallel OpenMP region.
> This is the complete code:
> 
> #pragma omp parallel num_threads(6)
> {
> int tid = omp_get_thread_num();
> 
> hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid);
> 
> and other code block is:
> 
> #pragma omp parallel num_threads(6)
> {
> int tid = omp_get_thread_num();
> 
> hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);

Ok, so it depends on whether you want to put your OpenMP threads on
separate cores (then the first code, which distributes among cores), or
if you're OK with letting them share a core (then the second code, which
distributes among PUs).

Maybe try and run lstopo --top to see the result.

Samuel


Re: [hwloc-users] Thread core affinity

2011-07-29 Thread Gabriele Fatigati
Sorry,

I forgot to tell you that this code block is inside a parallel OpenMP region.
This is the complete code:

#pragma omp parallel num_threads(6)
{
int tid = omp_get_thread_num();

hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid);
hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
hwloc_bitmap_singlify(set);

hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);

hwloc_bitmap_free(set);

}

and other code block is:

#pragma omp parallel num_threads(6)
{
int tid = omp_get_thread_num();

hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
hwloc_bitmap_singlify(set);

hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);

hwloc_bitmap_free(set);

}


The goal is to bind threads physically as near as possible, one thread per
core. Since the core ids reported by hwloc-hello.c are not consecutive and not
unique, I suppose it is better and safer to use PU ids. Or not?



2011/7/29 Samuel Thibault 

> Gabriele Fatigati, on Fri 29 Jul 2011 13:24:17 +0200, wrote:
> > Thanks for your quick reply!
> >
> > But I have a little doubt: on a non-SMT machine, is it better to use this:
> >
> > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid);
> > hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
> > hwloc_bitmap_singlify(set);
> > hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);
> >
> > or:
> >
> > hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
> > hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
> > hwloc_bitmap_singlify(set);
> > hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);
> >
> > because they work in the same way (I suppose).
>
> They'll both work about the same way on SMT too, since in the end it'll
> pick only one PU. Whether you want to assign threads to cores or to PUs
> then depends on your application: do you want to let its threads share a
> core or not?
>
> Samuel



-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Tecnologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-07-29 Thread Samuel Thibault
Gabriele Fatigati, on Fri 29 Jul 2011 13:24:17 +0200, wrote:
> Thanks for your quick reply!
> 
> But I have a little doubt: on a non-SMT machine, is it better to use this:
> 
> hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid);
> hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
> hwloc_bitmap_singlify(set);
> hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);
> 
> or:
> 
> hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
> hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
> hwloc_bitmap_singlify(set);
> hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);
> 
> because they work in the same way (I suppose).

They'll both work about the same way on SMT too, since in the end it'll
pick only one PU. Whether you want to assign threads to cores or to PUs
then depends on your application: do you want to let its threads share a
core or not?
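
A possible variant sketch: if you would rather pin each thread to a core but
let it float among that core's hardware threads, bind to the whole core cpuset
and skip the singlify step:

hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid);
/* No hwloc_bitmap_singlify(): the OS may migrate the thread between the
   core's SMT siblings, but never to another core. */
hwloc_set_cpubind(topology, core->cpuset, HWLOC_CPUBIND_THREAD);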

Samuel


Re: [hwloc-users] Thread core affinity

2011-07-29 Thread Gabriele Fatigati
Hi Samuel,

Thanks for your quick reply!

But I have a little doubt: on a non-SMT machine, is it better to use this:

hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, tid);
hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
hwloc_bitmap_singlify(set);
hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);

or:

hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, tid);
hwloc_cpuset_t set = hwloc_bitmap_dup(core->cpuset);
hwloc_bitmap_singlify(set);
hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);

because they work in the same way (I suppose).

2011/7/29 Samuel Thibault 

> Hello,
>
> Gabriele Fatigati, on Fri 29 Jul 2011 12:43:47 +0200, wrote:
> > I'm so confused: I see pairs of cores with the same core id (Core#8,
> > for example). How is that possible?
>
> That's because they are on different sockets. These are physical IDs
> (not logical IDs), and are thus not guaranteed to be unique.
>
> > 2) Logical core ids and physical core ids may be different. If I want to
> > be sure that ids 0 and 1 are physically near, should I use core ids or PU
> > ids? Are adjacent PU ids always physically near?
>
> Using core or thread IDs does not matter. What matters is that you take
> the proper kind of ID. Physical IDs will in general never bring you any
> proximity indication. What you want is logical IDs, which hwloc computes
> so that they express proximity. Using adjacent logical IDs (be it for
> cores or threads) will bring you adjacent cores/threads.
>
> > 3) When binding a thread to a core, what's the difference between
> > hwloc_set_cpubind() and hwloc_set_thread_cpubind()? More specifically, my
> > code example works well with:
> >
> > hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD);
> >
> > and crashes with:
> >
> > hwloc_set_thread_cpubind(topology, tid, set, HWLOC_CPUBIND_THREAD);
>
> Note that tid is hwloc_thread_t, i.e. pthread_t on unixes.
> It is not a (Linux-specific) tid. If what you have is a (Linux-specific)
> tid, use the Linux-specific function, hwloc_linux_set_tid_cpubind.
>
> Samuel



-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Tecnologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it


Re: [hwloc-users] Thread core affinity

2011-07-29 Thread Samuel Thibault
Hello,

Gabriele Fatigati, on Fri 29 Jul 2011 12:43:47 +0200, wrote:
> I'm so confused: I see pairs of cores with the same core id (Core#8, for
> example). How is that possible?

That's because they are on different sockets. These are physical IDs
(not logical IDs), and are thus not guaranteed to be unique.

> 2) Logical core ids and physical core ids may be different. If I want to be
> sure that ids 0 and 1 are physically near, should I use core ids or PU ids?
> Are adjacent PU ids always physically near?

Using core or thread IDs does not matter. What matters is that you take
the proper kind of ID. Physical IDs will in general never bring you any
proximity indication. What you want is logical IDs, which hwloc computes
so that they express proximity. Using adjacent logical IDs (be it for
cores or threads) will bring you adjacent cores/threads.

> 3) When binding a thread to a core, what's the difference between
> hwloc_set_cpubind() and hwloc_set_thread_cpubind()? More specifically, my
> code example works well with:
> 
> hwloc_set_cpubind(topology, set, HWLOC_CPUBIND_THREAD);
> 
> and crashes with:
> 
> hwloc_set_thread_cpubind(topology, tid, set, HWLOC_CPUBIND_THREAD);

Note that tid is hwloc_thread_t, i.e. pthread_t on unixes.
It is not a (Linux-specific) tid. If what you have is a (Linux-specific)
tid, use the Linux-specific function, hwloc_linux_set_tid_cpubind.
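
For instance, a sketch of the pthread variant where each thread binds itself
(assumes <pthread.h> is included):

/* hwloc_thread_t is pthread_t on Unix, so pass a real pthread handle: */
hwloc_set_thread_cpubind(topology, pthread_self(), set, 0);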

Samuel


[hwloc-users] Thread core affinity

2011-07-29 Thread Gabriele Fatigati
Dear hwloc users,

I have some questions about thread core affinity managed by hwloc.

1) The simple hwloc-hello.c program from the manual gives me the following
results on my machine:


*** Objects at level 0
Index 0: Machine#0(47GB)
*** Objects at level 1
Index 0: NUMANode#0(24GB)
Index 1: NUMANode#1(24GB)
*** Objects at level 2
Index 0: Socket#0
Index 1: Socket#1
*** Objects at level 3
Index 0: L3(12MB)
Index 1: L3(12MB)
*** Objects at level 4
Index 0: L2(256KB)
Index 1: L2(256KB)
Index 2: L2(256KB)
Index 3: L2(256KB)
Index 4: L2(256KB)
Index 5: L2(256KB)
Index 6: L2(256KB)
Index 7: L2(256KB)
Index 8: L2(256KB)
Index 9: L2(256KB)
Index 10: L2(256KB)
Index 11: L2(256KB)
*** Objects at level 5
Index 0: L1(32KB)
Index 1: L1(32KB)
Index 2: L1(32KB)
Index 3: L1(32KB)
Index 4: L1(32KB)
Index 5: L1(32KB)
Index 6: L1(32KB)
Index 7: L1(32KB)
Index 8: L1(32KB)
Index 9: L1(32KB)
Index 10: L1(32KB)
Index 11: L1(32KB)
*** Objects at level 6
Index 0: Core#0
Index 1: Core#1
Index 2: Core#2
Index 3: Core#8
Index 4: Core#9
Index 5: Core#10
Index 6: Core#0
Index 7: Core#1
Index 8: Core#2
Index 9: Core#8
Index 10: Core#9
Index 11: Core#10
*** Objects at level 7
Index 0: PU#0
Index 1: PU#1
Index 2: PU#2
Index 3: PU#3
Index 4: PU#4
Index 5: PU#5
Index 6: PU#6
Index 7: PU#7
Index 8: PU#8
Index 9: PU#9
Index 10: PU#10
Index 11: PU#11


I'm so confused: I see pairs of cores with the same core id (Core#8, for
example). How is that possible?

2) Logical core ids and physical core ids may be different. If I want to be
sure that ids 0 and 1 are physically near, should I use core ids or PU ids?
Are adjacent PU ids always physically near?

3) When binding a thread to a core, what's the difference between
hwloc_set_cpubind() and hwloc_set_thread_cpubind()? More specifically, my code
example works well with:

hwloc_set_cpubind(topology, set,  HWLOC_CPUBIND_THREAD);

and crashes with:

hwloc_set_thread_cpubind(topology, tid, set,  HWLOC_CPUBIND_THREAD);

Thanks in advance.


-- 
Ing. Gabriele Fatigati

Parallel programmer

CINECA Systems & Tecnologies Department

Supercomputing Group

Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy

www.cineca.it    Tel: +39 051 6171722

g.fatigati [AT] cineca.it