Re: [hwloc-users] Hardware Topology

2012-02-21 Thread Samuel Thibault
vaibhav dutt, on Tue 21 Feb 2012 19:59:54 +0100, wrote:
> The following is the hardware topology of the compute node I am using, as
> obtained with lstopo.
> 
> Machine (16GB)
>   Socket L#0
>     L2 L#0 (6144KB)
>       L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>       L1 L#1 (32KB) + Core L#1 + PU L#1 (P#4)
>     L2 L#1 (6144KB)
>       L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>       L1 L#3 (32KB) + Core L#3 + PU L#3 (P#6)
>   Socket L#1
>     L2 L#2 (6144KB)
>       L1 L#4 (32KB) + Core L#4 + PU L#4 (P#1)
>       L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>     L2 L#3 (6144KB)
>       L1 L#6 (32KB) + Core L#6 + PU L#6 (P#3)
>       L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
> 
> It has 4 cores on each socket. But should pairs of cores such as (0 and 4),
> (1 and 5), etc. be considered to be on the same die?

P#0 and P#4 share the same L2 cache, and are on the same socket as P#2 and P#6.
Use lstopo -.txt, it'll probably be clearer.

Samuel
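
As an illustration of how to read the lstopo tree above, here is a minimal sketch in plain C. It does not use hwloc at all: the two tables are transcribed by hand from the quoted output (indexed by the physical P# numbers), and the names l2_of_pu, socket_of_pu, pus_share_l2 and pus_share_socket are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hand-written tables, assumed from the lstopo output quoted above.
 * Index = physical PU number (P#n); value = logical L2 / socket number. */
enum { NUM_PUS = 8 };
static const int l2_of_pu[NUM_PUS]     = {0, 2, 1, 3, 0, 2, 1, 3};
static const int socket_of_pu[NUM_PUS] = {0, 1, 0, 1, 0, 1, 0, 1};

/* True if both PUs sit below the same L2 cache in the tree. */
bool pus_share_l2(int a, int b)
{
    assert(a >= 0 && a < NUM_PUS && b >= 0 && b < NUM_PUS);
    return l2_of_pu[a] == l2_of_pu[b];
}

/* True if both PUs sit below the same socket in the tree. */
bool pus_share_socket(int a, int b)
{
    assert(a >= 0 && a < NUM_PUS && b >= 0 && b < NUM_PUS);
    return socket_of_pu[a] == socket_of_pu[b];
}
```

With these tables, pus_share_l2(0, 4) is true, while pus_share_l2(0, 2) is false even though pus_share_socket(0, 2) is true, which matches the answer above.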


Re: [hwloc-users] Hardware Topology

2012-02-21 Thread vaibhav dutt
Hi,


> The following is the hardware topology of the compute node I am using, as
> obtained with lstopo.
>
> Machine (16GB)
>   Socket L#0
>     L2 L#0 (6144KB)
>       L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>       L1 L#1 (32KB) + Core L#1 + PU L#1 (P#4)
>     L2 L#1 (6144KB)
>       L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>       L1 L#3 (32KB) + Core L#3 + PU L#3 (P#6)
>   Socket L#1
>     L2 L#2 (6144KB)
>       L1 L#4 (32KB) + Core L#4 + PU L#4 (P#1)
>       L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>     L2 L#3 (6144KB)
>       L1 L#6 (32KB) + Core L#6 + PU L#6 (P#3)
>       L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>
> It has 4 cores on each socket. But should pairs of cores such as (0 and 4),
> (1 and 5), etc. be considered to be on the same die?
>
> Thanks
>


Re: [hwloc-users] receive 0x0 from hwloc_cuda_get_device_cpuset

2012-02-21 Thread Brice Goglin
On 21/02/2012 15:42, Albert Solernou wrote:
> Hi,
> I have several questions in order to fix this issue from the machine
> side.
>
> 1) I realised that on this machine neither libcpuset nor cpuset-utils
> are installed. Could this be related to the problem?

No. Unfortunately, Linux "cpusets" are very different from hwloc's "cpuset"
and "bitmap". The former restrict the resources available in the machine,
so that processes cannot use all the CPUs, for instance. hwloc detects
this feature, but it doesn't need libcpuset to do so.
Things just work :)

> 2) Could you specify any BIOS parameter we could tune?

You could look for PCI affinity or PCI NUMA settings, maybe. But I don't
think you'll find anything, because your machine isn't NUMA anyway. I/O
affinity doesn't matter on this machine, so there's no reason to
enable or disable it in this BIOS.

> 3) Could this issue be related to the Linux kernel?

I think the kernel has been properly detecting this kind of affinity
from the BIOS for a very long time: at least since 2.6.18, but likely
much earlier.

You should just forget about this problem and use hwloc 1.4.1rc1
(released today, already on the web, to be announced soon, once the Windows
zips are ready). It contains the workaround for your problem.

Brice



Re: [hwloc-users] receive 0x0 from hwloc_cuda_get_device_cpuset

2012-02-21 Thread Albert Solernou

Hi,
I have several questions in order to fix this issue from the machine side.

1) I realised that on this machine neither libcpuset nor cpuset-utils 
are installed. Could this be related to the problem?


2) Could you specify any BIOS parameter we could tune?

3) Could this issue be related to the Linux kernel?

Best,
Albert

On 16/02/12 14:29, Brice Goglin wrote:

On 16/02/2012 15:26, Albert Solernou wrote:


Is there anything easy that the administrators of the cluster could
do? How could I persuade them that this is an easy task to do?

They could upgrade the BIOS. But your machine is old, and people didn't
care much about I/O affinity in Intel servers at that time, so I don't
expect a newer BIOS to improve things much.

My patch should likely be applied to hwloc anyway; it is an easy
workaround.

Brice




Re: [hwloc-users] bind process to built cpuset

2012-02-21 Thread Brice Goglin
void hwloc_bitmap_or (hwloc_bitmap_t res, hwloc_const_bitmap_t bitmap1,
hwloc_const_bitmap_t bitmap2);

The first argument is the destination, so it's not const. Only the source
arguments (second and third) are const.

Brice
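
To make the "writable destination, read-only sources" shape concrete, here is a small stand-alone sketch in plain C. It is only a toy model, not hwloc's implementation: toy_bitmap, toy_bitmap_set, toy_bitmap_or and toy_cpuset_for_two_pus are hypothetical names, and the model assumes at most 64 PUs so that a single 64-bit word is enough.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a bitmap: one bit per PU, at most 64 PUs (assumption). */
typedef struct { uint64_t bits; } toy_bitmap;

/* Set the bit for one PU index. */
void toy_bitmap_set(toy_bitmap *bm, unsigned index)
{
    assert(index < 64);
    bm->bits |= (uint64_t)1 << index;
}

/* Same argument shape as the prototype above: the result is written,
 * the two sources are only read. The result may alias a source. */
void toy_bitmap_or(toy_bitmap *res, const toy_bitmap *a, const toy_bitmap *b)
{
    res->bits = a->bits | b->bits;
}

/* Accumulate two single-PU sets into one set, using the destination
 * as its own first source, as in the thread. */
uint64_t toy_cpuset_for_two_pus(unsigned pu_a, unsigned pu_b)
{
    toy_bitmap a = {0}, b = {0}, set = {0};

    toy_bitmap_set(&a, pu_a);
    toy_bitmap_set(&b, pu_b);
    toy_bitmap_or(&set, &set, &a);
    toy_bitmap_or(&set, &set, &b);
    return set.bits;
}
```

Passing the same object as the destination and as the first source is legal here because a pointer to a non-const object converts implicitly to a pointer to const; the const on the sources only promises that the function will not modify them through those parameters.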




On 21/02/2012 12:18, Albert Solernou wrote:
> Hi, I just tried it and it works nicely!
>
> I didn't try it myself because the documentation of the library
> states that two of the arguments in hwloc_bitmap_or are
> hwloc_const_bitmap_t. However, in your approach only one of them is
> constant. Anyway, it is working now.
>
> Best,
> Albert
>
> On Tue 21 Feb 2012 09:46:46 GMT, Albert Solernou wrote:
>> Thank you very much, Brice!
>>
>> Best,
>> Albert
>>
>> On Mon 20 Feb 2012 18:09:55 GMT, Brice Goglin wrote:
>>> On 20/02/2012 19:06, Brice Goglin wrote:
>>>> On 20/02/2012 17:41, Albert Solernou wrote:
>>>>> Hi,
>>>>> I'd like to bind a process to a cpuset, so that when it spawns
>>>>> several threads, they are confined to that cpuset.
>>>>>
>>>>> In order to do so, I want to define my own cpuset. Let's say I
>>>>> want it to include HWLOC_OBJ_CORE 2 and 5. How can I create this
>>>>> cpuset? The bitmap API sounds like the solution to me, but I couldn't
>>>>> relate its indexes to HWLOC_OBJ objects of any type...
>>>> If you want to bind to cores #2 and #5, do:
>>>>
>>>> hwloc_bitmap_t cpuset;
>>>> hwloc_obj_t core1, core2;
>>>>
>>>> core1 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 2);
>>>> if (!core1)
>>>>  error...
>>>> core2 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 5);
>>>> if (!core2)
>>>>  error...
>>>> cpuset = hwloc_bitmap_alloc();
>>>> if (!cpuset)
>>>>  error...
>>>> hwloc_bitmap_or(cpuset, cpuset, core1->cpuset);
>>>> hwloc_bitmap_or(cpuset, cpuset, core2->cpuset);
>>>
>>> By the way, alloc()+or() can be optimized as dup():
>>>
>>> cpuset = hwloc_bitmap_dup(core1->cpuset);
>>> if (!cpuset)
>>>  error...
>>> hwloc_bitmap_or(cpuset, cpuset, core2->cpuset);
>>>
>>> Brice
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users



Re: [hwloc-users] bind process to built cpuset

2012-02-21 Thread Albert Solernou
Hi, 
I just tried it and it works nicely!


I didn't try it myself because the documentation of the library
states that two of the arguments in hwloc_bitmap_or are
hwloc_const_bitmap_t. However, in your approach only one of them is
constant. Anyway, it is working now.


Best,
Albert

On Tue 21 Feb 2012 09:46:46 GMT, Albert Solernou wrote:

Thank you very much, Brice!

Best,
Albert

On Mon 20 Feb 2012 18:09:55 GMT, Brice Goglin wrote:

On 20/02/2012 19:06, Brice Goglin wrote:

On 20/02/2012 17:41, Albert Solernou wrote:

Hi,
I'd like to bind a process to a cpuset, so that when it spawns
several threads, they are confined to that cpuset.

In order to do so, I want to define my own cpuset. Let's say I want it
to include HWLOC_OBJ_CORE 2 and 5. How can I create this cpuset? The
bitmap API sounds like the solution to me, but I couldn't relate its
indexes to HWLOC_OBJ objects of any type...

If you want to bind to cores #2 and #5, do:

hwloc_bitmap_t cpuset;
hwloc_obj_t core1, core2;

core1 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 2);
if (!core1)
 error...
core2 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 5);
if (!core2)
 error...
cpuset = hwloc_bitmap_alloc();
if (!cpuset)
 error...
hwloc_bitmap_or(cpuset, cpuset, core1->cpuset);
hwloc_bitmap_or(cpuset, cpuset, core2->cpuset);


By the way, alloc()+or() can be optimized as dup():

cpuset = hwloc_bitmap_dup(core1->cpuset);
if (!cpuset)
 error...
hwloc_bitmap_or(cpuset, cpuset, core2->cpuset);

Brice



Re: [hwloc-users] bind process to built cpuset

2012-02-21 Thread Albert Solernou

Thank you very much, Brice!

Best,
Albert

On Mon 20 Feb 2012 18:09:55 GMT, Brice Goglin wrote:

On 20/02/2012 19:06, Brice Goglin wrote:

On 20/02/2012 17:41, Albert Solernou wrote:

Hi,
I'd like to bind a process to a cpuset, so that when it spawns
several threads, they are confined to that cpuset.

In order to do so, I want to define my own cpuset. Let's say I want it
to include HWLOC_OBJ_CORE 2 and 5. How can I create this cpuset? The
bitmap API sounds like the solution to me, but I couldn't relate its
indexes to HWLOC_OBJ objects of any type...

If you want to bind to cores #2 and #5, do:

hwloc_bitmap_t cpuset;
hwloc_obj_t core1, core2;

core1 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 2);
if (!core1)
 error...
core2 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 5);
if (!core2)
 error...
cpuset = hwloc_bitmap_alloc();
if (!cpuset)
 error...
hwloc_bitmap_or(cpuset, cpuset, core1->cpuset);
hwloc_bitmap_or(cpuset, cpuset, core2->cpuset);


By the way, alloc()+or() can be optimized as dup():

cpuset = hwloc_bitmap_dup(core1->cpuset);
if (!cpuset)
 error...
hwloc_bitmap_or(cpuset, cpuset, core2->cpuset);

Brice
