Hello Samuel,

Thanks for the hint. I now start my program with:

  hwloc_topology_init(&topology);
  hwloc_topology_set_flags(topology,HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM);
  hwloc_topology_load(topology);

and can access all information I need to rebind my MPI-tasks or to rearrange the MPI communicators.

By the way, are there any plans to fully support POWER6 and/or POWER7 running AIX 6.1 in the future? At the moment we get the topology right, but the cache sizes are missing.

Hendryk

On 10/02/11 10:40, Samuel Thibault wrote:
Hello,

Hendryk Bockelmann, on Thu 10 Feb 2011 09:08:11 +0100, wrote:
On our clusters the job scheduler binds the MPI tasks, but it is not
always clear to which resources. So for us it would be great to know
where a task runs such that we might adapt the MPI communicators to
increase performance.

Ok, so get_cpubind should be enough to know what binding the job
scheduler does.

Maybe just a note on the hwloc output on the cluster: while on my local
machine all MPI tasks are able to explore the whole topology, on the
cluster each task only sees itself, e.g. for task 7:

7:Machine#0(Backend=AIX OSName=AIX OSRelease=1 OSVersion=6 HostName=p191 Architecture=00C83AC24C00), cpuset: 0x0000c000
7:  NUMANode#0, cpuset: 0x0000c000
7:    L2Cache#0(0KB line=0), cpuset: 0x0000c000
7:      Core#0, cpuset: 0x0000c000
7:        PU, cpuset: 0x00004000
7:        PU#0, cpuset: 0x00008000
7:-->  root_cpuset of process 7 is 0x0000c000

Yes, because by default hwloc restricts itself to what you are allowed
to use anyway. To see more, use --whole-system.

Nevertheless, all MPI-tasks have different cpusets and since the nodes
are homogeneous one can guess the whole binding using the information of
lstopo and the HostName of each task. Perhaps you can tell me whether
such a restricted topology is due to hwloc or due to the fixed binding
by the job scheduler?

It's because by default hwloc follows the fixed binding :)

Samuel

