Hi, Brice.

I installed the latest hwloc-1.4.1.
Here is the output of lstopo -p.

[root@node03 bin]# ./lstopo -p
Machine (126GB)
  Socket P#0 (32GB)
    NUMANode P#0 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#12
    NUMANode P#1 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#16
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#20
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#24
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#28
  Socket P#3 (32GB)
    NUMANode P#6 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#1
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#5
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#9
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#13
    NUMANode P#7 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#17
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#21
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#25
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#29
  Socket P#2 (32GB)
    NUMANode P#4 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#2
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#6
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#10
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#14
    NUMANode P#5 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#18
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#22
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#26
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#30
  Socket P#1 (32GB)
    NUMANode P#2 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#3
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#7
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#11
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#15
    NUMANode P#3 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#19
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#23
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#27
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#31
  HostBridge P#0
    PCIBridge
      PCI 14e4:1639
        Net "eth0"
      PCI 14e4:1639
        Net "eth1"
    PCIBridge
      PCI 14e4:1639
        Net "eth2"
      PCI 14e4:1639
        Net "eth3"
    PCIBridge
      PCIBridge
        PCIBridge
          PCI 1000:0072
            Block "sdb"
            Block "sda"
    PCI 1002:4390
      Block "sr0"
    PCIBridge
      PCI 102b:0532
  HostBridge P#1
    PCIBridge
      PCI 15b3:6274
        Net "ib0"
        OpenFabrics "mthca0"
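
To make it easier to compare this topology with the binding reports quoted below, here is a minimal Python sketch that decodes a "cpus" mask and looks up which NUMA nodes it touches. It assumes each mask is a plain hexadecimal bitmap in which bit n stands for PU P#n; that is my reading of the reports, not something I have confirmed in the Open MPI documentation. The table is copied from the lstopo output above, and the function names are only illustrative.

```python
# PU P# numbers under each NUMANode P#, copied from the lstopo -p output above.
NUMA_TO_PUS = {
    0: [0, 4, 8, 12],
    1: [16, 20, 24, 28],
    6: [1, 5, 9, 13],
    7: [17, 21, 25, 29],
    4: [2, 6, 10, 14],
    5: [18, 22, 26, 30],
    2: [3, 7, 11, 15],
    3: [19, 23, 27, 31],
}

# Reverse lookup: PU P# -> NUMANode P#.
PU_TO_NUMA = {pu: node for node, pus in NUMA_TO_PUS.items() for pu in pus}


def mask_to_pus(mask_hex):
    """Return the bit positions set in a hex mask.

    ASSUMPTION: bit n of the reported mask corresponds to PU P#n.
    """
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def numa_nodes_for(mask_hex):
    """Return the sorted NUMANode P# values that a mask touches."""
    return sorted({PU_TO_NUMA[pu] for pu in mask_to_pus(mask_hex)})


if __name__ == "__main__":
    # Rank 0 masks taken from the two binding reports quoted below.
    for label, mask in [("1.5.4 rank 0", "1111"), ("1.5.5 rank 0", "000f")]:
        print(label, mask_to_pus(mask), numa_nodes_for(mask))
```

Under that assumption, the 1.5.4 mask 1111 decodes to PUs 0, 4, 8 and 12, which all sit in NUMANode P#0, while the 1.5.5 mask 000f decodes to PUs 0, 1, 2 and 3, which belong to four different NUMA nodes on four different sockets.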

Tetsuya Mishima

> Can you send the output of lstopo -p ? (you'll have to install hwloc)
> Brice
>
>
> tmish...@jcity.maeda.co.jp wrote:
>
>
> Hi,
>
> I updated openmpi from version 1.5.4 to 1.5.5.
> Since then, my application runs much slower than before,
> due to wrong core bindings. As far as I can tell,
> openmpi-1.5.4 gives correct core bindings for my
> Magny-Cours based machine.
>
> 1) My script is as follows:
> export OMP_NUM_THREADS=4
> mpirun -machinefile pbs_hosts \
> -np 8 \
> -x OMP_NUM_THREADS \
> -bind-to-core \
> -cpus-per-proc ${OMP_NUM_THREADS} \
> -report-bindings \
> ./Solver
>
> 2) The binding reports are as follows:
> openmpi-1.5.4:
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],3] to cpus 22220000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],4] to cpus 4444
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],5] to cpus 44440000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],6] to cpus 8888
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],7] to cpus 88880000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],0] to cpus 1111
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],1] to cpus 11110000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
> [[55518,1],2] to cpus 2222
> openmpi-1.5.5:
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],3] to cpus f000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],4] to cpus 000f0000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],5] to cpus 00f00000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],6] to cpus 0f000000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],7] to cpus f0000000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],0] to cpus 000f
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],1] to cpus 00f0
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
> [[40566,1],2] to cpus 0f00
>
> 3) node03 has 32 cores: 4 Magny-Cours CPUs of the 8-core type.
>
> Regards,
> Tetsuya Mishima
>
>
>
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

