I made a mistake in my previous reply. You can write the rankfile in either of two ways:

rank 0=host1 slot=0
rank 1=host1 slot=2
rank 2=host1 slot=4
rank 3=host1 slot=6
rank 4=host1 slot=1
rank 5=host1 slot=3
rank 6=host1 slot=5
rank 7=host1 slot=7

or

rank 0=host1 slot=0:0
rank 1=host1 slot=0:1
rank 2=host1 slot=0:2
rank 3=host1 slot=0:3
rank 4=host1 slot=1:0
rank 5=host1 slot=1:1
rank 6=host1 slot=1:2
rank 7=host1 slot=1:3

Teng

On Thu, Feb 2, 2012 at 12:17 PM, teng ma <[email protected]> wrote:

> Just remove p in your rankfile like
>
> rank 0=host1 slot=0:0
> rank 1=host1 slot=0:2
> rank 2=host1 slot=0:4
> rank 3=host1 slot=0:6
> rank 4=host1 slot=1:1
> rank 5=host1 slot=1:3
> rank 6=host1 slot=1:5
> rank 7=host1 slot=1:7
>
> Teng
>
> 2012/2/2 François Tessier <[email protected]>
>
>> Hello,
>>
>> I need to use a rankfile with openMPI 1.5.4 to do some tests on a basic
>> architecture. I'm using a node for which lstopo returns that :
>>
>> ----------------
>> Machine (24GB)
>>   NUMANode L#0 (P#0 12GB)
>>     Socket L#0 + L3 L#0 (8192KB)
>>       L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>>       L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#2)
>>       L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#4)
>>       L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#6)
>>     HostBridge L#0
>>       PCIBridge
>>         PCI 8086:10c9
>>           Net L#0 "eth0"
>>         PCI 8086:10c9
>>           Net L#1 "eth1"
>>       PCIBridge
>>         PCI 15b3:673c
>>           Net L#2 "ib0"
>>           Net L#3 "ib1"
>>           OpenFabrics L#4 "mlx4_0"
>>       PCIBridge
>>         PCI 102b:0522
>>       PCI 8086:3a22
>>         Block L#5 "sda"
>>         Block L#6 "sdb"
>>         Block L#7 "sdc"
>>         Block L#8 "sdd"
>>   NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (8192KB)
>>     L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#1)
>>     L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#3)
>>     L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#5)
>>     L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>> ----------------
>>
>> And I would like to use the physical numbering. To do that, I created a
>> rankfile like this :
>>
>> rank 0=host1 slot=p0:0
>> rank 1=host1 slot=p0:2
>> rank 2=host1 slot=p0:4
>> rank 3=host1 slot=p0:6
>> rank 4=host1 slot=p1:1
>> rank 5=host1 slot=p1:3
>> rank 6=host1 slot=p1:5
>> rank 7=host1 slot=p1:7
>>
>> But when I run my job with "mpiexec -np 8 --rankfile rankfile ./foo",
>> I encounter this error :
>>
>> Specified slot list: p0:4
>> Error: Not found
>>
>> This could mean that a non-existent processor was specified, or
>> that the specification had improper syntax.
>>
>> Do you know what I did wrong?
>>
>> Best regards,
>>
>> François
>>
>> --
>> ___________________
>> François TESSIER
>> PhD Student at University of Bordeaux
>> Tel : [email protected]
>>
>> _______________________________________________
>> users mailing list
>> [email protected]
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> | Teng Ma          Univ. of Tennessee |
> | [email protected]        Knoxville, TN |
> | http://web.eecs.utk.edu/~tma/       |

--
| Teng Ma          Univ. of Tennessee |
| [email protected]        Knoxville, TN |
| http://web.eecs.utk.edu/~tma/       |
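
P.S. If you need to regenerate the rankfile for other nodes, the second form above (slot=<socket>:<core>) can be produced by a short script. This is only a sketch: the function name make_rankfile and its parameters are made up for illustration, and it assumes every socket has the same number of cores and that ranks fill socket 0 before moving to socket 1.

```python
def make_rankfile(host, sockets, cores_per_socket):
    """Return rankfile text using the slot=<socket>:<core> form.

    Hypothetical helper: assumes a uniform node where each socket
    has the same number of cores, numbered from 0 within the socket.
    """
    lines = []
    rank = 0
    for socket in range(sockets):
        for core in range(cores_per_socket):
            # One line per rank, e.g. "rank 4=host1 slot=1:0"
            lines.append(f"rank {rank}={host} slot={socket}:{core}")
            rank += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two sockets with four cores each, as in the lstopo output quoted above.
    print(make_rankfile("host1", 2, 4), end="")
```

Redirect the output to a file and pass that file to mpiexec with --rankfile as before.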
