IIRC, you prefix the core number with a "P" to indicate physical numbering.
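
From memory, that would make a physical-index rankfile look something like this 
(a sketch of the old v1.6-era notation as I recall it - verify against the docs 
before relying on it):

rank 0=mach1 slot=p0
rank 1=mach1 slot=p4
rank 2=mach1 slot=p8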

I’ll see what I can do about getting the physical notation re-implemented - 
just can’t promise when that will happen


> On Nov 6, 2014, at 8:30 AM, Tom Wurgler <twu...@goodyear.com> wrote:
> 
> Well, unless we can get LSF to use physical numbering, we are dead in the 
> water without a translator of some sort.
> 
> We are trying to figure out how we can automate the translation in the meantime, 
> but we have a mix of clusters and the mapping differs between them.
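> 
> One way to automate it might be a small wrapper around hwloc-calc, run on the 
> target node of each cluster so it sees that machine's topology. This is only a 
> rough, untested sketch; it assumes the rankfile format shown below and that 
> your hwloc provides hwloc-calc with the --physical-input and --intersect 
> options (check your version):
> 
> #!/bin/sh
> # Hypothetical translator: rewrite a physical-index rankfile into the
> # logical indexes that Open MPI >= 1.7 expects.
> while read kw target slot; do
>     phys=${slot#slot=}
>     # Ask hwloc for the logical index of the core whose OS/physical
>     # index is $phys.
>     logical=$(hwloc-calc --physical-input --intersect core core:"$phys")
>     echo "$kw $target slot=$logical"
> done < RANK_FILE > RANK_FILE.logical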
> 
> We use Open MPI 1.6.4 daily (whereas all of this current testing has been with 
> 1.8.3).  In reading the 1.8.1 man page for mpirun, it states:
> 
> "Starting with Open MPI v1.7, all socket/core slot locations are being specified 
> as logical indexes (the Open MPI v1.6 series used physical indexes)."
> 
> But testing with rankfiles under 1.6.4, it behaves like 1.8.3, i.e., it uses 
> logical indexes.  Is there perhaps a switch in 1.6.4 to use physical indexes?  
> I am not seeing one in the mpirun --help...
> thanks
> 
> 
> 
> 
> 
> 
> 
> From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain 
> <rhc.open...@gmail.com>
> Sent: Thursday, November 6, 2014 11:08 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] mpirun does not honor rankfile
>  
> Ugh... we used to have a switch for that purpose, but it made the code hard to 
> manage. I could reimplement it at some point, but it won't be in the immediate 
> future.
> 
> I gather the issue is that the system tools report physical numbering, and so 
> you have to mentally translate to create the rankfile? Or is there an 
> automated script you run to do the translation?
> 
> In other words, is it possible to simplify the translation in the interim? Or 
> is this a show-stopper for you?
> 
> 
>> On Nov 6, 2014, at 7:21 AM, Tom Wurgler <twu...@goodyear.com> wrote:
>> 
>> So we used lstopo with an arg of "--logical" and the output showed the core 
>> numbering as 0,1,2,3...47 instead of 0,4,8,12, etc.
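>> 
>> For comparison, something like the following should show the two numberings 
>> directly (a sketch - the --only option is taken from the hwloc docs and may 
>> behave differently on our 1.4 install):
>> 
>> lstopo --logical --only core    # Core L#0, L#1, ... (what the rankfile uses)
>> lstopo --physical --only core   # Core P#0, P#4, ... (what the system reports)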
>> 
>> The multiplying by 4 you speak of falls apart when you get to the second 
>> socket, as its physical numbers are 1,5,9,13... and its logical numbers are 
>> 12,13,14,15....
>> 
>> So the question is can we get mpirun to honor the physical numbering?
>> 
>> thanks!
>> tom
>>  
>> From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain 
>> <rhc.open...@gmail.com>
>> Sent: Wednesday, November 5, 2014 6:30 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] mpirun does not honor rankfile
>>  
>> I suspect the issue may be with physical vs logical numbering. As I said, we 
>> use logical numbering in the rankfile, not physical. So I’m not entirely 
>> sure how to translate the cpumask in your final table into the numbering 
>> shown in your rankfile listings. Is the cpumask showing a physical core 
>> number?
>> 
>> I ask because it sure looks like the logical numbering we use is getting 
>> multiplied by 4 to become the cpumask you show. If the vendor numbers the 
>> physical cores round-robin across sockets (i.e., core 0 is the first core in 
>> the first socket, core 1 is the first core in the second socket, etc.), then 
>> that would explain the output.
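>> 
>> In other words, if your box numbers its physical cores round-robin across the 
>> four sockets, the mapping would look roughly like this (extrapolating from 
>> the numbers in your tables, so treat it as a guess):
>> 
>> socket 0: physical 0, 4,  8, ..., 44   <->  logical  0-11
>> socket 1: physical 1, 5,  9, ..., 45   <->  logical 12-23
>> socket 2: physical 2, 6, 10, ..., 46   <->  logical 24-35
>> socket 3: physical 3, 7, 11, ..., 47   <->  logical 36-47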
>> 
>> 
>>> On Nov 5, 2014, at 2:23 PM, Tom Wurgler <twu...@goodyear.com> wrote:
>>> 
>>> Well, further investigation found this:
>>> 
>>> If I edit the rank file and change it like this:
>>> 
>>> before:
>>> rank 0=mach1 slot=0
>>> rank 1=mach1 slot=4
>>> rank 2=mach1 slot=8
>>> rank 3=mach1 slot=12
>>> rank 4=mach1 slot=16
>>> rank 5=mach1 slot=20
>>> rank 6=mach1 slot=24
>>> rank 7=mach1 slot=28
>>> rank 8=mach1 slot=32
>>> rank 9=mach1 slot=36
>>> rank 10=mach1 slot=40
>>> rank 11=mach1 slot=44
>>> rank 12=mach1 slot=1
>>> rank 13=mach1 slot=5
>>> rank 14=mach1 slot=9
>>> rank 15=mach1 slot=13
>>> 
>>> after:
>>> rank 0=mach1 slot=0
>>> rank 1=mach1 slot=1
>>> rank 2=mach1 slot=2
>>> rank 3=mach1 slot=3
>>> rank 4=mach1 slot=4
>>> rank 5=mach1 slot=5
>>> rank 6=mach1 slot=6
>>> rank 7=mach1 slot=7
>>> rank 8=mach1 slot=8
>>> rank 9=mach1 slot=9
>>> rank 10=mach1 slot=10
>>> rank 11=mach1 slot=11
>>> rank 12=mach1 slot=12
>>> rank 13=mach1 slot=13
>>> rank 14=mach1 slot=14
>>> rank 15=mach1 slot=15
>>> 
>>> It does what I expect:
>>>   PID COMMAND         CPUMASK   TOTAL [     N0     N1     N2     N3     N4     N5     N6     N7 ]
>>> 12192 my_executable         0  472.0M [ 472.0M      0      0      0      0      0      0      0 ]
>>> 12193 my_executable         4  358.0M [ 358.0M      0      0      0      0      0      0      0 ]
>>> 12194 my_executable         8  450.4M [ 450.4M      0      0      0      0      0      0      0 ]
>>> 12195 my_executable        12  439.1M [ 439.1M      0      0      0      0      0      0      0 ]
>>> 12196 my_executable        16  392.1M [ 392.1M      0      0      0      0      0      0      0 ]
>>> 12197 my_executable        20  420.6M [ 420.6M      0      0      0      0      0      0      0 ]
>>> 12198 my_executable        24  414.9M [      0 414.9M      0      0      0      0      0      0 ]
>>> 12199 my_executable        28  388.9M [      0 388.9M      0      0      0      0      0      0 ]
>>> 12200 my_executable        32  452.7M [      0 452.7M      0      0      0      0      0      0 ]
>>> 12201 my_executable        36  438.9M [      0 438.9M      0      0      0      0      0      0 ]
>>> 12202 my_executable        40  369.3M [      0 369.3M      0      0      0      0      0      0 ]
>>> 12203 my_executable        44  440.5M [      0 440.5M      0      0      0      0      0      0 ]
>>> 12204 my_executable         1  447.7M [      0      0 447.7M      0      0      0      0      0 ]
>>> 12205 my_executable         5  367.1M [      0      0 367.1M      0      0      0      0      0 ]
>>> 12206 my_executable         9  426.5M [      0      0 426.5M      0      0      0      0      0 ]
>>> 12207 my_executable        13  414.2M [      0      0 414.2M      0      0      0      0      0 ]
>>> 
>>> We use hwloc 1.4 to generate a layout of the cores, etc.
>>> 
>>> So either LSF created the wrong rankfile (via my config errors, most 
>>> likely) or mpirun can't deal with that rankfile.
>>> 
>>> I can try the nightly tarball as well.  The hardware is a 48-core AMD box: 4 
>>> sockets, 2 NUMA nodes per socket, with 6 cores per NUMA node.
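>>> 
>>> If it would help, I can send the full topology; as I understand it, lstopo 
>>> can export it to XML just by giving an output file name (format deduced from 
>>> the extension):
>>> 
>>> lstopo mach1-topology.xml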
>>> 
>>> thanks
>>> tom 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: devel <devel-boun...@open-mpi.org> on behalf of Ralph Castain 
>>> <rhc.open...@gmail.com>
>>> Sent: Wednesday, November 5, 2014 4:27 PM
>>> To: Open MPI Developers
>>> Subject: Re: [OMPI devel] mpirun does not honor rankfile
>>>  
>>> Hmmm…well, it seems to be working fine in 1.8.4rc1 (I only have 12 cores on 
>>> my humble machine). However, I can’t test any interactions with LSF, though 
>>> that shouldn’t be an issue:
>>> 
>>> $ mpirun -host bend001 -rf ./rankfile --report-bindings --display-devel-map 
>>> hostname
>>>  Data for JOB [60677,1] offset 0
>>> 
>>>  Mapper requested: NULL  Last mapper: rank_file  Mapping policy: BYUSER  Ranking policy: SLOT
>>>  Binding policy: CPUSET  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
>>>   Num new daemons: 0
>>> New daemon starting vpid INVALID
>>>   Num nodes: 1
>>> 
>>>  Data for node: bend001    Launch id: -1
>>> State: 2
>>>   Daemon: [[60677,0],0]
>>> Daemon launched: True
>>>   Num slots: 12
>>> Slots in use: 12    Oversubscribed: FALSE
>>>   Num slots allocated: 12
>>> Max slots: 0
>>>   Username on node: NULL
>>>   Num procs: 12
>>> Next node_rank: 12
>>>   Data for proc: [[60677,1],0]
>>>   Pid: 0
>>> Local rank: 0    Node rank: 0
>>> App rank: 0
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 0,12
>>>   Data for proc: [[60677,1],1]
>>>   Pid: 0
>>> Local rank: 1    Node rank: 1
>>> App rank: 1
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 8,20
>>>   Data for proc: [[60677,1],2]
>>>   Pid: 0
>>> Local rank: 2    Node rank: 2
>>> App rank: 2
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 5,17
>>>   Data for proc: [[60677,1],3]
>>>   Pid: 0
>>> Local rank: 3    Node rank: 3
>>> App rank: 3
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 9,21
>>>   Data for proc: [[60677,1],4]
>>>   Pid: 0
>>> Local rank: 4    Node rank: 4
>>> App rank: 4
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 11,23
>>>   Data for proc: [[60677,1],5]
>>>   Pid: 0
>>> Local rank: 5    Node rank: 5
>>> App rank: 5
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 7,19
>>>   Data for proc: [[60677,1],6]
>>>   Pid: 0
>>> Local rank: 6    Node rank: 6
>>> App rank: 6
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 3,15
>>>   Data for proc: [[60677,1],7]
>>>   Pid: 0
>>> Local rank: 7    Node rank: 7
>>> App rank: 7
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 6,18
>>>   Data for proc: [[60677,1],8]
>>>   Pid: 0
>>> Local rank: 8    Node rank: 8
>>> App rank: 8
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 2,14
>>>   Data for proc: [[60677,1],9]
>>>   Pid: 0
>>> Local rank: 9    Node rank: 9
>>> App rank: 9
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 4,16
>>>   Data for proc: [[60677,1],10]
>>>   Pid: 0
>>> Local rank: 10    Node rank: 10
>>> App rank: 10
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 10,22
>>>   Data for proc: [[60677,1],11]
>>>   Pid: 0
>>> Local rank: 11    Node rank: 11
>>> App rank: 11
>>>   State: INITIALIZED
>>> Restarts: 0 App_context: 0
>>> Locale: UNKNOWN    Bind location: (null)
>>>  Binding: 1,13
>>> [bend001:24667] MCW rank 1 bound to socket 0[core 4[hwt 0-1]]: [../../../../BB/..][../../../../../..]
>>> [bend001:24667] MCW rank 2 bound to socket 1[core 8[hwt 0-1]]: [../../../../../..][../../BB/../../..]
>>> [bend001:24667] MCW rank 3 bound to socket 1[core 10[hwt 0-1]]: [../../../../../..][../../../../BB/..]
>>> [bend001:24667] MCW rank 4 bound to socket 1[core 11[hwt 0-1]]: [../../../../../..][../../../../../BB]
>>> [bend001:24667] MCW rank 5 bound to socket 1[core 9[hwt 0-1]]: [../../../../../..][../../../BB/../..]
>>> [bend001:24667] MCW rank 6 bound to socket 1[core 7[hwt 0-1]]: [../../../../../..][../BB/../../../..]
>>> [bend001:24667] MCW rank 7 bound to socket 0[core 3[hwt 0-1]]: [../../../BB/../..][../../../../../..]
>>> [bend001:24667] MCW rank 8 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../..][../../../../../..]
>>> [bend001:24667] MCW rank 9 bound to socket 0[core 2[hwt 0-1]]: [../../BB/../../..][../../../../../..]
>>> [bend001:24667] MCW rank 10 bound to socket 0[core 5[hwt 0-1]]: [../../../../../BB][../../../../../..]
>>> [bend001:24667] MCW rank 11 bound to socket 1[core 6[hwt 0-1]]: [../../../../../..][BB/../../../../..]
>>> [bend001:24667] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../..][../../../../../..]
>>> 
>>> Can you try with the latest nightly 1.8 tarball?
>>> 
>>> http://www.open-mpi.org/nightly/v1.8/
>>> 
>>> Note that it is also possible that hwloc isn’t correctly identifying the 
>>> cores here. Can you tell us something about the hardware? Do you have 
>>> hardware threads enabled?
>>> 
>>> I ask because the bindings we report use the cpu numbers as identified by 
>>> hwloc - which may not be the same as what you are expecting from some 
>>> hardware vendor's map. We are using logical processor assignments, not 
>>> physical. You can use the --report-bindings option to show the resulting 
>>> map, as above.
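>>> 
>>> For example, in the output above, the line
>>> 
>>> [bend001:24667] MCW rank 1 bound to socket 0[core 4[hwt 0-1]]: [../../../../BB/..][../../../../../..]
>>> 
>>> reads as: rank 1 is bound to logical core 4 (both hardware threads) on 
>>> socket 0; the bracket map has one slot per core on each socket, with the 
>>> bound core shown as "BB".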
>>> 
>>> 
>>> 
>>>> On Nov 5, 2014, at 7:21 AM, twu...@goodyear.com wrote:
>>>> 
>>>> I am using Open MPI v1.8.3 and LSF 9.1.3.
>>>> 
>>>> LSF creates a rankfile that looks like:
>>>> 
>>>> RANK_FILE:
>>>> ======================================================================
>>>> rank 0=mach1 slot=0
>>>> rank 1=mach1 slot=4
>>>> rank 2=mach1 slot=8
>>>> rank 3=mach1 slot=12
>>>> rank 4=mach1 slot=16
>>>> rank 5=mach1 slot=20
>>>> rank 6=mach1 slot=24
>>>> rank 7=mach1 slot=28
>>>> rank 8=mach1 slot=32
>>>> rank 9=mach1 slot=36
>>>> rank 10=mach1 slot=40
>>>> rank 11=mach1 slot=44
>>>> rank 12=mach1 slot=1
>>>> rank 13=mach1 slot=5
>>>> rank 14=mach1 slot=9
>>>> rank 15=mach1 slot=13
>>>> 
>>>> which really are the cores I want to use, in order. 
>>>> 
>>>> I log on to this machine and type (all on one line):
>>>> 
>>>> /apps/share/openmpi/1.8.3.I1217913/bin/mpirun \
>>>>  --mca orte_base_help_aggregate 0 \
>>>>  -v -display-devel-allocation \
>>>>  -display-devel-map \
>>>>  --rankfile RANK_FILE \
>>>>  --mca btl openib,tcp,sm,self \
>>>>  -x LD_LIBRARY_PATH \
>>>>  --np 16 \
>>>>  my_executable \
>>>>  -i model.i \
>>>>  -l model.o
>>>> 
>>>> And I get the following on the screen:
>>>> 
>>>> ======================   ALLOCATED NODES   ======================
>>>> mach1: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>> =================================================================
>>>> Data for JOB [52387,1] offset 0
>>>> 
>>>> Mapper requested: NULL  Last mapper: rank_file  Mapping policy: BYUSER  Ranking policy: SLOT
>>>> Binding policy: CPUSET  Cpu set: NULL  PPR: NULL  Cpus-per-rank: 1
>>>> Num new daemons: 0
>>>> New daemon starting vpid INVALID
>>>> Num nodes: 1
>>>> 
>>>> Data for node: mach1    Launch id: -1
>>>> State: 2
>>>> Daemon: [[52387,0],0]
>>>> Daemon launched: True
>>>> Num slots: 16
>>>> Slots in use: 16    Oversubscribed: FALSE
>>>> Num slots allocated: 16
>>>> Max slots: 0
>>>> Username on node: NULL
>>>> Num procs: 16
>>>> Next node_rank: 16
>>>> Data for proc: [[52387,1],0]
>>>> Pid: 0
>>>> Local rank: 0    Node rank: 0
>>>> App rank: 0
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 0
>>>> Data for proc: [[52387,1],1]
>>>> Pid: 0
>>>> Local rank: 1    Node rank: 1
>>>> App rank: 1
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 16
>>>> Data for proc: [[52387,1],2]
>>>> Pid: 0
>>>> Local rank: 2    Node rank: 2
>>>> App rank: 2
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 32
>>>> Data for proc: [[52387,1],3]
>>>> Pid: 0
>>>> Local rank: 3    Node rank: 3
>>>> App rank: 3
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 1
>>>> Data for proc: [[52387,1],4]
>>>> Pid: 0
>>>> Local rank: 4    Node rank: 4
>>>> App rank: 4
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 17
>>>> Data for proc: [[52387,1],5]
>>>> Pid: 0
>>>> Local rank: 5    Node rank: 5
>>>> App rank: 5
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 33
>>>> Data for proc: [[52387,1],6]
>>>> Pid: 0
>>>> Local rank: 6    Node rank: 6
>>>> App rank: 6
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 2
>>>> Data for proc: [[52387,1],7]
>>>> Pid: 0
>>>> Local rank: 7    Node rank: 7
>>>> App rank: 7
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 18
>>>> Data for proc: [[52387,1],8]
>>>> Pid: 0
>>>> Local rank: 8    Node rank: 8
>>>> App rank: 8
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 34
>>>> Data for proc: [[52387,1],9]
>>>> Pid: 0
>>>> Local rank: 9    Node rank: 9
>>>> App rank: 9
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 3
>>>> Data for proc: [[52387,1],10]
>>>> Pid: 0
>>>> Local rank: 10    Node rank: 10
>>>> App rank: 10
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 19
>>>> Data for proc: [[52387,1],11]
>>>> Pid: 0
>>>> Local rank: 11    Node rank: 11
>>>> App rank: 11
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 35
>>>> Data for proc: [[52387,1],12]
>>>> Pid: 0
>>>> Local rank: 12    Node rank: 12
>>>> App rank: 12
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 4
>>>> Data for proc: [[52387,1],13]
>>>> Pid: 0
>>>> Local rank: 13    Node rank: 13
>>>> App rank: 13
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 20
>>>> Data for proc: [[52387,1],14]
>>>> Pid: 0
>>>> Local rank: 14    Node rank: 14
>>>> App rank: 14
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 36
>>>> Data for proc: [[52387,1],15]
>>>> Pid: 0
>>>> Local rank: 15    Node rank: 15
>>>> App rank: 15
>>>> State: INITIALIZED
>>>> Restarts: 0 App_context: 0
>>>> Locale: UNKNOWN    Bind location: (null)
>>>>  Binding: 5
>>>> 
>>>> And a numa-map of the node shows:
>>>> 
>>>>   PID COMMAND         CPUMASK   TOTAL [     N0     N1     N2     N3     N4     N5     N6     N7 ]
>>>> 31044 my_executable         0  443.3M [ 443.3M      0      0      0      0      0      0      0 ]
>>>> 31045 my_executable        16  459.7M [ 459.7M      0      0      0      0      0      0      0 ]
>>>> 31046 my_executable        32  435.0M [      0 435.0M      0      0      0      0      0      0 ]
>>>> 31047 my_executable         1  468.8M [      0      0 468.8M      0      0      0      0      0 ]
>>>> 31048 my_executable        17  493.2M [      0      0 493.2M      0      0      0      0      0 ]
>>>> 31049 my_executable        33  498.0M [      0      0      0 498.0M      0      0      0      0 ]
>>>> 31050 my_executable         2  501.2M [      0      0      0      0 501.2M      0      0      0 ]
>>>> 31051 my_executable        18  502.4M [      0      0      0      0 502.4M      0      0      0 ]
>>>> 31052 my_executable        34  500.5M [      0      0      0      0      0 500.5M      0      0 ]
>>>> 31053 my_executable         3  515.6M [      0      0      0      0      0      0 515.6M      0 ]
>>>> 31054 my_executable        19  508.1M [      0      0      0      0      0      0 508.1M      0 ]
>>>> 31055 my_executable        35  503.9M [      0      0      0      0      0      0      0 503.9M ]
>>>> 31056 my_executable         4  502.1M [ 502.1M      0      0      0      0      0      0      0 ]
>>>> 31057 my_executable        20  515.2M [ 515.2M      0      0      0      0      0      0      0 ]
>>>> 31058 my_executable        36  508.1M [      0 508.1M      0      0      0      0      0      0 ]
>>>> 31059 my_executable         5  446.7M [      0      0 446.7M      0      0      0      0      0 ]
>>>> 
>>>> Why didn't mpirun honor the rankfile and put the processes on the correct 
>>>> cores in the proper order?  It looks to me like mpirun doesn't like the 
>>>> rankfile...?
>>>> 
>>>> Thanks for any help.
>>>> 
>>>> Tom