Hooray!

On Dec 19, 2013, at 10:14 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi Ralph,
> 
> Thank you for your fix. It works for me.
> 
> Tetsuya Mishima
> 
> 
>> Actually, it looks like it would happen with hetero-nodes set - it only
>> required that at least two nodes have the same architecture. So you might
>> want to give the trunk a shot, as it may well now be fixed.
>> 
>> 
>> On Dec 19, 2013, at 8:35 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> 
>>> Hmmm...not having any luck tracking this down yet. If anything, based
> on what I saw in the code, I would have expected it to fail when
> hetero-nodes was false, not the other way around.
>>> 
>>> I'll keep poking around - just wanted to provide an update.
>>> 
>>> On Dec 19, 2013, at 12:54 AM, tmish...@jcity.maeda.co.jp wrote:
>>> 
>>>> 
>>>> 
>>>> Hi Ralph, sorry for the overlapping post.
>>>> 
>>>> Your advice about -hetero-nodes in the other thread gave me a hint.
>>>> 
>>>> I already put "orte_hetero_nodes = 1" in my mca-params.conf, because
>>>> you told me a month ago that my environment would need this option.
>>>> 
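For reference: "orte_hetero_nodes" is an ordinary MCA parameter, so the setting
described above can live in an mca-params.conf file or be given on the mpirun
command line. A minimal sketch, assuming the per-user file is
$HOME/.openmpi/mca-params.conf (the file path is an assumption; the parameter
and the -hetero-nodes flag come from this thread):

# $HOME/.openmpi/mca-params.conf  (assumed per-user location)
orte_hetero_nodes = 1

# equivalent settings for a single run
mpirun --mca orte_hetero_nodes 1 -np 4 -cpus-per-proc 4 -report-bindings myprog
mpirun -hetero-nodes -np 4 -cpus-per-proc 4 -report-bindings myprog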
>>>> Removing this line from mca-params.conf makes it work.
>>>> In other words, you can replicate the problem by adding -hetero-nodes, as
>>>> shown below.
>>>> 
>>>> qsub: job 8364.manage.cluster completed
>>>> [mishima@manage mpi]$ qsub -I -l nodes=2:ppn=8
>>>> qsub: waiting for job 8365.manage.cluster to start
>>>> qsub: job 8365.manage.cluster ready
>>>> 
>>>> [mishima@node11 ~]$ ompi_info --all | grep orte_hetero_nodes
>>>>              MCA orte: parameter "orte_hetero_nodes" (current value:
>>>> "false", data source: default, level: 9 dev/all,
>>>> type: bool)
>>>> [mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
>>>> myprog
>>>> [node11.cluster:27895] MCW rank 0 bound to socket 0[core 0[hwt 0]],
> socket
>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>> [node11.cluster:27895] MCW rank 1 bound to socket 1[core 4[hwt 0]],
> socket
>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>> [node12.cluster:24891] MCW rank 3 bound to socket 1[core 4[hwt 0]],
> socket
>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>> [node12.cluster:24891] MCW rank 2 bound to socket 0[core 0[hwt 0]],
> socket
>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>> Hello world from process 0 of 4
>>>> Hello world from process 1 of 4
>>>> Hello world from process 2 of 4
>>>> Hello world from process 3 of 4
>>>> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
>>>> -hetero-nodes myprog
>>>> 
> --------------------------------------------------------------------------
>>>> A request was made to bind to that would result in binding more
>>>> processes than cpus on a resource:
>>>> 
>>>> Bind to:         CORE
>>>> Node:            node12
>>>> #processes:  2
>>>> #cpus:          1
>>>> 
>>>> You can override this protection by adding the "overload-allowed"
>>>> option to your binding directive.
>>>> 
> --------------------------------------------------------------------------
>>>> 
>>>> 
>>>> As far as I checked, data->num_bound seems to take a bad value in
>>>> bind_downwards when I add "-hetero-nodes". I hope you can clear up the
>>>> problem.
>>>> 
>>>> Regards,
>>>> Tetsuya Mishima
>>>> 
>>>> 
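The "myprog" binary exercised throughout this thread is never shown; judging
only by its output ("Hello world from process X of N"), it behaves like the
minimal MPI program below, given here purely as a reproduction aid (an
assumption, not the poster's actual source):

/* hello.c - build with: mpicc -o myprog hello.c */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}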
>>>>> Yes, it's very strange. But I don't think there's any chance that
>>>>> I have < 8 actual cores on the node. I guess that you can replicate
>>>>> it with SLURM, so please try it again.
>>>>> 
>>>>> I changed to use node10 and node11, then I got the warning against
>>>>> node11.
>>>>> 
>>>>> Furthermore, just as a piece of information for you, I tried adding
>>>>> "-bind-to core:overload-allowed", and then it worked as shown below.
>>>>> But I think node11 should never be overloaded, because it has 8 cores.
>>>>> 
>>>>> qsub: job 8342.manage.cluster completed
>>>>> [mishima@manage ~]$ qsub -I -l nodes=node10:ppn=8+node11:ppn=8
>>>>> qsub: waiting for job 8343.manage.cluster to start
>>>>> qsub: job 8343.manage.cluster ready
>>>>> 
>>>>> [mishima@node10 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>> [mishima@node10 demos]$ cat $PBS_NODEFILE
>>>>> node10
>>>>> node10
>>>>> node10
>>>>> node10
>>>>> node10
>>>>> node10
>>>>> node10
>>>>> node10
>>>>> node11
>>>>> node11
>>>>> node11
>>>>> node11
>>>>> node11
>>>>> node11
>>>>> node11
>>>>> node11
>>>>> [mishima@node10 demos]$ mpirun -np 4 -cpus-per-proc 4
> -report-bindings
>>>>> myprog
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>> A request was made to bind to that would result in binding more
>>>>> processes than cpus on a resource:
>>>>> 
>>>>> Bind to:         CORE
>>>>> Node:            node11
>>>>> #processes:  2
>>>>> #cpus:          1
>>>>> 
>>>>> You can override this protection by adding the "overload-allowed"
>>>>> option to your binding directive.
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>> [mishima@node10 demos]$ mpirun -np 4 -cpus-per-proc 4
> -report-bindings
>>>>> -bind-to core:overload-allowed myprog
>>>>> [node10.cluster:27020] MCW rank 0 bound to socket 0[core 0[hwt 0]],
>>>> socket
>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>>> [node10.cluster:27020] MCW rank 1 bound to socket 1[core 4[hwt 0]],
>>>> socket
>>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>>> [node11.cluster:26597] MCW rank 3 bound to socket 1[core 4[hwt 0]],
>>>> socket
>>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>>> [node11.cluster:26597] MCW rank 2 bound to socket 0[core 0[hwt 0]],
>>>> socket
>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>>> Hello world from process 1 of 4
>>>>> Hello world from process 0 of 4
>>>>> Hello world from process 3 of 4
>>>>> Hello world from process 2 of 4
>>>>> 
>>>>> Regards,
>>>>> Tetsuya Mishima
>>>>> 
>>>>> 
>>>>>> Very strange - I can't seem to replicate it. Is there any chance
> that
>>>> you
>>>>> have < 8 actual cores on node12?
>>>>>> 
>>>>>> 
>>>>>> On Dec 18, 2013, at 4:53 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Hi Ralph, sorry for confusing you.
>>>>>>> 
>>>>>>> At that time, I cut and pasted the output of "cat $PBS_NODEFILE".
>>>>>>> I guess I missed the last line by mistake.
>>>>>>> 
>>>>>>> I retried the test, and below is exactly what I got when I did the
>>>>>>> test.
>>>>>>> 
>>>>>>> [mishima@manage ~]$ qsub -I -l nodes=node11:ppn=8+node12:ppn=8
>>>>>>> qsub: waiting for job 8338.manage.cluster to start
>>>>>>> qsub: job 8338.manage.cluster ready
>>>>>>> 
>>>>>>> [mishima@node11 ~]$ cat $PBS_NODEFILE
>>>>>>> node11
>>>>>>> node11
>>>>>>> node11
>>>>>>> node11
>>>>>>> node11
>>>>>>> node11
>>>>>>> node11
>>>>>>> node11
>>>>>>> node12
>>>>>>> node12
>>>>>>> node12
>>>>>>> node12
>>>>>>> node12
>>>>>>> node12
>>>>>>> node12
>>>>>>> node12
>>>>>>> [mishima@node11 ~]$ mpirun -np 4 -cpus-per-proc 4 -report-bindings
>>>>> myprog
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>> A request was made to bind to that would result in binding more
>>>>>>> processes than cpus on a resource:
>>>>>>> 
>>>>>>> Bind to:         CORE
>>>>>>> Node:            node12
>>>>>>> #processes:  2
>>>>>>> #cpus:          1
>>>>>>> 
>>>>>>> You can override this protection by adding the "overload-allowed"
>>>>>>> option to your binding directive.
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>> 
>>>>>>> Regards,
>>>>>>> 
>>>>>>> Tetsuya Mishima
>>>>>>> 
>>>>>>>> I removed the debug in #2 - thanks for reporting it
>>>>>>>> 
>>>>>>>> For #1, it actually looks to me like this is correct. If you look at
>>>>>>>> your allocation, there are only 7 slots being allocated on node12, yet
>>>>>>>> you have asked for 8 cpus to be assigned there (2 procs with 4
>>>>>>>> cpus/proc). So the warning is in fact correct.
>>>>>>>> 
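Spelling out the arithmetic behind this explanation (the per-node numbers are
read off the $PBS_NODEFILE listing quoted below; which ranks land on node12 is
an assumption based on the reported bindings elsewhere in the thread):

  node12 entries in $PBS_NODEFILE : 7 slots
  procs mapped to node12          : 2 (ranks 2 and 3 of "-np 4")
  cpus requested on node12        : 2 procs x 4 cpus/proc = 8
  8 cpus > 7 slots                : the overload protection triggers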
>>>>>>>> 
>>>>>>>> On Dec 18, 2013, at 4:04 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Ralph, I found that openmpi-1.7.4rc1 was already uploaded. So
>>>> I'd
>>>>>>> like
>>>>>>>>> to report
>>>>>>>>> 3 issues mainly regarding -cpus-per-proc.
>>>>>>>>> 
>>>>>>>>> 1) When I use 2 nodes (node11, node12), which have 8 cores each (= 2
>>>>>>>>> sockets x 4 cores/socket), it starts to produce the error again as
>>>>>>>>> shown below. At least, openmpi-1.7.4a1r29646 did work well.
>>>>>>>>> 
>>>>>>>>> [mishima@manage ~]$ qsub -I -l nodes=2:ppn=8
>>>>>>>>> qsub: waiting for job 8336.manage.cluster to start
>>>>>>>>> qsub: job 8336.manage.cluster ready
>>>>>>>>> 
>>>>>>>>> [mishima@node11 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>> [mishima@node11 demos]$ cat $PBS_NODEFILE
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node11
>>>>>>>>> node12
>>>>>>>>> node12
>>>>>>>>> node12
>>>>>>>>> node12
>>>>>>>>> node12
>>>>>>>>> node12
>>>>>>>>> node12
>>>>>>>>> [mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4
>>>>> -report-bindings
>>>>>>>>> myprog
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>> A request was made to bind to that would result in binding more
>>>>>>>>> processes than cpus on a resource:
>>>>>>>>> 
>>>>>>>>> Bind to:         CORE
>>>>>>>>> Node:            node12
>>>>>>>>> #processes:  2
>>>>>>>>> #cpus:          1
>>>>>>>>> 
>>>>>>>>> You can override this protection by adding the "overload-allowed"
>>>>>>>>> option to your binding directive.
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>> 
>>>>>>>>> Of course it works well using only one node.
>>>>>>>>> 
>>>>>>>>> [mishima@node11 demos]$ mpirun -np 2 -cpus-per-proc 4
>>>>> -report-bindings
>>>>>>>>> myprog
>>>>>>>>> [node11.cluster:26238] MCW rank 0 bound to socket 0[core 0[hwt
> 0]],
>>>>>>> socket
>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>>>>>>> [node11.cluster:26238] MCW rank 1 bound to socket 1[core 4[hwt
> 0]],
>>>>>>> socket
>>>>>>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>>>>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>>>>>>> Hello world from process 1 of 2
>>>>>>>>> Hello world from process 0 of 2
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2) When I add "-bind-to numa", it works, but the message "bind:upward
>>>>>>>>> target NUMANode type NUMANode" appears.
>>>>>>>>> As far as I remember, I didn't see this kind of message before.
>>>>>>>>> 
>>>>>>>>> mishima@node11 demos]$ mpirun -np 4 -cpus-per-proc 4
>>>> -report-bindings
>>>>>>>>> -bind-to numa myprog
>>>>>>>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode
>>>> type
>>>>>>>>> NUMANode
>>>>>>>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode
>>>> type
>>>>>>>>> NUMANode
>>>>>>>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode
>>>> type
>>>>>>>>> NUMANode
>>>>>>>>> [node11.cluster:26260] [[8844,0],0] bind:upward target NUMANode
>>>> type
>>>>>>>>> NUMANode
>>>>>>>>> [node11.cluster:26260] MCW rank 0 bound to socket 0[core 0[hwt
> 0]],
>>>>>>> socket
>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>>>>>>> [node11.cluster:26260] MCW rank 1 bound to socket 1[core 4[hwt
> 0]],
>>>>>>> socket
>>>>>>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>>>>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>>>>>>> [node12.cluster:23607] MCW rank 3 bound to socket 1[core 4[hwt
> 0]],
>>>>>>> socket
>>>>>>>>> 1[core 5[hwt 0]], socket 1[core 6[hwt 0]], so
>>>>>>>>> cket 1[core 7[hwt 0]]: [./././.][B/B/B/B]
>>>>>>>>> [node12.cluster:23607] MCW rank 2 bound to socket 0[core 0[hwt
> 0]],
>>>>>>> socket
>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>> cket 0[core 3[hwt 0]]: [B/B/B/B][./././.]
>>>>>>>>> Hello world from process 1 of 4
>>>>>>>>> Hello world from process 0 of 4
>>>>>>>>> Hello world from process 3 of 4
>>>>>>>>> Hello world from process 2 of 4
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 3) I use the PGI compiler. It cannot accept the compiler switch
>>>>>>>>> "-Wno-variadic-macros", which is included in the configure script:
>>>>>>>>> 
>>>>>>>>>       btl_usnic_CFLAGS="-Wno-variadic-macros"
>>>>>>>>> 
>>>>>>>>> I removed this switch, then I could continue to build 1.7.4rc1.
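One way to script the manual edit described above (a sketch of what the poster
did by hand, not a command actually run in this thread) is to strip the
GCC-specific flag from the generated configure script before building with PGI:

  # remove the flag the PGI compiler rejects, then configure/build as usual
  sed -i 's/-Wno-variadic-macros//g' configure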
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> Hmmm...okay, I understand the scenario. Must be something in the
>>>>> algo
>>>>>>>>> when it only has one node, so it shouldn't be too hard to track
>>>> down.
>>>>>>>>>> 
>>>>>>>>>> I'm off on travel for a few days, but will return to this when I
>>>> get
>>>>>>>>> back.
>>>>>>>>>> 
>>>>>>>>>> Sorry for the delay - will try to look at this while I'm gone, but
>>>> can't
>>>>>>>>> promise anything :-(
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Dec 10, 2013, at 6:58 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Hi Ralph, sorry for the confusion.
>>>>>>>>>>> 
>>>>>>>>>>> We usually log on to "manage", which is our control node.
>>>>>>>>>>> From manage, we submit jobs or enter a remote node such as
>>>>>>>>>>> node03 via Torque's interactive mode (qsub -I).
>>>>>>>>>>> 
>>>>>>>>>>> At that time, instead of Torque, I just used rsh to node03 from
>>>>>>>>>>> manage and ran myprog on the node. I hope that makes clear what I
>>>>>>>>>>> did.
>>>>>>>>>>> 
>>>>>>>>>>> Now, I retried with "-host node03", which still causes the problem
>>>>>>>>>>> (I confirmed that a local run on manage caused the same problem too):
>>>>>>>>>>> 
>>>>>>>>>>> [mishima@manage ~]$ rsh node03
>>>>>>>>>>> Last login: Wed Dec 11 11:38:57 from manage
>>>>>>>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>>>> [mishima@node03 demos]$
>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -host node03
>>>> -report-bindings
>>>>>>>>>>> -cpus-per-proc 4 -map-by socket myprog
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>>>> A request was made to bind to that would result in binding more
>>>>>>>>>>> processes than cpus on a resource:
>>>>>>>>>>> 
>>>>>>>>>>> Bind to:         CORE
>>>>>>>>>>> Node:            node03
>>>>>>>>>>> #processes:  2
>>>>>>>>>>> #cpus:          1
>>>>>>>>>>> 
>>>>>>>>>>> You can override this protection by adding the
> "overload-allowed"
>>>>>>>>>>> option to your binding directive.
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>>>> 
>>>>>>>>>>> It's strange, but I have to report that "-map-by socket:span"
>>>>>>>>>>> worked well.
>>>>>>>>>>> 
>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -host node03
>>>> -report-bindings
>>>>>>>>>>> -cpus-per-proc 4 -map-by socket:span myprog
>>>>>>>>>>> [node03.cluster:11871] MCW rank 2 bound to socket 1[core 8[hwt
>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], s
>>>>>>>>>>> ocket 1[core 11[hwt 0]]:
>>>>>>>>>>> 
>>>>> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 3 bound to socket 1[core 12[hwt
>>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 1[core 13[hwt 0]], socket 1[core 14[hwt 0]],
>>>>>>>>>>> socket 1[core 15[hwt 0]]:
>>>>>>>>>>> 
>>>>> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 4 bound to socket 2[core 16[hwt
>>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 2[core 17[hwt 0]], socket 2[core 18[hwt 0]],
>>>>>>>>>>> socket 2[core 19[hwt 0]]:
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 5 bound to socket 2[core 20[hwt
>>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 2[core 21[hwt 0]], socket 2[core 22[hwt 0]],
>>>>>>>>>>> socket 2[core 23[hwt 0]]:
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 6 bound to socket 3[core 24[hwt
>>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 3[core 25[hwt 0]], socket 3[core 26[hwt 0]],
>>>>>>>>>>> socket 3[core 27[hwt 0]]:
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 7 bound to socket 3[core 28[hwt
>>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 3[core 29[hwt 0]], socket 3[core 30[hwt 0]],
>>>>>>>>>>> socket 3[core 31[hwt 0]]:
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 0 bound to socket 0[core 0[hwt
>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>>>> cket 0[core 3[hwt 0]]:
>>>>>>>>>>> 
>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>> [node03.cluster:11871] MCW rank 1 bound to socket 0[core 4[hwt
>>>> 0]],
>>>>>>>>> socket
>>>>>>>>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], so
>>>>>>>>>>> cket 0[core 7[hwt 0]]:
>>>>>>>>>>> 
>>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On Dec 10, 2013, at 6:05 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I tried again with -cpus-per-proc 2 as shown below.
>>>>>>>>>>>>> Here, I found that "-map-by socket:span" worked well.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings
>>>>>>> -cpus-per-proc
>>>>>>>>> 2
>>>>>>>>>>>>> -map-by socket:span myprog
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 2 bound to socket 1[core 8
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 9[hwt 0]]: [./././././././.][B/B/././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 3 bound to socket 1[core 10
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 11[hwt 0]]: [./././././././.][././B/B
>>>>>>>>>>>>> /./././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 4 bound to socket 2[core 16
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 2[core 17[hwt 0]]: [./././././././.][./././.
>>>>>>>>>>>>> /./././.][B/B/./././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 5 bound to socket 2[core 18
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 2[core 19[hwt 0]]: [./././././././.][./././.
>>>>>>>>>>>>> /./././.][././B/B/./././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 6 bound to socket 3[core 24
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 3[core 25[hwt 0]]: [./././././././.][./././.
>>>>>>>>>>>>> /./././.][./././././././.][B/B/./././././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 7 bound to socket 3[core 26
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 3[core 27[hwt 0]]: [./././././././.][./././.
>>>>>>>>>>>>> /./././.][./././././././.][././B/B/./././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 0 bound to socket 0[core 0
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./././././.][././././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10879] MCW rank 1 bound to socket 0[core 2
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 3[hwt 0]]: [././B/B/./././.][././././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings
>>>>>>> -cpus-per-proc
>>>>>>>>> 2
>>>>>>>>>>>>> -map-by socket myprog
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 2 bound to socket 0[core 4
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 5[hwt 0]]: [././././B/B/./.][././././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 3 bound to socket 0[core 6
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 7[hwt 0]]: [././././././B/B][././././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 4 bound to socket 1[core 8
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 9[hwt 0]]: [./././././././.][B/B/././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 5 bound to socket 1[core 10
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 11[hwt 0]]: [./././././././.][././B/B
>>>>>>>>>>>>> /./././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 6 bound to socket 1[core 12
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 13[hwt 0]]: [./././././././.][./././.
>>>>>>>>>>>>> /B/B/./.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 7 bound to socket 1[core 14
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 15[hwt 0]]: [./././././././.][./././.
>>>>>>>>>>>>> /././B/B][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 0 bound to socket 0[core 0
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./././././.][././././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:10921] MCW rank 1 bound to socket 0[core 2
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 3[hwt 0]]: [././B/B/./././.][././././.
>>>>>>>>>>>>> /././.][./././././././.][./././././././.]
>>>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>>>> 
>>>>>>>>>>>>> "-np 8" and "-cpus-per-proc 4" just filled all sockets.
>>>>>>>>>>>>> In this case, I guess "-map-by socket:span" and "-map-by socket"
>>>>>>>>>>>>> have the same meaning.
>>>>>>>>>>>>> Therefore, there's no problem with that. Sorry for disturbing you.
>>>>>>>>>>>> 
>>>>>>>>>>>> No problem - glad you could clear that up :-)
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> By the way, through this test, I found another problem.
>>>>>>>>>>>>> Without the Torque manager, just using rsh, it causes the same
>>>>>>>>>>>>> error as shown below:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [mishima@manage openmpi-1.7]$ rsh node03
>>>>>>>>>>>>> Last login: Wed Dec 11 09:42:02 from manage
>>>>>>>>>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings
>>>>>>> -cpus-per-proc
>>>>>>>>> 4
>>>>>>>>>>>>> -map-by socket myprog
>>>>>>>>>>>> 
>>>>>>>>>>>> I don't understand the difference here - you are simply starting it
>>>>>>>>>>>> from a different node? It looks like everything is expected to run
>>>>>>>>>>>> local to mpirun, yes? So there is no rsh actually involved here.
>>>>>>>>>>>> Are you still running in an allocation?
>>>>>>>>>>>> 
>>>>>>>>>>>> If you run this with "-host node03" on the cmd line, do you
> see
>>>>> the
>>>>>>>>> same
>>>>>>>>>>> problem?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>>>>>> A request was made to bind to that would result in binding
> more
>>>>>>>>>>>>> processes than cpus on a resource:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Bind to:         CORE
>>>>>>>>>>>>> Node:            node03
>>>>>>>>>>>>> #processes:  2
>>>>>>>>>>>>> #cpus:          1
>>>>>>>>>>>>> 
>>>>>>>>>>>>> You can override this protection by adding the
>>>> "overload-allowed"
>>>>>>>>>>>>> option to your binding directive.
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>>>>>> [mishima@node03 demos]$
>>>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings
>>>>>>> -cpus-per-proc
>>>>>>>>> 4
>>>>>>>>>>>>> myprog
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 2 bound to socket 1[core 8
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], s
>>>>>>>>>>>>> ocket 1[core 11[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 3 bound to socket 1[core 12
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 1[core 13[hwt 0]], socket 1[core 14[hwt 0]],
>>>>>>>>>>>>> socket 1[core 15[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 4 bound to socket 2[core 16
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 2[core 17[hwt 0]], socket 2[core 18[hwt 0]],
>>>>>>>>>>>>> socket 2[core 19[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 5 bound to socket 2[core 20
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 2[core 21[hwt 0]], socket 2[core 22[hwt 0]],
>>>>>>>>>>>>> socket 2[core 23[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 6 bound to socket 3[core 24
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 3[core 25[hwt 0]], socket 3[core 26[hwt 0]],
>>>>>>>>>>>>> socket 3[core 27[hwt 0]]:
>>>>>>> 
> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 7 bound to socket 3[core 28
> [hwt
>>>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 3[core 29[hwt 0]], socket 3[core 30[hwt 0]],
>>>>>>>>>>>>> socket 3[core 31[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 0 bound to socket 0[core 0
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>>>>>> cket 0[core 3[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>> [node03.cluster:11036] MCW rank 1 bound to socket 0[core 4
> [hwt
>>>>> 0]],
>>>>>>>>>>> socket
>>>>>>>>>>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], so
>>>>>>>>>>>>> cket 0[core 7[hwt 0]]:
>>>>>>>>>>>>> 
>>>>>>> 
> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hmmm...that's strange. I only have 2 sockets on my system,
> but
>>>>> let
>>>>>>>>> me
>>>>>>>>>>>>> poke around a bit and see what might be happening.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Dec 10, 2013, at 4:47 PM, tmish...@jcity.maeda.co.jp
> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks. I didn't know the meaning of "socket:span".
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> But it still causes the problem; it seems that socket:span
>>>>>>>>>>>>>>> doesn't work.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> [mishima@manage demos]$ qsub -I -l nodes=node03:ppn=32
>>>>>>>>>>>>>>> qsub: waiting for job 8265.manage.cluster to start
>>>>>>>>>>>>>>> qsub: job 8265.manage.cluster ready
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> [mishima@node03 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>>>>>>>> [mishima@node03 demos]$ mpirun -np 8 -report-bindings
>>>>>>>>> -cpus-per-proc
>>>>>>>>>>> 4
>>>>>>>>>>>>>>> -map-by socket:span myprog
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 2 bound to socket 1[core 8
>>>> [hwt
>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], s
>>>>>>>>>>>>>>> ocket 1[core 11[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 3 bound to socket 1[core 12
>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 1[core 13[hwt 0]], socket 1[core 14[hwt 0]],
>>>>>>>>>>>>>>> socket 1[core 15[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 4 bound to socket 2[core 16
>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 2[core 17[hwt 0]], socket 2[core 18[hwt 0]],
>>>>>>>>>>>>>>> socket 2[core 19[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 5 bound to socket 2[core 20
>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 2[core 21[hwt 0]], socket 2[core 22[hwt 0]],
>>>>>>>>>>>>>>> socket 2[core 23[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 6 bound to socket 3[core 24
>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 3[core 25[hwt 0]], socket 3[core 26[hwt 0]],
>>>>>>>>>>>>>>> socket 3[core 27[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 7 bound to socket 3[core 28
>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 3[core 29[hwt 0]], socket 3[core 30[hwt 0]],
>>>>>>>>>>>>>>> socket 3[core 31[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 0 bound to socket 0[core 0
>>>> [hwt
>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>>>>>>>> cket 0[core 3[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>> [node03.cluster:10262] MCW rank 1 bound to socket 0[core 4
>>>> [hwt
>>>>>>> 0]],
>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], so
>>>>>>>>>>>>>>> cket 0[core 7[hwt 0]]:
>>>>>>>>>>>>>>> 
>>>>>>>>> 
>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> No, that is actually correct. We map a socket until full, then
>>>>>>>>>>>>>>>> move to the next. What you want is --map-by socket:span
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Dec 10, 2013, at 3:42 PM, tmishi...@jcity.maeda.co.jp wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I had time to try your patch yesterday using
>>>>>>>>>>>>>>>>> openmpi-1.7.4a1r29646.
>>>>>>>>>>>>>>>>> It stopped the error, but unfortunately "mapping by socket"
>>>>>>>>>>>>>>>>> itself didn't work well, as shown below:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> [mishima@manage demos]$ qsub -I -l nodes=1:ppn=32
>>>>>>>>>>>>>>>>> qsub: waiting for job 8260.manage.cluster to start
>>>>>>>>>>>>>>>>> qsub: job 8260.manage.cluster ready
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> [mishima@node04 ~]$ cd ~/Desktop/openmpi-1.7/demos/
>>>>>>>>>>>>>>>>> [mishima@node04 demos]$ mpirun -np 8 -report-bindings
>>>>>>>>>>> -cpus-per-proc
>>>>>>>>>>>>> 4
>>>>>>>>>>>>>>>>> -map-by socket myprog
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 2 bound to socket 1[core
> 8
>>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], s
>>>>>>>>>>>>>>>>> ocket 1[core 11[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 3 bound to socket 1[core
> 12
>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 1[core 13[hwt 0]], socket 1[core 14[hwt 0]],
>>>>>>>>>>>>>>>>> socket 1[core 15[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 4 bound to socket 2[core
> 16
>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 2[core 17[hwt 0]], socket 2[core 18[hwt 0]],
>>>>>>>>>>>>>>>>> socket 2[core 19[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 5 bound to socket 2[core
> 20
>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 2[core 21[hwt 0]], socket 2[core 22[hwt 0]],
>>>>>>>>>>>>>>>>> socket 2[core 23[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 6 bound to socket 3[core
> 24
>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 3[core 25[hwt 0]], socket 3[core 26[hwt 0]],
>>>>>>>>>>>>>>>>> socket 3[core 27[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 7 bound to socket 3[core
> 28
>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 3[core 29[hwt 0]], socket 3[core 30[hwt 0]],
>>>>>>>>>>>>>>>>> socket 3[core 31[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 0 bound to socket 0[core
> 0
>>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>>>>>>>>>> cket 0[core 3[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>> [node04.cluster:27489] MCW rank 1 bound to socket 0[core
> 4
>>>>> [hwt
>>>>>>>>> 0]],
>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], so
>>>>>>>>>>>>>>>>> cket 0[core 7[hwt 0]]:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>> Hello world from process 2 of 8
>>>>>>>>>>>>>>>>> Hello world from process 1 of 8
>>>>>>>>>>>>>>>>> Hello world from process 3 of 8
>>>>>>>>>>>>>>>>> Hello world from process 0 of 8
>>>>>>>>>>>>>>>>> Hello world from process 6 of 8
>>>>>>>>>>>>>>>>> Hello world from process 5 of 8
>>>>>>>>>>>>>>>>> Hello world from process 4 of 8
>>>>>>>>>>>>>>>>> Hello world from process 7 of 8
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I think this should be like this:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> rank 00
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>> rank 01
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>> rank 02
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I fixed this under the trunk (was an issue regardless of
>>>> RM)
>>>>>>> and
>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>> scheduled it for 1.7.4.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Nov 25, 2013, at 4:22 PM, tmish...@jcity.maeda.co.jp
>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thank you very much for your quick response.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I'm afraid to say that I found one more issue...
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> It's not so serious. Please check it when you have a
> lot
>>>> of
>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> The problem is -cpus-per-proc with the -map-by option under
>>>>>>>>>>>>>>>>>>> the Torque manager.
>>>>>>>>>>>>>>>>>>> It doesn't work as shown below. I guess you can get the same
>>>>>>>>>>>>>>>>>>> behaviour under the Slurm manager.
>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Of course, if I remove the -map-by option, it works quite
>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [mishima@manage testbed2]$ qsub -I -l nodes=1:ppn=32
>>>>>>>>>>>>>>>>>>> qsub: waiting for job 8116.manage.cluster to start
>>>>>>>>>>>>>>>>>>> qsub: job 8116.manage.cluster ready
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [mishima@node03 ~]$ cd ~/Ducom/testbed2
>>>>>>>>>>>>>>>>>>> [mishima@node03 testbed2]$ mpirun -np 8
> -report-bindings
>>>>>>>>>>>>>>> -cpus-per-proc
>>>>>>>>>>>>>>>>> 4
>>>>>>>>>>>>>>>>>>> -map-by socket mPre
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>>>>>>>>>>>> A request was made to bind to that would result in
>>>> binding
>>>>>>> more
>>>>>>>>>>>>>>>>>>> processes than cpus on a resource:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Bind to:         CORE
>>>>>>>>>>>>>>>>>>> Node:            node03
>>>>>>>>>>>>>>>>>>> #processes:  2
>>>>>>>>>>>>>>>>>>> #cpus:          1
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> You can override this protection by adding the
>>>>>>>>> "overload-allowed"
>>>>>>>>>>>>>>>>>>> option to your binding directive.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> --------------------------------------------------------------------------
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [mishima@node03 testbed2]$ mpirun -np 8
> -report-bindings
>>>>>>>>>>>>>>> -cpus-per-proc
>>>>>>>>>>>>>>>>> 4
>>>>>>>>>>>>>>>>>>> mPre
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 2 bound to socket 1
> [core
>>>> 8
>>>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], s
>>>>>>>>>>>>>>>>>>> ocket 1[core 11[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][B/B/B/B/./././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 3 bound to socket 1
> [core
>>>> 12
>>>>>>> [hwt
>>>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 1[core 13[hwt 0]], socket 1[core 14[hwt 0]],
>>>>>>>>>>>>>>>>>>> socket 1[core 15[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][././././B/B/B/B][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 4 bound to socket 2
> [core
>>>> 16
>>>>>>> [hwt
>>>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 2[core 17[hwt 0]], socket 2[core 18[hwt 0]],
>>>>>>>>>>>>>>>>>>> socket 2[core 19[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][./././././././.][B/B/B/B/./././.][./././././././.]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 5 bound to socket 2
> [core
>>>> 20
>>>>>>> [hwt
>>>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 2[core 21[hwt 0]], socket 2[core 22[hwt 0]],
>>>>>>>>>>>>>>>>>>> socket 2[core 23[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][./././././././.][././././B/B/B/B][./././././././.]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 6 bound to socket 3
> [core
>>>> 24
>>>>>>> [hwt
>>>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 3[core 25[hwt 0]], socket 3[core 26[hwt 0]],
>>>>>>>>>>>>>>>>>>> socket 3[core 27[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [./././././././.][./././././././.][./././././././.][B/B/B/B/./././.]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 7 bound to socket 3
> [core
>>>> 28
>>>>>>> [hwt
>>>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 3[core 29[hwt 0]], socket 3[core 30[hwt 0]],
>>>>>>>>>>>>>>>>>>> socket 3[core 31[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
> [./././././././.][./././././././.][./././././././.][././././B/B/B/B]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 0 bound to socket 0[core 0
>>>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>>>>>>>>>>>>>>>> cket 0[core 3[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [B/B/B/B/./././.][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>>>> [node03.cluster:18128] MCW rank 1 bound to socket 0
> [core
>>>> 4
>>>>>>> [hwt
>>>>>>>>>>> 0]],
>>>>>>>>>>>>>>>>> socket
>>>>>>>>>>>>>>>>>>> 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], so
>>>>>>>>>>>>>>>>>>> cket 0[core 7[hwt 0]]:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>> 
> [././././B/B/B/B][./././././././.][./././././././.][./././././././.]
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>> Regards,
>>>>>>>>>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Fixed and scheduled to move to 1.7.4. Thanks again!
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Nov 17, 2013, at 6:11 PM, Ralph Castain
>>>>>>> <r...@open-mpi.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks! That's precisely where I was going to look
> when
>>>> I
>>>>>>> had
>>>>>>>>>>>>>>> time :-)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I'll update tomorrow.
>>>>>>>>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Sun, Nov 17, 2013 at 7:01 PM,
>>>>>>>>>>>>>>>>>>>> <tmish...@jcity.maeda.co.jp> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Hi Ralph,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> This is the continuation of "Segmentation fault in
>>>>>>>>>>>>>>>>>>>> oob_tcp.c of openmpi-1.7.4a1r29646".
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I found the cause.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> First, I noticed that your hostfile works and mine does
>>>>>>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Your host file:
>>>>>>>>>>>>>>>>>>>> cat hosts
>>>>>>>>>>>>>>>>>>>> bend001 slots=12
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> My host file:
>>>>>>>>>>>>>>>>>>>> cat hosts
>>>>>>>>>>>>>>>>>>>> node08
>>>>>>>>>>>>>>>>>>>> node08
>>>>>>>>>>>>>>>>>>>> ...(total 8 lines)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I modified my script file to add "slots=1" to each
> line
>>>> of
>>>>>>> my
>>>>>>>>>>>>>>> hostfile
>>>>>>>>>>>>>>>>>>>> just before launching mpirun. Then it worked.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> My host file(modified):
>>>>>>>>>>>>>>>>>>>> cat hosts
>>>>>>>>>>>>>>>>>>>> node08 slots=1
>>>>>>>>>>>>>>>>>>>> node08 slots=1
>>>>>>>>>>>>>>>>>>>> ...(total 8 lines)
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Secondly, I confirmed that there's a slight difference
>>>>>>>>>>>>>>>>>>>> between orte/util/hostfile/hostfile.c of 1.7.3 and that of
>>>>>>>>>>>>>>>>>>>> 1.7.4a1r29646.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> $ diff hostfile.c.org ../../../../openmpi-1.7.3/orte/util/hostfile/hostfile.c
>>>>>>>>>>>>>>>>>>>> 394,401c394,399
>>>>>>>>>>>>>>>>>>>> <     if (got_count) {
>>>>>>>>>>>>>>>>>>>> <         node->slots_given = true;
>>>>>>>>>>>>>>>>>>>> <     } else if (got_max) {
>>>>>>>>>>>>>>>>>>>> <         node->slots = node->slots_max;
>>>>>>>>>>>>>>>>>>>> <         node->slots_given = true;
>>>>>>>>>>>>>>>>>>>> <     } else {
>>>>>>>>>>>>>>>>>>>> <         /* should be set by obj_new, but just to be
>>>>> clear
>>>>>>> */
>>>>>>>>>>>>>>>>>>>> <         node->slots_given = false;
>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>> if (!got_count) {
>>>>>>>>>>>>>>>>>>>>> if (got_max) {
>>>>>>>>>>>>>>>>>>>>>    node->slots = node->slots_max;
>>>>>>>>>>>>>>>>>>>>> } else {
>>>>>>>>>>>>>>>>>>>>>    ++node->slots;
>>>>>>>>>>>>>>>>>>>>>    }
>>>>>>>>>>>>>>>>>>>> ....
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Finally, I added line 402 below, just as a tentative
>>>>>>>>>>>>>>>>>>>> trial. Then it worked.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> cat -n orte/util/hostfile/hostfile.c:
>>>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>>>> 394      if (got_count) {
>>>>>>>>>>>>>>>>>>>> 395          node->slots_given = true;
>>>>>>>>>>>>>>>>>>>> 396      } else if (got_max) {
>>>>>>>>>>>>>>>>>>>> 397          node->slots = node->slots_max;
>>>>>>>>>>>>>>>>>>>> 398          node->slots_given = true;
>>>>>>>>>>>>>>>>>>>> 399      } else {
>>>>>>>>>>>>>>>>>>>> 400          /* should be set by obj_new, but just to
> be
>>>>>>> clear
>>>>>>>>>>> */
>>>>>>>>>>>>>>>>>>>> 401          node->slots_given = false;
>>>>>>>>>>>>>>>>>>>> 402          ++node->slots; /* added by tmishima */
>>>>>>>>>>>>>>>>>>>> 403      }
>>>>>>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Please fix the problem properly, because my change is just
>>>>>>>>>>>>>>>>>>>> based on a random guess. It's related to the treatment of a
>>>>>>>>>>>>>>>>>>>> hostfile where slots information is not given.
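To make the behavioral difference isolated by the diff above easier to see
outside the Open MPI tree, here is a self-contained sketch (illustrative
struct and function names, not the actual orte code) of how a bare hostfile
line such as "node08" affects the slot count under each version:

/* slots_sketch.c - illustrative only */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    int  slots;
    int  slots_max;
    bool slots_given;
} node_t;

/* 1.7.3-style: every bare hostfile line adds one slot to the node. */
static void finish_line_173(node_t *n, bool got_count, bool got_max)
{
    if (!got_count) {
        if (got_max) {
            n->slots = n->slots_max;
        } else {
            ++n->slots;             /* bare "node08" line => +1 slot */
        }
    }
}

/* 1.7.4a1r29646-style: a bare line only records that no count was given,
 * leaving the slot count untouched - which is what broke the hostfile. */
static void finish_line_174(node_t *n, bool got_count, bool got_max)
{
    if (got_count) {
        n->slots_given = true;
    } else if (got_max) {
        n->slots = n->slots_max;
        n->slots_given = true;
    } else {
        n->slots_given = false;     /* the workaround above adds ++n->slots here */
    }
}

int main(void)
{
    node_t a = {0, 0, false}, b = {0, 0, false};

    /* simulate a hostfile containing two bare "node08" lines */
    for (int i = 0; i < 2; i++) {
        finish_line_173(&a, false, false);
        finish_line_174(&b, false, false);
    }
    printf("1.7.3-style slots: %d, 1.7.4a1-style slots: %d\n", a.slots, b.slots);
    return 0;
}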
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Tetsuya Mishima
>>>>>>>>>>>>>>>>>>>> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
