Re: [OMPI users] performance abnormality with openib and tcp framework

2018-05-15 Thread Blade Shieh
Hi Gilles,
Thank you for pointing out my error with *-N*.
And you are right that I had started the opensmd service earlier, so the link
could come up correctly. But many IB-related commands, such as ibhosts and
ibdiagnet, still cannot be executed correctly.
As for the pml, I am pretty sure I was using ob1, because ompi_info shows
there is no ucx or mxm, and ob1 has the highest priority.
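For reference, here is a small sketch of how the installed pml/btl components can be re-checked. ompi_info and the *_base_verbose MCA parameters are standard Open MPI; the grep pattern is only an assumption about ompi_info's output format, and the function is guarded so it degrades gracefully on a machine without Open MPI:

```shell
# Sketch: list the pml and btl components this Open MPI install provides.
list_components() {
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info | grep -Ei "MCA (pml|btl)"
    else
        echo "ompi_info not found: is Open MPI on PATH?"
    fi
}
list_components
# At run time, adding "--mca pml_base_verbose 100" (or btl_base_verbose)
# to mpirun logs which component is actually selected.
```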

Best regards,
Xie Bin

Gilles Gouaillardet wrote on Tue, May 15, 2018 at 10:09:

> Xie Bin,
>
>
> According to the man page, -N is equivalent to npernode, which is
> equivalent to --map-by ppr:N:node.
>
> This is *not* equivalent to --map-by node:
>
> The former packs tasks onto the same node, while the latter scatters tasks
> across the nodes.
>
>
> [gilles@login ~]$ mpirun --host n0:2,n1:2 -N 2 --tag-output hostname |
> sort
> [1,0]:n0
> [1,1]:n0
> [1,2]:n1
> [1,3]:n1
>
>
> [gilles@login ~]$ mpirun --host n0:2,n1:2 -np 4 --tag-output -map-by
> node hostname | sort
> [1,0]:n0
> [1,1]:n1
> [1,2]:n0
> [1,3]:n1
>
>
> I am pretty sure a subnet manager was run at some point in time (so your
> HCAs could get their identifiers).
>
> /* feel free to reboot your nodes and see if ibstat still shows the
> adapters as active */
>
>
> Note you might also use --mca pml ob1 in order to make sure that neither
> mxm nor ucx is used.
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 5/15/2018 10:45 AM, Blade Shieh wrote:
> > Hi, George:
> > My command lines are:
> > 1) single node
> > mpirun --allow-run-as-root -mca btl self,tcp(or openib) -mca
> > btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x
> > OMP_NUM_THREADS=2 -n 32 myapp
> > 2) 2-node cluster
> > mpirun --allow-run-as-root -mca btl ^tcp(or ^openib) -mca
> > btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x
> > OMP_NUM_THREADS=4 -N 16 myapp
> >
> > In the 2nd case, I used -N, which is equivalent to --map-by node.
> >
> > Best regards,
> > Xie Bin
> >
> >
> > George Bosilca <bosi...@icl.utk.edu> wrote on Tue, May 15, 2018 at 02:07:
> >
> > Shared memory communication is important for multi-core platforms,
> > especially when you have multiple processes per node. But this is
> > only part of your issue here.
> >
> > You haven't specified how your processes will be mapped on your
> > resources. As a result, ranks 0 and 1 will be on the same node, so
> > you are testing the shared memory support of whatever BTL you
> > allow. In this case the performance will be much better for TCP
> > than for IB, simply because you are not using your network, but
> > its capacity to move data across memory banks. In such an
> > environment, TCP translates to a memcpy plus a system call, which
> > is much faster than IB. That being said, it should not matter
> > because shared memory is there to cover this case.
> >
> > Add "--map-by node" to your mpirun command to measure the
> > bandwidth between nodes.
> >
> >   George.
> >
> >
> >
> > On Mon, May 14, 2018 at 5:04 AM, Blade Shieh <bladesh...@gmail.com> wrote:
> >
> > Hi, Nathan:
> > Thanks for your reply.
> > 1) It was my mistake not to notice the usage of osu_latency. Now
> > it works well, but still poorer with openib.
> > 2) I did not use sm or vader because I wanted to compare the
> > performance of tcp and openib. Besides, I will run the
> > application on a cluster, so vader is not so important.
> > 3) Of course, I tried your suggestions. I used ^tcp/^openib and
> > set btl_openib_if_include to mlx5_0 in a two-node cluster (IB
> > direct-connected). The result did not change -- IB is still
> > better in the MPI benchmarks but poorer in my application.
> >
> > Best Regards,
> > Xie Bin
> >
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] performance abnormality with openib and tcp framework

2018-05-14 Thread Blade Shieh
Hi, George:
My command lines are:
1) single node
mpirun --allow-run-as-root -mca btl self,tcp(or openib) -mca
btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x
OMP_NUM_THREADS=2 -n 32 myapp
2) 2-node cluster
mpirun --allow-run-as-root -mca btl ^tcp(or ^openib) -mca
btl_tcp_if_include eth2 -mca btl_openib_if_include mlx5_0 -x
OMP_NUM_THREADS=4 -N 16 myapp

In the 2nd case, I used -N, which is equivalent to --map-by node.
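As Gilles explains elsewhere in this thread, -N (i.e. --map-by ppr:N:node) actually places ranks differently from --map-by node. A small plain-Python illustration of the two policies (this is a model, not Open MPI code; node names n0/n1 are placeholders):

```python
# Two rank-placement policies, modeled as pure functions.
def map_ppr(nranks, nodes, per_node):
    """-N / --npernode: pack `per_node` consecutive ranks onto each node."""
    return [nodes[r // per_node] for r in range(nranks)]

def map_by_node(nranks, nodes):
    """--map-by node: deal ranks out round-robin across the nodes."""
    return [nodes[r % len(nodes)] for r in range(nranks)]

print(map_ppr(4, ["n0", "n1"], 2))   # ['n0', 'n0', 'n1', 'n1'] -- packed
print(map_by_node(4, ["n0", "n1"]))  # ['n0', 'n1', 'n0', 'n1'] -- scattered
```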

Best regards,
Xie Bin


George Bosilca wrote on Tue, May 15, 2018 at 02:07:

> Shared memory communication is important for multi-core platforms,
> especially when you have multiple processes per node. But this is only part
> of your issue here.
>
> You haven't specified how your processes will be mapped on your resources.
> As a result, ranks 0 and 1 will be on the same node, so you are testing the
> shared memory support of whatever BTL you allow. In this case the
> performance will be much better for TCP than for IB, simply because you are
> not using your network, but its capacity to move data across memory banks.
> In such an environment, TCP translates to a memcpy plus a system call,
> which is much faster than IB. That being said, it should not matter because
> shared memory is there to cover this case.
>
> Add "--map-by node" to your mpirun command to measure the bandwidth
> between nodes.
>
>   George.
>
>
>
> On Mon, May 14, 2018 at 5:04 AM, Blade Shieh  wrote:
>
>>
>> Hi, Nathan:
>> Thanks for your reply.
>> 1) It was my mistake not to notice the usage of osu_latency. Now it works
>> well, but still poorer with openib.
>> 2) I did not use sm or vader because I wanted to compare the performance
>> of tcp and openib. Besides, I will run the application on a cluster, so
>> vader is not so important.
>> 3) Of course, I tried your suggestions. I used ^tcp/^openib and set
>> btl_openib_if_include to mlx5_0 in a two-node cluster (IB
>> direct-connected). The result did not change -- IB is still better in the
>> MPI benchmarks but poorer in my application.
>>
>> Best Regards,
>> Xie Bin
>>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] performance abnormality with openib and tcp framework

2018-05-14 Thread Blade Shieh
Hi, John:

You are right about the network setup. I indeed have no IB switch and just
connect the servers with an IB cable. I did not even start the opensmd
service, because it seemed unnecessary in this situation. Could this be the
reason why IB performs poorly?

Interconnection details are in the attachment.
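For what it is worth, the fabric checks John suggests below can be scripted roughly as follows. This is a sketch only: the tool names are the standard OFED utilities, and the loop is guarded so it is safe to paste on a node where they are missing:

```shell
# Sketch: verify a switchless (back-to-back) IB link.
check_ib_tools() {
    for cmd in ibstat sminfo ibhosts ibdiagnet; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "== $cmd =="
            "$cmd" || echo "$cmd reported an error"
        else
            echo "$cmd not found (install the OFED userspace tools)"
        fi
    done
}
check_ib_tools
# Even with a direct cable, one of the two hosts must run a subnet
# manager, e.g.: systemctl enable --now opensmd
```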



Best Regards,

Xie Bin


John Hearns via users wrote on Mon, May 14, 2018 at 17:45:

> Xie Bin,  I do hate to ask this.  You say  "in a two-node cluster (IB
> direcet-connected). "
> Does that mean that you have no IB switch, and that there is a single IB
> cable joining up these two servers?
> If so, please run: ibstatus, ibhosts, and ibdiagnet.
> I am trying to check whether the IB fabric is functioning properly in that
> situation.
> (We also need to check whether a Subnet Manager is running - so run sminfo.)
>
> But you do say that the IMB test gives good results for IB, so you must
> have IB working properly.
> Therefore I am an idiot...
>
>
>
> On 14 May 2018 at 11:04, Blade Shieh  wrote:
>
>>
>> Hi, Nathan:
>> Thanks for your reply.
>> 1) It was my mistake not to notice the usage of osu_latency. Now it works
>> well, but still poorer with openib.
>> 2) I did not use sm or vader because I wanted to compare the performance
>> of tcp and openib. Besides, I will run the application on a cluster, so
>> vader is not so important.
>> 3) Of course, I tried your suggestions. I used ^tcp/^openib and set
>> btl_openib_if_include to mlx5_0 in a two-node cluster (IB
>> direct-connected). The result did not change -- IB is still better in the
>> MPI benchmarks but poorer in my application.
>>
>> Best Regards,
>> Xie Bin
>>


IB-direct-connect.tgz
Description: application/gzip
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] performance abnormality with openib and tcp framework

2018-05-14 Thread Blade Shieh
Hi, Nathan:
Thanks for your reply.
1) It was my mistake not to notice the usage of osu_latency. Now it works
well, but still poorer with openib.
2) I did not use sm or vader because I wanted to compare the performance of
tcp and openib. Besides, I will run the application on a cluster, so vader is
not so important.
3) Of course, I tried your suggestions. I used ^tcp/^openib and set
btl_openib_if_include to mlx5_0 in a two-node cluster (IB
direct-connected). The result did not change -- IB is still better in the
MPI benchmarks but poorer in my application.

Best Regards,
Xie Bin
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] performance abnormality with openib and tcp framework

2018-05-13 Thread Blade Shieh
/** The problem ***/

I have a cluster with 10GbE Ethernet and 100Gb InfiniBand. While running my
application, CAMx, I found that the performance with IB is not as good as
with Ethernet. That is confusing, because IB latency and bandwidth are
undoubtedly better than Ethernet's, as confirmed by the MPI benchmarks
IMB-MPI1 and OSU.



/** software stack ***/

centos7.4 with kernel 4.11.0-45.6.1.el7a.aarch64

MLNX_OFED_LINUX-4.3-1.0.1.0 from
http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers

gnu7.3 from OpenHPC release.   yum install
gnu7-compilers-ohpc-7.3.0-43.1.aarch64

openmpi3 from OpenHPC release.  yum install
openmpi3-gnu7-ohpc-3.0.0-36.4.aarch64

CAMx 6.4.0 from http://www.camx.com/

IMB from https://github.com/intel/mpi-benchmarks

OSU from http://mvapich.cse.ohio-state.edu/benchmarks/





/** command lines ***/



(time mpirun --allow-run-as-root -mca btl self,openib  -x OMP_NUM_THREADS=2
-n 32 -mca btl_tcp_if_include eth2
../../src/CAMx.v6.40.openMPI.gfortranomp.ompi) > camx_openib_log 2>&1

(time mpirun --allow-run-as-root -mca btl self,tcp  -x OMP_NUM_THREADS=2 -n
32 -mca btl_tcp_if_include eth2
../../src/CAMx.v6.40.openMPI.gfortranomp.ompi) > camx_tcp_log 2>&1



(time mpirun --allow-run-as-root -mca btl self,openib  -x OMP_NUM_THREADS=2
-n 32 -mca btl_tcp_if_include eth2 IMB-MPI1 allreduce -msglog 8 -npmin
1000) > IMB_openib_log 2>&1

(time mpirun --allow-run-as-root -mca btl self,tcp  -x OMP_NUM_THREADS=2 -n
32 -mca btl_tcp_if_include eth2 IMB-MPI1 allreduce -msglog 8 -npmin 1000) >
IMB_tcp_log 2>&1



(time mpirun --allow-run-as-root -mca btl self,openib  -x OMP_NUM_THREADS=2
-n 32 -mca btl_tcp_if_include eth2 osu_latency) > osu_openib_log 2>&1

(time mpirun --allow-run-as-root -mca btl self,tcp  -x OMP_NUM_THREADS=2 -n
32 -mca btl_tcp_if_include eth2 osu_latency) > osu_tcp_log 2>&1



/** about openmpi and network config ***/



Please refer to relevant log files in the attachment.



*Best Regards,*

*Xie Bin*


ompi_support.tar.bz2
Description: application/bzip2
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users