Re: [O-MPI users] Infiniband performance problems (mvapi)

Mike Houston Mon, 31 Oct 2005 17:37:05 -0500

Better, but still having issues at lots of outstanding messages:

mpirun -np 2 -mca mpi_leave_pinned 1 -mca btl_mvapi_flags 2 mpi_bandwidth 1000 131072

131072  669.574904 (MillionBytes/sec)   638.556389(MegaBytes/sec)

mpirun -np 2 -mca mpi_leave_pinned 1 -mca btl_mvapi_flags 2 mpi_bandwidth 10000 131072

131072  115.873284 (MillionBytes/sec)   110.505375(MegaBytes/sec)
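For anyone comparing the two columns: the benchmark appears to report the same rate in units of 10^6 bytes/sec and 2^20 bytes/sec. A minimal sketch of that conversion and of the slowdown between the two runs, using only the figures quoted above (the function name is made up for illustration):

```python
# Figures copied from the two runs above (131072-byte messages).
MILLION_BYTES_1000 = 669.574904    # 1000 outstanding messages
MILLION_BYTES_10000 = 115.873284   # 10000 outstanding messages

def million_to_mega(rate_million_bytes):
    """Convert a rate in 10^6 bytes/sec to 2^20 bytes/sec."""
    return rate_million_bytes * 1_000_000 / (1 << 20)

# Both values should match the second column of the benchmark output.
print(f"{million_to_mega(MILLION_BYTES_1000):.6f}")
print(f"{million_to_mega(MILLION_BYTES_10000):.6f}")

# How much bandwidth is lost when going from 1000 to 10000 messages.
slowdown = MILLION_BYTES_1000 / MILLION_BYTES_10000
print(f"slowdown: {slowdown:.1f}x")
```

So the 10000-message run is a bit under six times slower than the 1000-message run at the same message size.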

Sorry to be such a pain... We need the speed of MVAPICH with the threading support of OpenMPI...


-Mike


Tim S. Woodall wrote:

Mike,

There appears to be an issue in our mvapi get protocol. To temporarily
disable this:

/u/twoodall> orterun -np 2 -mca mpi_leave_pinned 1 -mca btl_mvapi_flags 2 ./bw 25 131072
131072  801.580272 (MillionBytes/sec)   764.446518(MegaBytes/sec)


Mike Houston wrote:
What's the ETA, or should I try grabbing from cvs?

-Mike

Tim S. Woodall wrote:
Mike,

I believe this was probably corrected today and should be in the
next release candidate.

Thanks,
Tim

Mike Houston wrote:
Whoops, spoke too soon. The performance quoted was not actually going between nodes. Actually using the network with the pinned option gives:
[0,1,0][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress] [0,1,1][btl_mvapi_component.c:631:mca_btl_mvapi_component_progress]
Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb74a1c18
Got error : VAPI_WR_FLUSH_ERR, Vendor code : 0 Frag : 0xb73e1720
repeated many times.

-Mike

Mike Houston wrote:
That seems to work with the pinning option enabled. THANKS! Now I'll go back to testing my real code. I'm getting 700MB/s for messages >=128KB. This is a little bit lower than MVAPICH, 10-20%, but still pretty darn good. My guess is that I can play with the settings more to tweak up performance. Now if I can get the tcp layer working, I'm pretty much good to go.

Any word on an SDP layer? I can probably modify the tcp layer quickly to do SDP, but I thought I would ask.

-Mike

Tim S. Woodall wrote:
Hello Mike,

Mike Houston wrote:
When only sending a few messages, we get reasonably good IB performance, ~500MB/s (MVAPICH is 850MB/s). However, if I crank the number of messages up, we drop to 3MB/s(!!!). This is with the OSU NBCL mpi_bandwidth test. We are running Mellanox IB Gold 1.8 with 3.3.3 firmware on PCI-X (Cougar) boards. Everything works with MVAPICH, but we really need the thread support in OpenMPI.

Ideas? I noticed there are a plethora of runtime options configurable for mvapi. Do I need to tweak these to get performance up?

You might try running with:

mpirun -mca mpi_leave_pinned 1

which will cause the mvapi port to maintain an MRU cache of registrations,
rather than dynamically pinning/unpinning memory.
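To illustrate why caching registrations helps when the same buffer is sent repeatedly, here is a toy model of such a cache. It is only a sketch and not Open MPI's implementation: the class name, the fixed capacity, and the choice to evict the least recently used entry are all assumptions made for the example.

```python
from collections import OrderedDict

class RegistrationCache:
    """Toy registration cache (hypothetical; not the mvapi code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # (addr, length) -> fake handle
        self.pins = 0                 # number of simulated pin operations
        self.unpins = 0               # number of simulated unpin operations

    def register(self, addr, length):
        key = (addr, length)
        if key in self.entries:
            # Cache hit: mark as most recently used, no pinning needed.
            self.entries.move_to_end(key)
            return self.entries[key]
        if len(self.entries) >= self.capacity:
            # Cache full: drop the least recently used registration.
            self.entries.popitem(last=False)
            self.unpins += 1
        self.pins += 1
        handle = self.pins            # stand-in for a real memory handle
        self.entries[key] = handle
        return handle

cache = RegistrationCache(capacity=2)
for _ in range(1000):
    cache.register(0x1000, 131072)    # same buffer sent 1000 times
print(cache.pins, cache.unpins)       # prints "1 0"
```

Without the cache, each of the 1000 sends in this model would cost one pin and one unpin; with it, the buffer is pinned once and reused.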

If this does not resolve the BW problems, try increasing the
resources allocated to each connection:

-mca btl_mvapi_rd_min 128
-mca btl_mvapi_rd_max 256
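When trying several combinations of these parameters it can be handy to generate the command line rather than retype it. A small sketch, assuming the benchmark binary is ./mpi_bandwidth in the current directory and takes the message count and size as arguments as in the runs quoted in this thread; the helper name is made up for illustration:

```python
import shlex

def mpirun_command(nprocs, mca_params, program, program_args):
    """Build an mpirun argv with one '-mca name value' triple per parameter."""
    argv = ["mpirun", "-np", str(nprocs)]
    for name, value in mca_params.items():
        argv += ["-mca", name, str(value)]
    argv.append(program)
    argv += [str(arg) for arg in program_args]
    return argv

# Parameter names and values are the ones suggested in this thread.
params = {
    "mpi_leave_pinned": 1,
    "btl_mvapi_rd_min": 128,
    "btl_mvapi_rd_max": 256,
}

# "./mpi_bandwidth" is an assumed path to the benchmark binary.
argv = mpirun_command(2, params, "./mpi_bandwidth", [1000, 131072])
print(shlex.join(argv))
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting issues when sweeping over values.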

Also can you forward me a copy of the test code or a reference to it?

Thanks,
Tim
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users