Looks like there is no issue in 1.8.4 except for the message coalescing bug. Ralph, Howard, and I agree that disabling message coalescing for 1.8.4 is the safest way forward; we can back-port the real fix for an eventual 1.8.5. Message rates in the openib btl no longer seem to depend on message coalescing: we beat mvapich handily without the feature.
-Nathan

On Tue, Nov 04, 2014 at 10:27:56PM +0000, Jeff Squyres (jsquyres) wrote:
> That sounds fine, but I think Steve's point is that he is being bitten by
> this bug now, so it would probably be good to even include this one
> particular fix in 1.8.4.
>
>
> On Nov 4, 2014, at 5:24 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
>
> > Going to put the RFC out today with a timeout of about 2 weeks. This
> > will give me some time to talk with other Open MPI developers
> > face-to-face at SC14.
> >
> > If the RFC fails I will still bring that and a couple of other fixes
> > into the master.
> >
> > -Nathan
> >
> > On Tue, Nov 04, 2014 at 04:06:45PM -0600, Steve Wise wrote:
> >> Ok, sounds like I should let you continue the good work! :) When do
> >> you plan to merge this into ompi proper?
> >>
> >> On 11/4/2014 3:58 PM, Nathan Hjelm wrote:
> >>
> >> That certainly addresses part of the problem. I am working on a
> >> complete revamp of the btl RDMA interface. It contains this fix:
> >>
> >> https://github.com/hjelmn/ompi/commit/66fa429e306beb9fca59da0a4554e9b98d788316
> >>
> >> -Nathan
> >>
> >> On Tue, Nov 04, 2014 at 03:27:23PM -0600, Steve Wise wrote:
> >>
> >> I found the bug.
> >> Here is the fix:
> >>
> >> [root@stevo1 openib]# git diff
> >> diff --git a/opal/mca/btl/openib/btl_openib_component.c b/opal/mca/btl/openib/btl_openib_component.c
> >> index d876e21..8a5ea82 100644
> >> --- a/opal/mca/btl/openib/btl_openib_component.c
> >> +++ b/opal/mca/btl/openib/btl_openib_component.c
> >> @@ -1960,9 +1960,8 @@ static int init_one_device(opal_list_t *btl_list, struct ibv_device* ib_dev)
> >>      }
> >>
> >>      /* If the MCA param was specified, skip all the checks */
> >> -    if ( MCA_BASE_VAR_SOURCE_COMMAND_LINE ||
> >> -         MCA_BASE_VAR_SOURCE_ENV ==
> >> -         mca_btl_openib_component.receive_queues_source) {
> >> +    if (MCA_BASE_VAR_SOURCE_COMMAND_LINE == mca_btl_openib_component.receive_queues_source ||
> >> +        MCA_BASE_VAR_SOURCE_ENV == mca_btl_openib_component.receive_queues_source) {
> >>          goto good;
> >>      }
> >>
> >> On 11/4/2014 3:08 PM, Nathan Hjelm wrote:
> >>
> >> I have run into the issue as well. I will open a pull request for 1.8.4
> >> as part of a patch fixing the coalescing issues.
> >>
> >> -Nathan
> >>
> >> On Tue, Nov 04, 2014 at 02:50:30PM -0600, Steve Wise wrote:
> >>
> >> On 11/4/2014 2:09 PM, Steve Wise wrote:
> >>
> >> Hi,
> >>
> >> I'm running ompi top-of-tree from github and seeing an openib btl issue
> >> where the qp/srq configuration is incorrect for the given device id.
> >> This works fine in 1.8.4rc1, but I see the problem in top-of-tree. A
> >> simple 2-node IMB-MPI1 pingpong fails to get the ranks set up. I see
> >> this logged:
> >>
> >> /opt/ompi-trunk/bin/mpirun --allow-run-as-root --np 2 --host stevo1,stevo2 --mca btl openib,sm,self /opt/ompi-trunk/bin/IMB-MPI1 pingpong
> >>
> >> Adding this works around the issue:
> >>
> >>     --mca btl_openib_receive_queues P,65536,64
> >>
> >> I also confirmed that opal_btl_openib_ini_query() is getting the
> >> correct receive_queues string from the .ini file on both nodes for the
> >> cxgb4 device...
> >>
> >> <snip>
> >>
> >> --------------------------------------------------------------------------
> >> The Open MPI receive queue configuration for the OpenFabrics devices
> >> on two nodes are incompatible, meaning that MPI processes on two
> >> specific nodes were unable to communicate with each other. This
> >> generally happens when you are using OpenFabrics devices from
> >> different vendors on the same network. You should be able to use the
> >> mca_btl_openib_receive_queues MCA parameter to set a uniform receive
> >> queue configuration for all the devices in the MPI job, and therefore
> >> be able to run successfully.
> >>
> >>   Local host:      stevo2
> >>   Local adapter:   cxgb4_0 (vendor 0x1425, part ID 21520)
> >>   Local queues:    P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64
> >>
> >>   Remote host:     stevo1
> >>   Remote adapter:  (vendor 0x1425, part ID 21520)
> >>   Remote queues:   P,65536,64
> >> --------------------------------------------------------------------------
> >>
> >> The stevo1 rank has the correct queue settings: P,65536,64. For some
> >> reason, stevo2 has the wrong settings, even though it has the correct
> >> device id info.
> >>
> >> Any suggestions on debugging this? Like where to dig in the src to see
> >> if somehow the .ini parsing is broken...
> >>
> >> Thanks,
> >>
> >> Steve.
> >> _______________________________________________
> >> devel mailing list
> >> de...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >> Link to this post:
> >> http://www.open-mpi.org/community/lists/devel/2014/11/16179.php
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/