I don't think so. The Mellanox change that caused this issue should not be in 1.6.
-Nathan

On Fri, Jun 21, 2013 at 05:18:16PM +0000, Jeff Squyres (jsquyres) wrote:
> Does this need to go to v1.6?
>
> On Jun 21, 2013, at 11:59 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
>
> > Found my original fix (still don't know why I never pushed it) and I think
> > George is correct. This should work in both the single and multiple get
> > cases.
> >
> > -Nathan
> >
> > On Fri, Jun 21, 2013 at 05:52:28PM +0200, George Bosilca wrote:
> >> The amount of bytes received is atomically updated on the completion
> >> callback, and the completion test is clearly spelled out in the
> >> recv_request_pml_complete_check function (of course minus the lock part).
> >> Rolf, I think your patch is correct.
> >>
> >> That being said, req_bytes_expected is a special value, one that should
> >> only be used to check for truncation. Otherwise, req_bytes_packed is the
> >> value we should compare against.
> >>
> >> George.
> >>
> >> On Jun 21, 2013, at 17:40 , Nathan Hjelm <hje...@lanl.gov> wrote:
> >>
> >>> I thought I fixed this problem a while back (though looking at the code
> >>> it's possible I never committed the fix). I will have to look through my
> >>> local repository and see what happened to that fix. Your fix might not
> >>> work correctly since an RGET can be broken up into multiple get
> >>> operations. It may work, I would just need to test it to make sure.
> >>>
> >>> -Nathan
> >>>
> >>> On Fri, Jun 21, 2013 at 08:25:29AM -0700, Rolf vandeVaart wrote:
> >>>> I ran into a hang in a test in which the sender sends less data than the
> >>>> receiver is expecting. For example, the following shows the receiver
> >>>> expecting twice what the sender is sending.
> >>>>
> >>>> Rank 0: MPI_Send(buf, BUFSIZE, MPI_INT, 1, 99, MPI_COMM_WORLD)
> >>>> Rank 1: MPI_Recv(buf, BUFSIZE*2, MPI_INT, 0, 99, MPI_COMM_WORLD)
> >>>>
> >>>> This is also reproducible using one of the intel tests and adjusting the
> >>>> eager value for the openib BTL.
> >>>>
> >>>> mpirun -np 2 -host frick,frack -mca btl_openib_eager_limit 56
> >>>> MPI_Send_overtake_c
> >>>>
> >>>> In most cases, this works just fine. However, when the PML protocol
> >>>> used is the RGET protocol, the test hangs. Below is a proposed fix for
> >>>> this issue.
> >>>> I believe we want to be checking against req_bytes_packed rather than
> >>>> req_bytes_expected as req_bytes_expected is what the user originally
> >>>> told us.
> >>>> Otherwise, with the current code, we never send a FIN message back to
> >>>> the sender.
> >>>>
> >>>> Any thoughts?
> >>>>
> >>>> [rvandevaart@sm065 ompi-trunk]$ svn diff ompi/mca/pml/ob1/pml_ob1_recvreq.c
> >>>> Index: ompi/mca/pml/ob1/pml_ob1_recvreq.c
> >>>> ===================================================================
> >>>> --- ompi/mca/pml/ob1/pml_ob1_recvreq.c (revision 28633)
> >>>> +++ ompi/mca/pml/ob1/pml_ob1_recvreq.c (working copy)
> >>>> @@ -335,7 +335,7 @@
> >>>>      /* is receive request complete */
> >>>>      OPAL_THREAD_ADD_SIZE_T(&recvreq->req_bytes_received, frag->rdma_length);
> >>>> -    if (recvreq->req_bytes_expected <= recvreq->req_bytes_received) {
> >>>> +    if (recvreq->req_recv.req_bytes_packed <= recvreq->req_bytes_received) {
> >>>>          mca_pml_ob1_send_fin(recvreq->req_recv.req_base.req_proc,
> >>>>                               bml_btl,
> >>>>                               frag->rdma_hdr.hdr_rget.hdr_des,
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
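
For anyone reading this thread later, here is a minimal sketch of why the original check hangs. It is not the ob1 code: toy_recv_request, toy_get_complete and toy_rget_completes are hypothetical names made up for illustration. It also assumes, based on the discussion above, that req_bytes_packed holds the size the sender actually sends while req_bytes_expected holds the size the user passed to MPI_Recv.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a receive request.
 * Only the three byte counters discussed in the thread are modeled. */
typedef struct {
    size_t bytes_expected; /* assumed: size the user passed to MPI_Recv */
    size_t bytes_packed;   /* assumed: size the sender actually sends */
    size_t bytes_received; /* running total, updated as each get completes */
} toy_recv_request;

/* Account for one completed get and report whether a FIN would be sent.
 * use_packed selects the proposed check; otherwise the old check is used. */
static bool toy_get_complete(toy_recv_request *req, size_t fragment_len,
                             bool use_packed)
{
    req->bytes_received += fragment_len;
    size_t target = use_packed ? req->bytes_packed : req->bytes_expected;
    return target <= req->bytes_received;
}

/* Simulate an RGET of `sent` bytes split into `nfrags` get operations
 * (nfrags >= 1). Returns true if completion is detected on the last get. */
static bool toy_rget_completes(size_t sent, size_t expected, size_t nfrags,
                               bool use_packed)
{
    toy_recv_request req = {
        .bytes_expected = expected,
        .bytes_packed = sent,
        .bytes_received = 0
    };
    bool done = false;
    size_t remaining = sent;

    for (size_t i = 0; i < nfrags; i++) {
        /* the last get picks up whatever the even split left over */
        size_t len = (i + 1 == nfrags) ? remaining : sent / nfrags;
        remaining -= len;
        done = toy_get_complete(&req, len, use_packed);
    }
    return done;
}
```

With sent = 1024 and expected = 2048, toy_rget_completes returns false under the old check (no FIN is ever sent, so the sender waits) and true under the proposed check, whether the RGET is done as one get or split into several.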