Thank you for taking the time to investigate this, Jeff. SC is a hectic and stressful time for everyone on this list, with many deadlines looming. This bug isn't a priority for us; however, it seems to me that your original revert, the one that simply disables threading by default (and for good reason), is a blocker for the 1.8.4 release tomorrow. Therefore, I'm going to once again suggest that unless Nathan finds a solution by COB today, we live with the error that was made back in 1.8.1 and punt on this until 1.9. The current state of the 1.8.4 prerelease is not acceptable with this commit.
Thanks to Alina for bringing this issue to light.

Josh

On Thursday, November 6, 2014, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> This thread digressed significantly from the original bug report; I did
> not realize that the discussion was revolving around the fact that
> MPI_THREAD_MULTIPLE no longer works *at all*.
>
> So here's where we are:
>
> 1. MPI_THREAD_MULTIPLE doesn't work, even if you
>    --enable-mpi-thread-multiple
>
> 2. It seems that 2c8087d10b10e0efea176db8907de2720a55454e and
>    09b867374e9618007b81bfaf674ec6df04548bed need to be reverted (in that order)
>
> 3. That restores MPI_THREAD_MULTIPLE functionality (if you
>    --enable-mpi-thread-multiple)
>
> 4. However, this brings back the performance problem, too
>
> I've looked at this all day so far, and am unfortunately just out of time
> -- I have some crushing SC deadlines that I *must* meet. :-(
>
> Nathan will be picking up where I left off later today to see if there's a
> simple way to fix just the performance issue for the non-THREAD_MULTIPLE
> cases.
>
> On Nov 4, 2014, at 12:15 PM, Alina Sklarevich <ali...@dev.mellanox.co.il> wrote:
>
> > Hi,
> >
> > We observe a hang when running the multi-threading support test
> > "latency.c" (attached to this report), which uses MPI_THREAD_MULTIPLE.
> >
> > The hang happens immediately at the beginning of the test and is
> > reproduced in the v1.8 release branch.
> >
> > The command line to reproduce the behavior is:
> >
> > $ mpirun --map-by node --bind-to core -display-map -np 2 -mca pml ob1 -mca btl tcp,self ./thread-tests-1.1/latency
> >
> > The last commit with which the hang doesn't reproduce is:
> > commit: e4d4266d9c69e
> >
> > And problems begin after commit:
> >
> > commit 09b867374e9618007b81bfaf674ec6df04548bed
> > Author: Jeff Squyres <jsquy...@cisco.com>
> > Date: Fri Oct 31 12:42:50 2014 -0700
> >
> > Revert most of open-mpi/ompi@6ef938de3fa9ca0fed2c5bcb0736f65b0d8803af
> >
> > Is this expected behavior? In other words, should we not expect any
> > stable release in the 1.8.x series to be able to use MPI_THREAD_MULTIPLE
> > with even the TCP and SM BTLs?
> >
> > Please advise.
> >
> > Thanks,
> > Alina.
> >
> > <latency.c>
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-mpi.org/community/lists/devel/2014/11/16175.php
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/11/16243.php