FWIW: I’m not planning to release tomorrow, as we aren’t ready. We won’t 
release with a bug as bad as threading being on by default, since we know we 
can’t really support that configuration.

Nothing sacred about the release date - it’s just a target.

Frankly, I would even listen to the argument of flat-out disabling thread 
multiple vs forcing thread support, given the state of our support for thread 
multiple. Still, we have time to find a middle ground.

> On Nov 6, 2014, at 12:48 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
> On Nov 6, 2014, at 3:39 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:
> 
>> Thank you for taking the time to investigate this, Jeff. SC is a hectic and 
>> stressful time for everyone on this list, with many deadlines looming. This 
>> bug isn't a priority for us; however, it seems to me that your original 
>> revert, the one that simply disables threading by default (and for 
>> good reason), is a blocker for the 1.8.4 release tomorrow.
> 
> I would tend to agree.
> 
>> Therefore, I'm going to once again suggest that unless Nathan finds a 
>> solution by COB today, we live with the error that was made back in 1.8.1 
>> and punt on this until 1.9.
> 
> I would still disagree.  The performance bug must be fixed.
> 
> The release needs to be delayed, IMHO.
> 
>> The current state of the 1.8.4 prerelease is not acceptable with this 
>> commit. 
> 
> Yes, I got it.
> 
>> Thanks to Alina for bringing this issue to light.
>> 
>> Josh
>> 
>> 
>> On Thursday, November 6, 2014, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
>> wrote:
>> This thread digressed significantly from the original bug report; I did not 
>> realize that the discussion was revolving around the fact that 
>> MPI_THREAD_MULTIPLE no longer works *at all*.
>> 
>> So here's where we are:
>> 
>> 1. MPI_THREAD_MULTIPLE doesn't work, even if you configure with 
>> --enable-mpi-thread-multiple
>> 
>> 2. It seems that 2c8087d10b10e0efea176db8907de2720a55454e and 
>> 09b867374e9618007b81bfaf674ec6df04548bed need to be reverted (in that order)
>> 
>> 3. That restores MPI_THREAD_MULTIPLE functionality (if you configure with 
>> --enable-mpi-thread-multiple)
>> 
>> 4. However, this brings back the performance problem, too
>> 
>> I've looked at this all day so far, and am unfortunately just out of time -- 
>> I have some crushing SC deadlines that I *must* meet.  :-(
>> 
>> Nathan will be picking up where I left off later today to see if there's a 
>> simple way to fix just the performance issue for the non-THREAD_MULTIPLE 
>> cases.
>> 
>> 
>> 
>> On Nov 4, 2014, at 12:15 PM, Alina Sklarevich <ali...@dev.mellanox.co.il> 
>> wrote:
>> 
>>> Hi,
>>> 
>>> We observe a hang when running the multi-threading support test "latency.c" 
>>> (attached to this report), which uses MPI_THREAD_MULTIPLE.
>>> 
>>> The hang happens immediately at the beginning of the test and is reproduced 
>>> in the v1.8 release branch.
>>> 
>>> The command line to reproduce the behavior is:
>>> 
>>> $ mpirun --map-by node --bind-to core -display-map -np 2 -mca pml ob1 -mca btl tcp,self ./thread-tests-1.1/latency
>>> 
>>> The last commit with which the hang doesn't reproduce is:
>>> commit: e4d4266d9c69e
>>> 
>>> And problems begin after commit:
>>> 
>>> commit 09b867374e9618007b81bfaf674ec6df04548bed
>>> Author: Jeff Squyres <jsquy...@cisco.com>
>>> Date:   Fri Oct 31 12:42:50 2014 -0700
>>> 
>>>    Revert most of open-mpi/ompi@6ef938de3fa9ca0fed2c5bcb0736f65b0d8803af
>>> 
>>> Is this expected behavior? In other words, should we not expect any stable 
>>> release in the 1.8.x series to be able to use MPI_THREAD_MULTIPLE with even 
>>> the TCP and SM BTLs?
>>> 
>>> Please advise.
>>> 
>>> Thanks,
>>> Alina.
>>> 
>>> <latency.c>_______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/devel/2014/11/16175.php
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
