On Apr 10, 2012, at 7:51 AM, TERRY DONTJE wrote:

> Fair enough, sorry about the false report.

No problem - it's a good reminder to all that we changed this policy.
Previously, we allowed oversubscription by default, even on managed systems.
This raised significant concerns from sysadmins who manage multi-tenant
(i.e., shared-node) systems, as it caused obvious problems. So we now respect
allocations from managed systems unless directed otherwise.

MTT setups probably require adjustment for tests like loop_spawn.
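
For illustration, the policy described above can be sketched roughly as
follows. This is not ORTE's actual code: the function and parameter names are
hypothetical, and the unmanaged-system branch assumes the earlier default
(oversubscription allowed) still applies there, which the wording above
implies but does not state outright.

```python
def can_launch(num_procs, allocated_slots, managed, oversubscribe=False):
    """Hypothetical sketch of the slot-limit policy; not ORTE's real logic.

    num_procs       -- total processes the job wants to run (e.g. parent + spawned children)
    allocated_slots -- slots granted by the resource manager or hostfile
    managed         -- True when running under a resource manager's allocation
    oversubscribe   -- True when the user explicitly asked to oversubscribe
    """
    if num_procs <= allocated_slots:
        # Fits within the allocation: always fine.
        return True
    if oversubscribe:
        # The user directed us otherwise, so exceeding the allocation is okay.
        return True
    # Assumption: only managed systems enforce the limit by default.
    return not managed


# A loop_spawn-like case: more children than allocated slots on a managed system.
print(can_launch(num_procs=8, allocated_slots=4, managed=True))
print(can_launch(num_procs=8, allocated_slots=4, managed=True, oversubscribe=True))
```

The first call models the MTT failure (the limit is enforced), and the second
models the same run with oversubscription explicitly requested.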

> 
> I sent you email about the other failures (final and MPI_Errhandler).
> 
> --td
> 
> On 4/10/2012 9:40 AM, Ralph Castain wrote:
>> 
>> I looked closer at the MTT output, Terry, and loop_spawn is actually 
>> behaving correctly. The problem is that (a) the test creates more children 
>> than allocated slots, and (b) the tests are being executed in a managed 
>> environment, and so we enforce the slot limit. The solution is to set the 
>> --oversubscribe flag so that ORTE knows it is okay to run more procs than 
>> allocated slots.
>> 
>> Set that and it will run just fine.
>> 
>> On Apr 10, 2012, at 4:44 AM, TERRY DONTJE wrote:
>> 
>>> Thanks Ralph, the comm_join issue seems to be fixed, but the other issues 
>>> mentioned still seem to persist.  I'll look at this later today unless 
>>> someone else decides to fix them :-).
>>> 
>>> --td
>>> 
>>> On 4/9/2012 6:45 PM, Ralph Castain wrote:
>>>> 
>>>> Should all be fixed now.
>>>> 
>>>> On Apr 9, 2012, at 7:17 AM, TERRY DONTJE wrote:
>>>> 
>>>>> After looking at Oracle's MTT results, there seem to be one or more 
>>>>> regressions between r26240 and r26249 detected by the ibm and intel test 
>>>>> suites.  An example of this is the failures in the comm_join, final and 
>>>>> loop_spawn tests of the ibm test suite, as seen in 
>>>>> http://www.open-mpi.org/mtt/index.php?do_redir=2055.
>>>>> 
>>>>> Note, I've seen similar errors detected by IU runs too.
>>>>> 
>>>>> I'll look further into this but I thought I would post this just in case 
>>>>> someone else has seen this.
>>>>> -- 
>>>>> Terry D. Dontje | Principal Software Engineer
>>>>> Developer Tools Engineering | +1.781.442.2631
>>>>> Oracle - Performance Technologies
>>>>> 95 Network Drive, Burlington, MA 01803
>>>>> Email terry.don...@oracle.com
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel