Re: [OMPI devel] orte_odls_base_default_launch_local()

2011-07-14 Thread Ralph Castain
Sorry for the delay - got tied up. This looks sane to me and should work. FWIW: I haven't seen any problem with comm_spawn_multiple. That code will only execute if the specific limit is hit, so I suspect that is an issue of scale and race conditions. On Jul 12, 2011, at 6:44 PM, Eugene Loh

Re: [OMPI devel] orte_odls_base_default_launch_local()

2011-07-12 Thread Ralph Castain
Well, yes and no. Yes - the value definitely needs to be computed in all cases since it is used later on. No - the way it is computed is only correct for that one usage. The later usage (in the block that starts with 0 < opal_sys_limits.num_procs)) needs a completely different value. The

[OMPI devel] orte_odls_base_default_launch_local()

2011-07-12 Thread Eugene Loh
The function orte_odls_base_default_launch_local() has a variable num_procs_alive that is basically initialized like this: if ( oversubscribed ) { ... } else { num_procs_alive = ...; } Specifically, if the "oversubscribed" test passes, the variable is not
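
For readers following the thread, here is a minimal C sketch of the pattern under discussion: a variable assigned in only one branch and then read later on both paths. It is not the ORTE source. The names child_t, count_alive, and procs_alive_for_launch, and the yield_when_idle flag, are hypothetical stand-ins invented for illustration. The sketch only shows the "compute it in all cases" half of the discussion; it does not attempt the different value that the later opal_sys_limits.num_procs block is said to need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a local child record; not the ORTE type. */
typedef struct {
    bool alive;
} child_t;

/* Count the children currently marked alive. */
static int count_alive(const child_t *children, size_t n)
{
    assert(children != NULL || n == 0);

    int alive = 0;
    for (size_t i = 0; i < n; ++i) {
        if (children[i].alive) {
            ++alive;
        }
    }
    return alive;
}

/*
 * Compute num_procs_alive before branching, so it is defined whether or
 * not the (assumed) oversubscribed test passes. The problematic shape
 * assigns it only in the else branch and reads it afterwards.
 */
static int procs_alive_for_launch(const child_t *children, size_t n,
                                  bool oversubscribed, bool *yield_when_idle)
{
    int num_procs_alive = count_alive(children, n);

    if (oversubscribed) {
        *yield_when_idle = true;   /* hypothetical branch-specific setting */
    } else {
        *yield_when_idle = false;
    }

    return num_procs_alive;        /* valid on both paths */
}
```

Calling procs_alive_for_launch() with the oversubscribed flag set or cleared returns the same count; only the branch-specific flag differs.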