Re: [OMPI devel] direct launch problem with master

2016-10-31 Thread r...@open-mpi.org
Fixed in PR https://github.com/open-mpi/ompi/pull/2322 


> On Oct 31, 2016, at 1:20 AM, r...@open-mpi.org wrote:
> 
> I should hope bisecting would be a last resort. The simplest interim solution 
> is to set OMPI_MCA_routed=direct in your environment.
> 
> I’ll take a look at a more permanent solution in the morning.
> 
>> On Oct 30, 2016, at 6:33 PM, Pritchard Jr., Howard wrote:
>> 
>> Hi Folks,
>> 
>> While trying to solve a different problem, I optimistically tried to use
>> head-of-master to work on that problem.  Now I’ve found a new problem
>> with master when trying to do a direct launch with SLURM, srun:
>> 
>> [nid00012:09456] [[27960,0],0] ERROR: Failed to identify the local daemon's URI
>> [nid00012:09456] [[27960,0],0] ERROR: This is a fatal condition when the radix router
>> [nid00012:09456] [[27960,0],0] ERROR: has been selected - either select the unity router
>> [nid00012:09456] [[27960,0],0] ERROR: or ensure that the local daemon info is provided
>> [nid00012:09456] [[27960,0],0] ORTE_ERROR_LOG: Fatal in file base/ess_base_std_app.c at line 194
>> -
>> 
>> Any ideas what’s going on here?  With my configure, pmix/s1 is selected as
>> the pmix component, but death follows soon thereafter.  I’m confused why,
>> with direct launch, Open MPI should care at all about what the local
>> daemon’s URI is.
>> 
>> Is there a way to avoid this problem when using direct launch?  I would do
>> a git bisect, but I’ve no time for such activities at the moment.
>> 
>> Thanks for any suggestions,
>> 
>> Howard
>> 
>> -- 
>> Howard Pritchard
>> HPC-DES
>> Los Alamos National Laboratory
>> 
>> ___
>> devel mailing list
>> devel@lists.open-mpi.org 
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
> 


Re: [OMPI devel] direct launch problem with master

2016-10-31 Thread r...@open-mpi.org
I should hope bisecting would be a last resort. The simplest interim solution 
is to set OMPI_MCA_routed=direct in your environment.
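
For reference, a minimal sketch of that workaround in a POSIX shell. The srun task count and the binary name are hypothetical placeholders, not taken from this thread, so that line is left commented out:

```shell
# Interim workaround from this thread: select the "direct" routed component
# through the MCA environment variable before launching.
export OMPI_MCA_routed=direct

# Then direct-launch with srun as usual. The task count and binary name
# below are hypothetical placeholders; adjust them to your job:
# srun -n 4 ./my_mpi_app
```

Exporting the variable in the launching shell should be enough for srun to pass it on to the application processes, assuming the site has not restricted environment propagation.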

I’ll take a look at a more permanent solution in the morning.

> On Oct 30, 2016, at 6:33 PM, Pritchard Jr., Howard wrote:
> 
> Hi Folks,
> 
> While trying to solve a different problem, I optimistically tried to use
> head-of-master to work on that problem.  Now I’ve found a new problem
> with master when trying to do a direct launch with SLURM, srun:
> 
> [nid00012:09456] [[27960,0],0] ERROR: Failed to identify the local daemon's URI
> [nid00012:09456] [[27960,0],0] ERROR: This is a fatal condition when the radix router
> [nid00012:09456] [[27960,0],0] ERROR: has been selected - either select the unity router
> [nid00012:09456] [[27960,0],0] ERROR: or ensure that the local daemon info is provided
> [nid00012:09456] [[27960,0],0] ORTE_ERROR_LOG: Fatal in file base/ess_base_std_app.c at line 194
> -
> 
> Any ideas what’s going on here?  With my configure, pmix/s1 is selected as
> the pmix component, but death follows soon thereafter.  I’m confused why,
> with direct launch, Open MPI should care at all about what the local
> daemon’s URI is.
> 
> Is there a way to avoid this problem when using direct launch?  I would do
> a git bisect, but I’ve no time for such activities at the moment.
> 
> Thanks for any suggestions,
> 
> Howard
> 
> -- 
> Howard Pritchard
> HPC-DES
> Los Alamos National Laboratory
> 
