From your ompi_info output, it looks like this is a Slurm system, correct? 
It shouldn't really matter, since we run fine on a head node without an 
allocation, but it is worth confirming.

The message indicates a failure of the modex: we are missing an expected 
piece of data. I don't see anything obvious as the source of the problem. 
It works fine for me on all my machines, including on the front end of a 
Slurm cluster.

The only possibly relevant thing I see is that this was built with PGI. Any 
chance you could try a gcc-based build? All my tests are done with gcc, so 
I'm wondering if PGI is the source of the trouble here.


On Jan 9, 2014, at 6:17 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I've now seen this same failure mode on another Linux system.
> I forgot to mention before that the job is hung after issuing the error 
> message.
> Singleton runs fail in the same manner.
> 
> Both are front-end machines and perhaps that is related to this failure; for 
> instance expecting an allocation because of the batch system detected at 
> configure time.  However, I would have expected a more informative error 
> message for that case.
> 
> -Paul
> 
> 
> On Thu, Jan 9, 2014 at 5:03 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> Trying to run on the front-end of one of our production Linux systems I see 
> the following:
> 
> $ mpirun -mca btl sm,self -np 2 examples/ring_c
> [cvrsvc01:17692] [[42051,1],0] ORTE_ERROR_LOG: Data for specified key not 
> found in file 
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pgi-12.8/openmpi-1.7.4rc2r30168/orte/runtime/orte_globals.c
>  at line 505
> [cvrsvc01:17693] [[42051,1],1] ORTE_ERROR_LOG: Data for specified key not 
> found in file 
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pgi-12.8/openmpi-1.7.4rc2r30168/orte/runtime/orte_globals.c
>  at line 505
> 
> The "ompi_info --all" output is attached.
> 
> Please let me know what MCA param(s) to set to collect any additional info 
> needed to track down the problem.
> 
> -Paul
> 
> 
> -- 
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
