Re: [OMPI users] seg fault with intel compiler

2012-06-05 Thread Edmund Sumbar
First of all, thanks to everyone who took the trouble to offer suggestions. The solution seems to be to upgrade the Intel compilers. However, I'm not the cluster admin, so other crucial changes may have been implemented. For example, I know that ssh was reconfigured over the weekend (but that shouldn

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Gus Correa
On 06/01/2012 05:06 PM, Edmund Sumbar wrote: Thanks for the tips Gus. I'll definitely try some of these, particularly the nodes:ppn syntax, and report back. You can check for torque support with mpicc --showme It should show among other things -ltorque [if it has torque support] and -lrdmacm
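The check Gus describes (look for -ltorque in the wrapper's output) can be automated. The following is a minimal Python sketch, not something posted in the thread. The helper names `linked_libraries` and `has_library` are hypothetical, and the sample command line is made up for illustration; it is not real `mpicc --showme` output from this cluster.

```python
import shlex

def linked_libraries(showme_output):
    """Return the set of library names passed as -l<name> on a compiler command line."""
    return {
        token[2:]
        for token in shlex.split(showme_output)
        if token.startswith("-l") and len(token) > 2
    }

def has_library(showme_output, name):
    """True if -l<name> appears on the wrapper's underlying command line."""
    return name in linked_libraries(showme_output)

# Made-up sample line (an assumption, NOT actual output from the thread).
sample = "icc -I/usr/include -L/usr/lib -lmpi -ltorque -lrdmacm"
print(has_library(sample, "torque"))
```

To use it on a real installation, the text printed by `mpicc --showme` would be passed in place of `sample`, for instance via `subprocess.run(["mpicc", "--showme"], capture_output=True, text=True).stdout`.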

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Edmund Sumbar
Thanks for the tips Gus. I'll definitely try some of these, particularly the nodes:ppn syntax, and report back. Right now, I'm upgrading the Intel Compilers and rebuilding Open MPI. On Fri, Jun 1, 2012 at 2:39 PM, Gus Correa wrote: > The [Torque/PBS] syntax '-l procs=48' is somewhat troublesom

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Gus Correa
Hi Edmund The [Torque/PBS] syntax '-l procs=48' is somewhat troublesome, and may not be understood by the scheduler [It doesn't work correctly with Maui, which is what we have here. I read people saying it works with pbs_sched and with Moab, but that's hearsay.] This issue comes back very often
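For illustration, a total process count can be rewritten in the nodes:ppn form mentioned elsewhere in the thread. This is a minimal Python sketch under two assumptions that the thread does not state: every node has the same number of cores, and whole nodes are requested. The function name `nodes_ppn` and the value `cores_per_node=12` are hypothetical.

```python
def nodes_ppn(procs, cores_per_node):
    """Translate a total process count into a Torque 'nodes=N:ppn=M' string.

    Rounds up to whole nodes, so the request may cover more cores than `procs`.
    """
    if procs < 1 or cores_per_node < 1:
        raise ValueError("procs and cores_per_node must be positive")
    nodes = -(-procs // cores_per_node)  # ceiling division
    ppn = min(procs, cores_per_node)     # never ask for more per node than needed
    return f"nodes={nodes}:ppn={ppn}"

# cores_per_node=12 is a hypothetical value, not taken from the thread.
print(nodes_ppn(48, 12))  # prints nodes=4:ppn=12
```

The resulting string would then replace the procs request, for example `qsub -l nodes=4:ppn=12 job.pbs`.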

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Edmund Sumbar
On Fri, Jun 1, 2012 at 8:09 AM, Jeff Squyres wrote: > It's been a long time since I've run under PBS, so I don't remember if > your script's environment is copied out to the remote nodes where your > application actually runs. > > Can you verify that PATH and LD_LIBRARY_PATH are the same on al

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Jeff Squyres
On Jun 1, 2012, at 10:03 AM, Edmund Sumbar wrote: > I ran the following PBS script with "qsub -l procs=128 job.pbs". Environment > variables are set using the Environment Modules packages. > > echo $HOSTNAME > which mpiexec > module load library/openmpi/1.6-intel This *may* be the problem here.

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Edmund Sumbar
On Fri, Jun 1, 2012 at 5:00 AM, Jeff Squyres wrote: > Try running: > > which mpirun > ssh cl2n022 which mpirun > ssh cl2n010 which mpirun > > and > > ldd your_mpi_executable > ssh cl2n022 which mpirun > ssh cl2n010 which mpirun > > Compare the results and ensure that you're finding the same mpiru

Re: [OMPI users] seg fault with intel compiler

2012-06-01 Thread Jeff Squyres
Try running:
    which mpirun
    ssh cl2n022 which mpirun
    ssh cl2n010 which mpirun
and
    ldd your_mpi_executable
    ssh cl2n022 which mpirun
    ssh cl2n010 which mpirun
Compare the results and ensure that you're finding the same mpirun on all nodes, and the same libmpi.so on all nodes. There may well be ano
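The comparison Jeff asks for can be scripted. The following is a minimal Python sketch, not something from the thread: it assumes passwordless ssh to the compute nodes, and the helper names (`run`, `collect`, `libmpi_path`, `find_mismatches`) are hypothetical. For the library check it assumes the intent is to run `ldd your_mpi_executable` on each node as well, and it compares only the resolved libmpi path, since the load addresses that `ldd` prints can differ between hosts.

```python
import subprocess

def run(cmd):
    """Run a command (list of arguments) and return its stripped stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.strip()

def collect(command, nodes):
    """Run `command` locally and on each node over ssh (assumes passwordless ssh).

    Returns {host: output}; the local result is stored under the key "local".
    """
    outputs = {"local": run(command)}
    for node in nodes:
        outputs[node] = run(["ssh", node] + command)
    return outputs

def libmpi_path(ldd_output):
    """Extract the resolved libmpi path from `ldd` output, or None if absent."""
    for line in ldd_output.splitlines():
        if "libmpi" in line and "=>" in line:
            # Typical shape: "libmpi.so.1 => /some/dir/libmpi.so.1 (0x...)"
            return line.split("=>", 1)[1].split("(")[0].strip()
    return None

def find_mismatches(outputs, reference="local"):
    """Return the sorted hosts whose value differs from the reference host's."""
    expected = outputs[reference]
    return sorted(host for host, value in outputs.items() if value != expected)
```

A possible use, with the node names from the thread: `find_mismatches(collect(["which", "mpirun"], ["cl2n022", "cl2n010"]))` returns an empty list when every host finds the same mpirun. For the library check, map `libmpi_path` over the values returned by `collect(["ldd", "your_mpi_executable"], ...)` before calling `find_mismatches`.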

Re: [OMPI users] seg fault with intel compiler

2012-05-31 Thread Edmund Sumbar
Thanks for the tip Jeff, I wish it was that simple. Unfortunately, this is the only version installed. When I added --prefix to the mpiexec command line, I still got a seg fault, but without the backtrace. Oh well, I'll keep trying (compiler upgrade etc). [cl2n022:03057] *** Process received sign

Re: [OMPI users] seg fault with intel compiler

2012-05-31 Thread Jeff Squyres
This type of error usually means that you are inadvertently mixing versions of Open MPI (e.g., version A.B.C on one node and D.E.F on another node). Ensure that your paths are set up consistently and that you're getting both the same OMPI tools in your $path and the same libmpi.so in your $LD_LIB

[OMPI users] seg fault with intel compiler

2012-05-31 Thread Edmund Sumbar
Hi, I feel like a dope. I can't seem to successfully run the following simple test program (from Intel MPI distro) as a Torque batch job on a CentOS 5.7 cluster with Open MPI 1.6 compiled using Intel compilers 12.1.0.233. If I comment out MPI_Get_processor_name(), it works. #include "mpi.h" #in