First of all, thanks to everyone who took the trouble to offer suggestions.
The solution seems to be to upgrade the Intel compilers. However, I'm not
the cluster admin, so other crucial changes may have been implemented. For
example, I know that ssh was reconfigured over the weekend (but that
shouldn't matter).
On 06/01/2012 05:06 PM, Edmund Sumbar wrote:
> Thanks for the tips, Gus. I'll definitely try some of these, particularly
> the nodes:ppn syntax, and report back.
You can check for Torque support with
mpicc --showme
It should show, among other things, -ltorque [if it
has Torque support] and -lrdmacm.
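For illustration, one way to filter the wrapper's link line for those
libraries (a sketch assuming a POSIX shell; the library names are the ones
mentioned above):

# Print the wrapper's command line one token per line and keep
# only the Torque and RDMA CM link flags
mpicc --showme | tr ' ' '\n' | grep -E 'torque|rdmacm'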
Thanks for the tips, Gus. I'll definitely try some of these, particularly
the nodes:ppn syntax, and report back.
Right now, I'm upgrading the Intel Compilers and rebuilding Open MPI.
On Fri, Jun 1, 2012 at 2:39 PM, Gus Correa wrote:
> The [Torque/PBS] syntax '-l procs=48' is somewhat troublesome, [...]
Hi Edmund
The [Torque/PBS] syntax '-l procs=48' is somewhat troublesome,
and may not be understood by the scheduler [It doesn't
work correctly with Maui, which is what we have here. I read
people saying it works with pbs_sched and with Moab,
but that's hearsay.]
This issue comes up very often.
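For a concrete example of the nodes:ppn form (a sketch; the 6x8 split is
illustrative and should match the actual cores per node on your cluster):

# Request 48 slots as 6 nodes with 8 cores each,
# instead of the scheduler-dependent -l procs=48
qsub -l nodes=6:ppn=8 job.pbs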
On Fri, Jun 1, 2012 at 8:09 AM, Jeff Squyres wrote:
> It's been a long time since I've run under PBS, so I don't remember if
> your script's environment is copied out to the remote nodes where your
> application actually runs.
>
> Can you verify that PATH and LD_LIBRARY_PATH are the same on all nodes?
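One way to check (a sketch; the node names are the ones appearing elsewhere
in this thread, and note that non-interactive ssh shells may skip your login
files, which is often exactly where the difference comes from):

echo "$PATH"; echo "$LD_LIBRARY_PATH"
ssh cl2n022 'echo $PATH; echo $LD_LIBRARY_PATH'
ssh cl2n010 'echo $PATH; echo $LD_LIBRARY_PATH'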
On Jun 1, 2012, at 10:03 AM, Edmund Sumbar wrote:
> I ran the following PBS script with "qsub -l procs=128 job.pbs". Environment
> variables are set using the Environment Modules package.
>
> echo $HOSTNAME
> which mpiexec
> module load library/openmpi/1.6-intel
This *may* be the problem here.
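If the issue is that the module is loaded only inside the script, and after
mpiexec has already been located, a minimal reordered script might look like
this (a sketch; only the module name comes from the quoted script, the
resource request and program name are illustrative):

#!/bin/bash
#PBS -l nodes=2:ppn=4
# Load the MPI environment before locating or invoking mpiexec
module load library/openmpi/1.6-intel
echo $HOSTNAME
which mpiexec
mpiexec ./your_mpi_executable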
On Fri, Jun 1, 2012 at 5:00 AM, Jeff Squyres wrote:
> Try running: [...]
Try running:
which mpirun
ssh cl2n022 which mpirun
ssh cl2n010 which mpirun
and
ldd your_mpi_executable
ssh cl2n022 ldd your_mpi_executable
ssh cl2n010 ldd your_mpi_executable
Compare the results and ensure that you're finding the same mpirun on all
nodes, and the same libmpi.so on all nodes. There may well be another
version installed somewhere.
Thanks for the tip, Jeff.
I wish it were that simple. Unfortunately, this is the only version
installed. When I added --prefix to the mpiexec command line, I still got a
seg fault, but without the backtrace. Oh well, I'll keep trying (compiler
upgrade, etc.).
[cl2n022:03057] *** Process received signal ***
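For reference, the --prefix option mentioned above forces every node to use
one particular Open MPI installation; roughly (a sketch, with an illustrative
install path and process count):

# Point all nodes at the same Open MPI tree, sidestepping PATH issues
mpiexec --prefix /opt/openmpi/1.6-intel -n 48 ./test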
This type of error usually means that you are inadvertently mixing versions of
Open MPI (e.g., version A.B.C on one node and D.E.F on another node).
Ensure that your paths are set up consistently and that you're getting both the
same OMPI tools in your $PATH and the same libmpi.so in your $LD_LIBRARY_PATH.
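A quick cross-check for that kind of mismatch (a sketch reusing the node
names from this thread; the ssh calls are subject to the same PATH caveats
discussed above):

# All of these should report the same Open MPI version
mpirun --version
ssh cl2n022 mpirun --version
ssh cl2n010 mpirun --version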
Hi,
I feel like a dope. I can't seem to successfully run the following simple
test program (from the Intel MPI distro) as a Torque batch job on a CentOS 5.7
cluster with Open MPI 1.6 compiled using Intel compilers 12.1.0.233.
If I comment out MPI_Get_processor_name(), it works.
#include "mpi.h"
#include <stdio.h>
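For completeness, the usual way to build and smoke-test such a program
(a sketch; the file name test.c and the small interactive run are
illustrative):

# Build with the same Open MPI wrapper used for the real application
mpicc test.c -o test
# A small run outside Torque can help separate batch-environment
# problems from MPI problems
mpiexec -n 2 ./test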