[OMPI users] My MPI build is broke, don't know why/how

2012-08-23 Thread Jim Kusznir
Hi all: I recently rebuilt my cluster from Rocks 5 to Rocks 6 (which is based on CentOS 6.2) using the official spec file and my build options as before. It all built successfully and all appeared good. That is, until someone tried to use it. This is built with Torque integration, and it's run
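One quick sanity check on a rebuilt, Torque-enabled install is to ask ompi_info whether the tm components show up; this assumes the rebuilt install's ompi_info is the one on PATH:

    ompi_info | grep tm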

Re: [OMPI users] Building openmpi with PGI 11.4: won't find torque??

2011-05-03 Thread Jim Kusznir
"configure -h" >> >> >> On May 2, 2011, at 6:22 PM, Jim Kusznir wrote: >> >>> Hi all: >>> >>> I'm trying to build openmpi 1.4.3 against PGI 11.4 on my Rocks 5.1 >>> system. My "tried and true" build command for Ope

[OMPI users] Building openmpi with PGI 11.4: won't find torque??

2011-05-02 Thread Jim Kusznir
Hi all: I'm trying to build openmpi 1.4.3 against PGI 11.4 on my Rocks 5.1 system. My "tried and true" build command for OpenMPI is: CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 rpmbuild -bb --define 'install_in_opt 1' --define 'install_modulefile 1' --define 'modules_rpm_name environment-modules'
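A minimal sketch of that style of invocation with Torque support requested explicitly; the Torque prefix, the configure_options define, and the spec filename are assumptions here, not a confirmed working recipe:

    CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 rpmbuild -bb \
        --define 'install_in_opt 1' \
        --define 'install_modulefile 1' \
        --define 'modules_rpm_name environment-modules' \
        --define 'configure_options --with-tm=/opt/torque' \
        openmpi-1.4.3.spec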

Re: [OMPI users] Compile problems with 1.3.2

2010-11-10 Thread Jim Kusznir
0' openmpi-1.4.3.spec --Jim On Mon, Jun 29, 2009 at 4:24 PM, Jeff Squyres <jsquy...@cisco.com> wrote: > On Jun 29, 2009, at 7:18 PM, Jim Kusznir wrote: > >> That sounds good; I'm glad there are a variety of tools out there. >> >> However, this now brings me ba

Re: [OMPI users] Building OpenMPI 1.5.x

2010-11-02 Thread Jim Kusznir
e an open issue about exactly this with Red Hat.  I am awaiting guidance > from them to know how to fix it. > >    https://svn.open-mpi.org/trac/ompi/ticket/2611 > > The only workaround for the moment is to build from tarball, not RPM. > > > On Nov 2, 2010, at 12:47 PM,

[OMPI users] Building OpenMPI 1.5.x

2010-11-02 Thread Jim Kusznir
Hi all: I finally decided to rebuild openmpi on my cluster (last built when 1.3.2 was current). I have a ROCKS cluster, so I need to build RPMs to install across the cluster during rebuilds. Previously, I did so with the following command: rpmbuild -bb --define 'install_in_opt 1' --define

Re: [OMPI users] OpenMPI 1.4 RPM Spec file problem

2009-12-09 Thread Jim Kusznir
By the way, if I set build_all_in_one_rpm to 1, it works fine... --Jim On Wed, Dec 9, 2009 at 1:47 PM, Jim Kusznir <jkusz...@gmail.com> wrote: > Hi all: > > I'm trying to build openmpi-1.4 rpms using my normal (complex) rpm > build commands, but it's failing.  I'm running into t

[OMPI users] OpenMPI 1.4 RPM Spec file problem

2009-12-09 Thread Jim Kusznir
Hi all: I'm trying to build openmpi-1.4 rpms using my normal (complex) rpm build commands, but it's failing. I'm running into two errors: One (on gcc only): the D_FORTIFY_SOURCE build failure. I've had to move the if test "$using_gcc" = 0; then line down to after the RPM_OPT_FLAGS= that
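A rough sketch of the reordering being described, with placeholder contents rather than the actual spec text:

    # assign RPM_OPT_FLAGS first...
    RPM_OPT_FLAGS="${RPM_OPT_FLAGS:-}"
    # ...then run the compiler check that previously sat above the assignment
    if test "$using_gcc" = 0; then
        echo "non-GCC compiler: adjust optimization flags here"
    fi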

Re: [OMPI users] Compile problems with 1.3.2

2009-06-29 Thread Jim Kusznir
in/orte-ps /opt/openmpi-gcc/1.3.2/etc/openmpi-default-hostfile /opt/openmpi-gcc/1.3.2/etc/openmpi-mca-params.conf /opt/openmpi-gcc/1.3.2/etc/openmpi-totalview.tcl Thanks! --Jim On Mon, Jun 29, 2009 at 2:28 PM, Eugene Loh<eugene@sun.com> wrote: > Jim Kusznir wrote: > >>

[OMPI users] Compile problems with 1.3.2

2009-06-29 Thread Jim Kusznir
Hi all: I'm trying to build and install openmpi-1.3.2 for my cluster using environment-modules. My build failed on something that I have no idea how to debug. Here's the relevant output: Making all in vtlib make[5]: Entering directory

Re: [OMPI users] OpenMPI 1.3.1 rpm build error

2009-02-23 Thread Jim Kusznir
m.sh > > It should build a trivial SRPM for you from the tarball. You'll likely need > to get the specfile, too, and put it in the same dir as buildrpm.sh. The > specfile is in the same SVN directory: > > > https://svn.open-mpi.org/source/xref/ompi_1.3/contrib/dist/linux/o
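A sketch of the workflow being suggested, assuming buildrpm.sh and openmpi.spec have been fetched from the SVN directory above into the directory holding the tarball (the tarball name is illustrative):

    ./buildrpm.sh openmpi-1.3.1.tar.bz2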

Re: [OMPI users] OpenMPI 1.3.1 rpm build error

2009-02-20 Thread Jim Kusznir
As long as I can still build the rpm for it and install it via rpm. I'm running it on a ROCKS cluster, so it needs to be an RPM to get pushed out to the compute nodes. --Jim On Fri, Feb 20, 2009 at 11:30 AM, Jeff Squyres <jsquy...@cisco.com> wrote: > On Feb 20, 2009, at 2:20 PM, Ji

Re: [OMPI users] OpenMPI 1.3.1 rpm build error

2009-02-20 Thread Jim Kusznir
d you try building one of the 1.3.1 nightly snapshot tarballs? I > *think* the problem you're seeing is a problem due to FORTIFY_SOURCE in the > VT code in 1.3 and should be fixed by now. > >http://www.open-mpi.org/nightly/v1.3/ > > > On Feb 19, 2009, at 12:00 PM, Jim Kuszn

Re: [OMPI users] [torqueusers] Job dies randomly, but only through torque

2008-05-29 Thread Jim Kusznir
run ("mpirun: killing job...") is *only* displayed if mpirun > receives a SIGINT or SIGTERM. So perhaps some other resource limit is > being reached...? > > Is there a way to have Torque log if it is killing a job for some > reason? > > > On May 27, 2008, at 7:0

Re: [OMPI users] [torqueusers] Job dies randomly, but only through torque

2008-05-27 Thread Jim Kusznir
Yep. Wall time is nowhere near violation (it dies about 2 minutes into a 30-minute allocation). I did a ulimit -a through qsub and directly on the node (as the same user in both cases), and the results were identical (most items were unlimited). Any other ideas? --Jim On Tue, May 27, 2008 at 9:25
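One way to make that comparison, with the node name and qsub options here only illustrative:

    echo 'ulimit -a' | qsub -l nodes=1 -j oe -o limits_via_torque.log   # through Torque
    ssh compute-0-0 'ulimit -a'                                         # directly on a node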

[OMPI users] Job dies randomly, but only through torque

2008-05-27 Thread Jim Kusznir
Hi all: I've got a problem with a user's MPI job. This code is in use on dozens of clusters around the world, but for some reason, when run on my Rocks 4.3 cluster, it dies at random timesteps. The logs are quite unhelpful: [root@aeolus logs]# more 2047.aeolus.OU Warning: no access to tty

Re: [OMPI users] OpenMPI+PGI errors

2008-05-27 Thread Jim Kusznir
), and the user assures me it's always dying well before the allowed walltime. Thanks! --Jim On Tue, May 20, 2008 at 1:23 PM, Jim Kusznir <jkusz...@gmail.com> wrote: > Hello all: > > I've got a user on our ROCKS 4.3 cluster that's having some strange > errors. I have other users

Re: [OMPI users] OpenMPI+PGI errors

2008-05-23 Thread Jim Kusznir
results. --Jim On Fri, May 23, 2008 at 11:54 AM, Jeff Squyres <jsquy...@cisco.com> wrote: > This may be a dumb question, but is there a chance that his job is > running beyond 30 minutes, and PBS/Torque/whatever is killing it? > > On May 20, 2008, at 4:23 PM, Jim Kusznir wro

Re: [OMPI users] More OpenMPI errors: how to debug?

2008-05-23 Thread Jim Kusznir
ked. I am thinking about making /opt a symlink across the cluster, but I'm not sure about all the implications therein... --Jim On Fri, May 23, 2008 at 12:07 PM, Jeff Squyres <jsquy...@cisco.com> wrote: > On May 22, 2008, at 12:52 PM, Jim Kusznir wrote: > >> I installed openmpi 1.

[OMPI users] More OpenMPI errors: how to debug?

2008-05-22 Thread Jim Kusznir
Hi all: I installed openmpi 1.2.6 on my system, but now my users are complaining about even more errors. I'm getting this: [compute-0-23.local:26164] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182

[OMPI users] OpenMPI+PGI errors

2008-05-20 Thread Jim Kusznir
Hello all: I've got a user on our ROCKS 4.3 cluster that's having some strange errors. I have other users using the cluster without any such errors reported, but this user also runs this code on other clusters without any problems, so I'm not really sure where the problem lies. They are getting

Re: [OMPI users] multi-compiler builds of OpenMPI (RPM)

2008-01-03 Thread Jim Kusznir
Thanks for the detailed responses! I've included some stuff inline below: On Jan 2, 2008 1:56 PM, Jeff Squyres <jsquy...@cisco.com> wrote: > On Dec 31, 2007, at 12:50 AM, Jim Kusznir wrote: > > The rpm build errored out near the end with a missing file. It was > > trying

[OMPI users] multi-compiler builds of OpenMPI (RPM)

2007-12-31 Thread Jim Kusznir
Hi all: I'm trying to set up a ROCKS cluster (CentOS 4.5) with OpenMPI and GCC, PGI, and Intel compilers. My understanding is that OpenMPI must be compiled with each compiler. The result (or at least, the runtime libs) must be in .rpm format, as that is required by ROCKS compute node deployment
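One common pattern for this, sketched with an assumed _name define and illustrative package/spec names (not necessarily the exact commands used here), is one rpmbuild per compiler so each toolchain gets its own package and install prefix for environment-modules to select:

    CC=gcc  CXX=g++  F77=gfortran FC=gfortran rpmbuild -bb --define '_name openmpi-gcc'   --define 'install_in_opt 1' openmpi.spec
    CC=pgcc CXX=pgCC F77=pgf77    FC=pgf90    rpmbuild -bb --define '_name openmpi-pgi'   --define 'install_in_opt 1' openmpi.spec
    CC=icc  CXX=icpc F77=ifort    FC=ifort    rpmbuild -bb --define '_name openmpi-intel' --define 'install_in_opt 1' openmpi.spec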

Re: [OMPI users] Suggestions on multi-compiler/multi-mpi build?

2007-11-30 Thread Jim Kusznir
t; then, I can send you some sample module scripts. I can also send you some > sample spec files for the rpms we use. > > > - Original Message - > From: "Jim Kusznir" <jkusz...@gmail.com> > To: <us...@open-mpi.org> > Sent: Thursday, November 15, 2007 11:5

Re: [OMPI users] Suggestions on multi-compiler/multi-mpi build?

2007-11-27 Thread Jim Kusznir
ebpage of software is built out of > cron from the module files: > http://cac.engin.umich.edu/resources/systems/nyxV2/software.html > > So we don't maintain software lists online, we just generate it > dynamically. > Modules is the admin's best friend. > > Brock Palen > C

Re: [OMPI users] Compiling OpenMPI for i386 on a x86_64

2007-10-19 Thread Jim Kusznir
t determine size of LOGICAL > > Is this correct? Are we feeding a 32-bit object file to be linked into > a 64-bit output executable? When the target is i386, shouldn't -m32 > -march=i386 be passed to gfortran as well in the above > instance, unless it's for negative testing? >

Re: [OMPI users] Compiling OpenMPI for i386 on a x86_64

2007-10-18 Thread Jim Kusznir
Attached is the requested info. There's not much here, though...it dies pretty early in. --Jim On 10/17/07, Jeff Squyres <jsquy...@cisco.com> wrote: > On Oct 17, 2007, at 12:35 PM, Jim Kusznir wrote: > > > checking if Fortran 90 compiler supports LOGICAL... yes > > che

[OMPI users] Compiling OpenMPI for i386 on a x86_64

2007-10-17 Thread Jim Kusznir
Hello: I'm trying to rebuild the CentOS OpenMPI rpm (to add torque support) on my x86_64 cluster. I was able to build 64-bit binaries fine, but CentOS wants the 32-bit libs and -devel portions installed as well for full compatibility. In this area, I'm running into trouble. When I try and
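One way to force a fully 32-bit build on an x86_64 host is to pass -m32 to all of the compiler front-ends, Fortran included; the prefix and Torque path below are illustrative, and this is a configure-from-tarball sketch rather than the CentOS spec's own logic:

    ./configure --prefix=/opt/openmpi32 --with-tm=/opt/torque \
        CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32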

Re: [OMPI users] OpenMPI and torque/maui -> crashing on MPI_Send()

2007-10-10 Thread Jim Kusznir
ng? > > > On Oct 4, 2007, at 8:36 PM, Jim Kusznir wrote: > > > Hi all: > > > > I'm having trouble getting torque/maui working with OpenMPI. > > > > Currently, I am getting hard failures when an MPI_Send is called. > > When > > run without qs