Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

2009-05-12 Thread Ralph Castain
Okay, I fixed this today too (r21219). On May 11, 2009, at 11:27 PM, Anton Starikov wrote: Now there is another problem :) You can oversubscribe a node, at least by one task. If your hostfile and rank file limit you to N procs, you can ask mpirun for N+1 and it will not be rejected.

Re: [OMPI users] strange bug

2009-05-12 Thread Anton Starikov
I will try to prepare a test case. -- Anton Starikov. On May 12, 2009, at 6:57 PM, Edgar Gabriel wrote: hm, so I am out of ideas. I created multiple variants of test-programs which did what you basically described, and they all passed and did not generate problems. I compiled the MUMPS

Re: [OMPI users] strange bug

2009-05-12 Thread Edgar Gabriel
hm, so I am out of ideas. I created multiple variants of test-programs which did what you basically described, and they all passed and did not generate problems. I compiled the MUMPS library and ran the tests that they have in the examples directory, and they all worked. Additionally, I

Re: [OMPI users] mpirun fails on remote applications

2009-05-12 Thread Micha Feigin
It is usually best to separate the cluster (mpi) interfaces from the internet interface. Usually on a dedicated cluster it is best to have a master node that is connected to the internet and client nodes that are connected to the master node (and if needed tunnel the connection through it to the
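
A minimal sketch of that layout on the master node, assuming eth0 faces the internet and eth1 faces the cluster (interface names and subnet are assumptions, not from the thread):

  # let the master node forward/NAT traffic for the compute nodes
  echo 1 > /proc/sys/net/ipv4/ip_forward
  iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
  # keep MPI traffic on the cluster-only interface by listing only the
  # internal hostnames (those resolving to eth1 addresses) in the hostfile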

[OMPI users] Problem installing Dalton with OpenMPI over PelicanHPC

2009-05-12 Thread Silviu Groza
Dear all, I am trying to install the Dalton quantum chemistry program with OpenMPI over PelicanHPC, but it ends with an error. PelicanHPC comes with both LAM and OpenMPI preinstalled. The version of OpenMPI is "1.2.7rc2" (OMPI_VERSION from version.h). The wrappers that I use are mpif77.openmpi and

Re: [OMPI users] Bug in return status of MPI_WAIT()?

2009-05-12 Thread Katz, Jacob
Ah... Thanks, Jeff. If the standard explicitly mentioned that MPI::ERRORS_RETURN is useless with the C++ bindings, life would be a little easier... Jacob M. Katz | jacob.k...@intel.com | Work: +972-4-865-5726 | iNet: (8)-465-5726

Re: [OMPI users] strange bug

2009-05-12 Thread Edgar Gabriel
I would say the probability is large that it is due to the recent 'fix'. I will try to create a test case similar to what you suggested. Could you maybe give us some hints on which functionality of MUMPS you are using, or even share the code or a code fragment? Thanks Edgar Jeff Squyres wrote:

Re: [OMPI users] mpirun fails on remote applications

2009-05-12 Thread Jeff Squyres
Open MPI requires that each MPI process be able to connect to any other MPI process in the same job on random TCP ports. It is usually easiest to leave the firewall off, or to set up trust relationships between your cluster nodes. On May 12, 2009, at 6:04 AM, feng chen wrote: thanks a
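
A minimal sketch of the trust-relationship approach with iptables, assuming the cluster nodes sit on a private 192.168.0.0/24 subnet behind eth1 (subnet and interface name are assumptions):

  # accept everything arriving from other cluster nodes, so the randomly
  # chosen TCP ports used between MPI processes are never blocked
  iptables -A INPUT -i eth1 -s 192.168.0.0/24 -j ACCEPT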

Re: [OMPI users] Bug in return status of MPI_WAIT()?

2009-05-12 Thread Jeff Squyres
On May 12, 2009, at 9:37 AM, Jeff Squyres wrote: 2. The MPI_ERROR field in the status is specifically *not* set for MPI_TEST and MPI_WAIT. It *is* set for the multi-test/wait functions (e.g., MPI_TESTANY, MPI_WAITALL). Oops! Typo -- I should have said "(e.g., MPI_TESTALL, MPI_WAITALL)".
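
A short C sketch of the distinction being described; the two-rank exchange, tag, and buffer are illustrative, not taken from the thread. Run it with "mpirun -np 2 ./a.out".

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, peer, recvbuf, i, rc;
      MPI_Request reqs[2];
      MPI_Status  stats[2];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      peer = (rank == 0) ? 1 : 0;  /* assumes exactly 2 ranks */

      /* have errors returned instead of aborting the job */
      MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

      MPI_Irecv(&recvbuf, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[0]);
      MPI_Isend(&rank,    1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[1]);

      /* MPI_Waitall is a multi-completion call: if it returns
         MPI_ERR_IN_STATUS, each status's MPI_ERROR field is valid */
      rc = MPI_Waitall(2, reqs, stats);
      if (rc == MPI_ERR_IN_STATUS) {
          for (i = 0; i < 2; i++)
              if (stats[i].MPI_ERROR != MPI_SUCCESS)
                  fprintf(stderr, "request %d failed with error %d\n",
                          i, stats[i].MPI_ERROR);
      }

      /* after a single-completion MPI_Wait/MPI_Test, by contrast, the
         standard does not require status.MPI_ERROR to be set, so a test
         that inspects it there is checking an undefined value */

      MPI_Finalize();
      return 0;
  }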

Re: [OMPI users] Bug in return status of MPI_WAIT()?

2009-05-12 Thread Jeff Squyres
Greetings Jacob; sorry for the slow reply. This is pretty subtle, but I think that your test is incorrect (I remember arguing about this a long time ago and eventually having another OMPI developer prove me wrong! :-) ). 1. You're setting MPI_ERRORS_RETURN, which, if you're using the C++

Re: [OMPI users] New warning messages in 1.3.2 (not present in1.2.8)

2009-05-12 Thread Matthieu Brucher
Thanks a lot for this. I've just checked everything again and recompiled my code as well (I'm using SCons, so it detects that the headers and the libraries changed), and it works without a warning. Matthieu 2009/5/12 Jeff Squyres : > On May 12, 2009, at 8:17 AM, Matthieu

Re: [OMPI users] New warning messages in 1.3.2 (not present in1.2.8)

2009-05-12 Thread Matthieu Brucher
2009/5/12 Jeff Squyres : > Or it could be that you installed 1.3.2 over 1.2.8 -- some of the 1.2.8 > components that no longer exist in the 1.3 series are still in the > installation tree, but failed to open properly (unfortunately, libltdl gives > an incorrect "file not found"

Re: [OMPI users] New warning messages in 1.3.2 (not present in1.2.8)

2009-05-12 Thread Jeff Squyres
Or it could be that you installed 1.3.2 over 1.2.8 -- some of the 1.2.8 components that no longer exist in the 1.3 series are still in the installation tree, but failed to open properly (unfortunately, libltdl gives an incorrect "file not found" error message if it is unable to load a
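
A minimal sketch of the usual remedy, assuming the old 1.2.8 tree lives under /opt/openmpi (the prefix path is hypothetical):

  # wipe the stale install tree (or pick a fresh prefix) before installing 1.3.2
  rm -rf /opt/openmpi
  ./configure --prefix=/opt/openmpi && make all install
  # then rebuild the application against the new wrappers (mpicc/mpif77)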

Re: [OMPI users] New warning messages in 1.3.2 (not present in 1.2.8)

2009-05-12 Thread Ralph Castain
Looking at this output, I would say that the problem is you didn't recompile your code against 1.3.2. These are warnings about attempts to open components that were present in 1.2.8, but no longer exist in the 1.3.x series. On May 12, 2009, at 2:30 AM, Matthieu Brucher wrote: Hi, I've

Re: [OMPI users] Torque 2.2.1 problem with OpenMPI 1.2.5

2009-05-12 Thread Ralph Castain
The 1.2.x series has a bug in it when used with Torque. Simply do not include -machinefile on your mpiexec cmd line and it should work fine. It will automatically pickup the PBS_NODEFILE contents. On May 12, 2009, at 1:17 AM, wrote: Hi, I am using OFED 1.3
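
A minimal sketch of a Torque submission script along those lines; the node counts and application name are hypothetical:

  #!/bin/sh
  #PBS -l nodes=2:ppn=4
  cd $PBS_O_WORKDIR
  # no -machinefile: Open MPI's Torque support picks up $PBS_NODEFILE itself
  mpirun -np 8 ./my_mpi_app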

Re: [OMPI users] strange bug

2009-05-12 Thread Jeff Squyres
Can you send all the information listed here: http://www.open-mpi.org/community/help/ On May 11, 2009, at 10:03 PM, Anton Starikov wrote: By the way, this is Fortran code, which uses the F77 bindings. -- Anton Starikov. On May 12, 2009, at 3:06 AM, Anton Starikov wrote: > Due to

Re: [OMPI users] mpirun fails on remote applications

2009-05-12 Thread feng chen
Thanks a lot, firewall it is. It works with the firewall off, but that brings another question from me: is there any way we can run mpirun while the firewall is on? If yes, how do we set up the firewall or iptables? thank you From: Micha Feigin

Re: [OMPI users] mpirun fails on remote applications

2009-05-12 Thread Micha Feigin
On Tue, 12 May 2009 11:54:57 +0300 Lenny Verkhovsky wrote: > sounds like firewall problems to or from anfield04. > Lenny, > > On Tue, May 12, 2009 at 8:18 AM, feng chen wrote: > I'm having a similar problem, not sure if it's related (gave up for

Re: [OMPI users] mpirun fails on remote applications

2009-05-12 Thread Lenny Verkhovsky
sounds like firewall problems to or from anfield04. Lenny, On Tue, May 12, 2009 at 8:18 AM, feng chen wrote: > hi all, > > First of all,i'm new to openmpi. So i don't know much about mpi setting. > That's why i'm following manual and FAQ suggestions from the beginning. >

[OMPI users] New warning messages in 1.3.2 (not present in 1.2.8)

2009-05-12 Thread Matthieu Brucher
Hi, I've managed to use 1.3.2 (still not with LSF and InfiniPath, I start one step after another), but I have additional warnings that didn't show up in 1.2.8: [host-b:09180] mca: base: component_find: unable to open /home/brucher/lib/openmpi/mca_ras_dash_host: file not found (ignored)

[OMPI users] Torque 2.2.1 problem with OpenMPI 1.2.5

2009-05-12 Thread ansul.srivastava1
Hi, I am using OFED 1.3, in which OpenMPI 1.2.5 is included. I have compiled with Intel and gcc. The problem is that during qsub I am not able to run the jobs, but at the same time when I use the mpiexec command it works fine without any issue. Here is my script file; please help me to

Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

2009-05-12 Thread Anton Starikov
Now there is another problem :) You can oversubscribe a node, at least by one task. If your hostfile and rank file limit you to N procs, you can ask mpirun for N+1 and it will not be rejected, although in reality there will be N tasks. So, if your hostfile limit is 4, then "mpirun -np 4" and
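
A minimal sketch of the reported behavior, with hypothetical hostnames and a 4-slot limit:

  # hostfile: 4 slots in total
  node01 slots=2
  node02 slots=2

  # rankfile: ranks 0-3 pinned explicitly
  rank 0=node01 slot=0
  rank 1=node01 slot=1
  rank 2=node02 slot=0
  rank 3=node02 slot=1

  # accepted even though it asks for one rank more than the files allow
  mpirun -np 5 -hostfile ./hostfile -rf ./rankfile ./a.out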

[OMPI users] mpirun fails on remote applications

2009-05-12 Thread feng chen
hi all, First of all, I'm new to openmpi, so I don't know much about MPI settings. That's why I'm following the manual and FAQ suggestions from the beginning. Everything went well until I tried to run an application on a remote node by using the 'mpirun -np' command. It just hangs there without doing