Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec

2009-11-10 Thread Paul Kapinos
Hi Ralph, Not at the moment - though I imagine we could create one. It is a tad tricky in that we allow multiple -x options on the cmd line, but we obviously can't do that with an envar. Why not? export OMPI_Magic_Variable="-x LD_LIBRARY_PATH -x PATH" could be possible, or not? I can

Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec

2009-11-10 Thread Paul Kapinos
Hi Jeff, FWIW, environment variables prefixed with "OMPI_" will automatically be distributed out to processes. Of course, but sadly the variable(s) we want to distribute aren't "OMPI_" variables. Depending on your environment and launcher, your entire environment may be copied out
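
As a quick way to check this behaviour, here is a minimal sketch (not taken from this thread; the variable name OMPI_MY_SETTING is a made-up placeholder) that prints what each rank actually sees:

  /* check_env.c - print an OMPI_-prefixed variable on every rank to see
     whether the launcher forwarded it (OMPI_MY_SETTING is hypothetical) */
  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
      int rank;
      const char *val;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      val = getenv("OMPI_MY_SETTING");   /* hypothetical variable name */
      printf("rank %d: OMPI_MY_SETTING=%s\n", rank, val ? val : "(not set)");
      MPI_Finalize();
      return 0;
  }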

Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec

2009-11-10 Thread Ralph Castain
On Nov 10, 2009, at 2:48 AM, Paul Kapinos wrote: Hi Ralph, Not at the moment - though I imagine we could create one. It is a tad tricky in that we allow multiple -x options on the cmd line, but we obviously can't do that with an envar. Why not? export OMPI_Magic_Variable="-x

Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-10 Thread Yogesh Aher
Thanks for the reply Pallab. Firewall is not an issue as I can passwordless-SSH to/from both machines. My problem is dealing with 32-bit and 64-bit architectures simultaneously (not with different operating systems). Is this possible with Open MPI? Looking forward to a solution! Thanks,

Re: [OMPI users] Openmpi on Heterogeneous environment

2009-11-10 Thread Jeff Squyres
Do you see any output from your executables? I.e., are you sure that it's running the "correct" executables? If so, do you know how far it's getting in its run before aborting? On Nov 10, 2009, at 7:36 AM, Yogesh Aher wrote: Thanks for the reply Pallab. Firewall is not an issue as I can
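
One way to answer Jeff's question, sketched here as an assumption rather than anything from the thread, is a tiny MPI program that reports which host each rank runs on and whether that binary is a 32-bit or 64-bit build:

  /* whoami.c - debugging sketch: report host and pointer size per rank */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, len;
      char host[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(host, &len);
      printf("rank %d on %s: %d-bit build\n", rank, host, (int)(8 * sizeof(void *)));
      MPI_Finalize();
      return 0;
  }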

[OMPI users] ipo: warning #11009: file format not recognized for /Libraries_intel/openmpi/lib/libmpi.so

2009-11-10 Thread vasilis gkanis
Dear all, I am trying to compile openmpi-1.3.3 with the Intel Fortran compiler and gcc. To configure Open MPI I used the following options: ./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90 Open MPI compiled just fine, but when I try to compile and link my

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Ralph Castain
Creating a directory with such credentials sounds like a bug in SGE to me...perhaps an SGE config issue? Only thing you could do is tell OMPI to use some other directory as the root for its session dir tree - check "mpirun -h", or ompi_info for the required option. But I would first

[OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
Hello, We've been having a lot of problems where openmpi jobs crash at startup because the call to lsb_launch fails (we have a ticket open with Platform about this). Is there a way to disable the lsb_launch startup mechanism at runtime and revert to ssh? It's easy enough to recompile without

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Thanks for your help Ralph, I'll double check that. As for the error message received, there might be some inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the parent directory and "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0/53199/0/0" is the

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Ralph Castain
What version of OMPI? On Nov 10, 2009, at 9:49 AM, Chris Walker wrote: Hello, We've been having a lot of problems where openmpi jobs crash at startup because the call to lsb_launch fails (we have a ticket open with Platform about this). Is there a way to disable the lsb_launch startup

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
Hi, on 10.11.2009 at 17:55, Eloi Gaudry wrote: Thanks for your help Ralph, I'll double check that. As for the error message received, there might be some inconsistency: "/opt/sge/tmp/25.1.smp8.q/openmpi-sessions-eg@charlie_0" is the Often /opt/sge is shared across the nodes, while the

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Thanks for your help Reuti, I'm using an nfs-shared directory (/opt/sge/tmp), exported from the master node to all other computing nodes, with /etc/exports on the server (named moe.fft): /opt/sge 192.168.0.0/255.255.255.0(rw,sync,no_subtree_check) /etc/fstab on client:

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Ralph Castain
Just add plm = rsh to your default mca param file. You don't need to reconfigure or rebuild OMPI On Nov 10, 2009, at 10:16 AM, Chris Walker wrote: We have modules for both 1.3.2 and 1.3.3 (intel compilers) Chris On Tue, Nov 10, 2009 at 11:58 AM, Ralph Castain wrote:
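
For reference, the default MCA parameter file is usually <prefix>/etc/openmpi-mca-params.conf (a per-user ~/.openmpi/mca-params.conf works as well), and the line to add is literally just:

  plm = rsh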

Re: [OMPI users] disabling LSF integration at runtime

2009-11-10 Thread Chris Walker
Perfect! Thanks very much, Chris On Tue, Nov 10, 2009 at 12:22 PM, Ralph Castain wrote: > Just add > > plm = rsh > > to your default mca param file. > > You don't need to reconfigure or rebuild OMPI > > On Nov 10, 2009, at 10:16 AM, Chris Walker wrote: > >> We have modules

Re: [OMPI users] ipo: warning #11009: file format not recognized for /Libraries_intel/openmpi/lib/libmpi.so

2009-11-10 Thread Nifty Tom Mitchell
On Tue, Nov 10, 2009 at 03:44:59PM +0200, vasilis gkanis wrote: > > I am trying to compile openmpi-1.3.3 with intel Fortran and gcc compiler. > > In order to compile openmpi I run configure with the following options: > > ./configure --prefix=/Libraries/openmpi FC=ifort --enable-mpi-f90 > >

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 10.11.2009 at 18:20, Eloi Gaudry wrote: Thanks for your help Reuti, I'm using an nfs-shared directory (/opt/sge/tmp), exported from the master node to all other computing nodes. It's highly advisable to have the "tmpdir" local on each node. When you use "cd $TMPDIR" in your jobscript,

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
Reuti, I'm using "tmpdir" as a shared directory that contains the session directories created during job submission, not for computing or local storage. Doesn't the session directory (i.e. job_id.queue_name) need to be shared among all computing nodes (at least the ones that would be used

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
Hi, on 10.11.2009 at 19:01, Eloi Gaudry wrote: Reuti, I'm using "tmpdir" as a shared directory that contains the session directories created during job submission, not for computing or local storage. Doesn't the session directory (i.e. job_id.queue_name) need to be shared among all

[OMPI users] Coding help requested

2009-11-10 Thread amjad ali
Hi all. (Sorry for duplication, if it is.) I have to parallelize a CFD code using domain/grid/mesh partitioning among the processes. Before running, we do not know: (i) how many processes we will use (np is unknown); (ii) how many neighbouring processes a process will have (my_nbrs = ?); (iii) how
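
One common pattern for the neighbour communication in such partitioned codes, sketched below with made-up names (nnbr, nbr[], scnt[], rcnt[], sbuf[], rbuf[] would all be filled in at run time by the partitioning step), is to post nonblocking receives and sends over the runtime-sized neighbour list and then wait on all of them:

  /* Sketch only: the neighbour list and per-neighbour buffers/counts are
     determined at run time from the mesh partitioning. */
  #include <mpi.h>
  #include <stdlib.h>

  void exchange_halo(int nnbr, const int *nbr, const int *scnt, const int *rcnt,
                     double **sbuf, double **rbuf)
  {
      MPI_Request *req = malloc(2 * nnbr * sizeof(MPI_Request));
      int i;

      for (i = 0; i < nnbr; i++)          /* post all receives first */
          MPI_Irecv(rbuf[i], rcnt[i], MPI_DOUBLE, nbr[i], 0,
                    MPI_COMM_WORLD, &req[i]);
      for (i = 0; i < nnbr; i++)          /* then the matching sends */
          MPI_Isend(sbuf[i], scnt[i], MPI_DOUBLE, nbr[i], 0,
                    MPI_COMM_WORLD, &req[nnbr + i]);
      MPI_Waitall(2 * nnbr, req, MPI_STATUSES_IGNORE);
      free(req);
  }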

[OMPI users] How do you get static linkage for Intel compiler libs for the orterun executable?

2009-11-10 Thread Blosch, Edwin L
I'm trying to build OpenMPI with Intel compilers, both static and dynamic libs, then move it to a system that does not have Intel compilers. I don't care about system libraries or OpenMPI loadable modules being dynamic, that's all fine. But I need the compiler libs to be statically linked
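
One common approach (my assumption, not an answer taken from this thread) is to ask the Intel compilers to link their own runtime libraries statically via -static-intel (older Intel releases used -i-static), roughly:

  ./configure --prefix=/opt/openmpi-intel CC=icc CXX=icpc F77=ifort FC=ifort \
      LDFLAGS="-static-intel"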

Re: [OMPI users] Coding help requested

2009-11-10 Thread Eugene Loh
amjad ali wrote: Hi all. (Sorry for duplication, if it is.) I have to parallelize a CFD code using domain/grid/mesh partitioning among the processes. Before running, we do not know: (i) how many processes we will use (np is unknown); (ii) how many neighbouring processes a process will have

[OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Qing Pang
I'm having a problem getting the mpirun "preload-binary" option to work. I'm using Ubuntu 8.10 with Open MPI 1.3.3, with the nodes connected by Ethernet. If I copy the executable to the client nodes using scp and then do mpirun, everything works. But I really want to avoid the copying, so I tried the

Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Ralph Castain
It -should- work, but you need password-less ssh set up. See our FAQ for how to do that, if you are unfamiliar with it. On Nov 10, 2009, at 2:02 PM, Qing Pang wrote: I'm having a problem getting the mpirun "preload-binary" option to work. I'm using Ubuntu 8.10 with Open MPI 1.3.3, nodes
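
For reference, the invocation being discussed would look something like the sketch below (the host names and the application name are placeholders):

  mpirun -np 4 --host node1,node2 --preload-binary ./my_app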

[OMPI users] System hang-up on MPI_Reduce

2009-11-10 Thread Glembek Ondřej
Hi, I am using the MPI_Reduce operation on a 122880x400 matrix of doubles. The parallel job runs on 32 machines, each having a different processor in terms of speed, but the architecture and OS are the same on all machines (x86_64). The task is a typical map-and-reduce, i.e. each of the processes
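
One workaround to experiment with (a sketch under assumed names, not something taken from this thread) is to reduce the matrix in blocks of rows so that each individual MPI_Reduce is smaller, optionally throttling fast ranks with an occasional barrier:

  /* Sketch: reduce a tall matrix in row blocks; sendmat/recvmat/nrows/ncols
     are assumed names. recvmat is only needed at the root. */
  #include <mpi.h>
  #include <stddef.h>

  void blocked_reduce(const double *sendmat, double *recvmat,
                      int nrows, int ncols, int root)
  {
      const int block = 4096;             /* rows per MPI_Reduce call, tune */
      int r;
      for (r = 0; r < nrows; r += block) {
          int n = (nrows - r < block) ? (nrows - r) : block;
          const double *sb = sendmat + (size_t)r * ncols;
          double *rb = recvmat ? recvmat + (size_t)r * ncols : NULL;
          MPI_Reduce(sb, rb, n * ncols, MPI_DOUBLE, MPI_SUM, root, MPI_COMM_WORLD);
          /* an MPI_Barrier(MPI_COMM_WORLD) every few blocks can keep fast
             ranks from running far ahead, at some cost in speed */
      }
  }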

[OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Tom Rosmond
I want to run a number of MPI executables simultaneously in a PBS job. For example, on my system I do 'cat $PBS_NODEFILE' and get a list like this: n04 n04 n04 n04 n06 n06 n06 n06 n07 n07 n07 n07 n09 n09 n09 n09 i.e., 16 processors on 4 nodes, which I can parse into file(s) as desired. If I

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Ralph Castain
What version are you trying to do this with? Reason I ask: in 1.3.x, we introduced relative node syntax for specifying hosts to use. This would eliminate the need to create the hostfiles. You might do a "man orte_hosts" (assuming you installed the man pages) and see what it says. Ralph

Re: [OMPI users] System hang-up on MPI_Reduce

2009-11-10 Thread Ralph Castain
Yeah, that is "normal". It has to do with unexpected messages. When you have procs running at significantly different speeds, the various operations get far enough out of sync that the memory consumed by recvd messages not yet processed grows too large. Instead of sticking barriers into

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
Hi Eloi, on 10.11.2009 at 23:42, Eloi Gaudry wrote: I followed your advice and switched to a local "tmpdir" instead of a shared one. This solved the session directory issue, thanks for your help! What user/group is now listed for the generated temporary directories (i.e. $TMPDIR)? --

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 10.11.2009 at 23:51, Reuti wrote: Hi Eloi, on 10.11.2009 at 23:42, Eloi Gaudry wrote: I followed your advice and switched to a local "tmpdir" instead of a shared one. This solved the session directory issue, thanks for your help! What user/group is now listed for the generated

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 11.11.2009 at 00:03, Eloi Gaudry wrote: The user/group used to generate the temporary directories was nobody/nogroup when using a shared $tmpdir. Now that I'm using a local $tmpdir (one for each node, not distributed over nfs), the right credentials (i.e. my username/groupname) are

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
To avoid misunderstandings: on 11.11.2009 at 00:19, Eloi Gaudry wrote: On any execution node, creating a subdirectory of /opt/sge/tmp (i.e. creating a session directory inside $TMPDIR) results in a new directory owned by the user/group that submitted the job (not nobody/nogroup). $TMPDIR

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Eloi Gaudry
This is what I did (created /opt/sge/tmp/test by hand on an execution host, logged in as a regular cluster user). Eloi On 11/11/2009 00:26, Reuti wrote: To avoid misunderstandings: on 11.11.2009 at 00:19, Eloi Gaudry wrote: On any execution node, creating a subdirectory of /opt/sge/tmp (i.e.

Re: [OMPI users] [sge] tight-integration openmpi and sge: opal_os_dirpath_create failure

2009-11-10 Thread Reuti
On 11.11.2009 at 00:29, Eloi Gaudry wrote: This is what I did (created /opt/sge/tmp/test by hand on an execution host, logged in as a regular cluster user). Then we are back where my thinking started, but I missed the implied default: can you export /opt/sge with "no_root_squash" and reload
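
With the export line quoted earlier in the thread, that suggestion would look roughly like this in /etc/exports on moe.fft (a sketch to be adapted), followed by re-exporting, e.g. with exportfs -ra:

  /opt/sge 192.168.0.0/255.255.255.0(rw,sync,no_subtree_check,no_root_squash)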

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Tom Rosmond
Ralph, I am using 1.3.2, so the relative node syntax certainly seems the way to go. However, I seem to be missing something. On the 'orte_hosts' man page near the top is the simple example: mpirun -pernode -host +n1,+n2 ./app1 : -host +n3,+n4 ./app2 I set up my job to run on 4 nodes (4

Re: [OMPI users] running multiple executables under Torque/PBS PRO

2009-11-10 Thread Ralph Castain
You can use the relative host syntax, but you cannot use a "pernode" or "npernode" option when you have more than one application on the cmd line. You have to specify the number of procs for each application, as the error message says. :-) IIRC, the reason was that we couldn't decide on
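
Adapting the man-page example to explicit per-application process counts would then look something like this (the counts are placeholders):

  mpirun -np 4 -host +n1,+n2 ./app1 : -np 4 -host +n3,+n4 ./app2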

[OMPI users] maximum value for count argument

2009-11-10 Thread Martin Siegert
Hi, I have a problem with sending/receiving large buffers when using Open MPI (version 1.3.3), e.g., MPI_Allreduce(sbuf, rbuf, count, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD); with count=18000 (this problem does not appear to be unique to Allreduce, but occurs with Reduce and Bcast as well; maybe
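
A typical workaround, sketched here under assumed names (buf, total) and shown with MPI_IN_PLACE, is to split one huge MPI_Allreduce into several calls over sub-ranges of the buffer:

  /* Sketch: split a very large reduction into chunks; with separate send and
     receive buffers the same offsets would apply to both. */
  #include <mpi.h>

  void chunked_allreduce_sum(double *buf, long total)
  {
      const int chunk = 1 << 20;          /* elements per call, tune as needed */
      long off = 0;
      while (off < total) {
          int n = (total - off < chunk) ? (int)(total - off) : chunk;
          MPI_Allreduce(MPI_IN_PLACE, buf + off, n, MPI_DOUBLE, MPI_SUM,
                        MPI_COMM_WORLD);
          off += n;
      }
  }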