Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
Hi,

Thanks for this, guys. I think I might have two MPI implementations
installed, because 'locate mpirun' gives (see the bold lines):
-
/etc/alternatives/mpirun
/etc/alternatives/mpirun.1.gz
*/home/djordje/Build_WRF/LIBRARIES/mpich/bin/mpirun*
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/intel/4.1.1.036/linux-x86_64/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/intel/4.1.1.036/linux-x86_64/bin64/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/intel/4.1.1.036/linux-x86_64/ia32/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/intel/4.1.1.036/linux-x86_64/intel64/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/openmpi/1.4.3/linux-x86_64-2.3.4/gnu4.5/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/openmpi/1.4.3/linux-x86_64-2.3.4/gnu4.5/share/man/man1/mpirun.1
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/openmpi/1.6.4/linux-x86_64-2.3.4/gnu4.6/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/openmpi/1.6.4/linux-x86_64-2.3.4/gnu4.6/share/man/man1/mpirun.1
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/bin/mpirun.mpich
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/bin/mpirun.mpich2
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/ia32/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/ia32/bin/mpirun.mpich
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/ia32/bin/mpirun.mpich2
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/ia32/lib/linux_amd64/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/ia32/lib/linux_ia32/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/lib/linux_amd64/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/lib/linux_ia32/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.2.0.0/linux64_2.6-x86-glibc_2.3.4/share/man/man1/mpirun.1.gz
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/bin/mpirun.mpich
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/bin/mpirun.mpich2
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/ia32/bin/mpirun
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/ia32/bin/mpirun.mpich
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/ia32/bin/mpirun.mpich2
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/ia32/lib/linux_amd64/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/ia32/lib/linux_ia32/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/lib/linux_amd64/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/lib/linux_ia32/libmpirun.so
/home/djordje/StarCCM/Install/STAR-CCM+8.06.007/mpi/platform/8.3.0.2/linux64_2.6-x86-glibc_2.3.4/share/man/man1/mpirun.1.gz
*/usr/bin/mpirun*
/usr/bin/mpirun.openmpi
/usr/lib/openmpi/include/openmpi/ompi/runtime/mpiruntime.h
/usr/share/man/man1/mpirun.1.gz
/usr/share/man/man1/mpirun.openmpi.1.gz
/var/lib/dpkg/alternatives/mpirun
-
This is a single machine. I actually just got it... another user used it
for 1-2 years.

Is this a possible cause of the problem?

Regards,
Djordje


On Mon, Apr 14, 2014 at 7:06 PM, Gus Correa  wrote:

> Apologies for stirring up the confusion even more by misspelling
> "Open MPI" as "OpenMPI".
> "OMPI" doesn't help either, because all OpenMP environment
> variables and directives start with "OMP".
> Maybe associating the names to
> "message passing" vs. "threads" would help?
>
> Djordje:
>
> 'which mpif90' etc show everything in /usr/bin.
> So, very likely they were installed from packages
> (yum, apt-get, rpm ...), right?
> Have you tried something like
> "yum list |grep mpi"
> to see what you have?
>
> As Dave, Jeff and Tom said, this may be a mixup of different
> MPI implementations at compilation (mpicc mpif90) and runtime (mpirun).
> That is common, you may have different MPI implementations installed.
>
> Other possibilities that may tell what MPI you hav

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa

Apologies for stirring up the confusion even more by misspelling
"Open MPI" as "OpenMPI".
"OMPI" doesn't help either, because all OpenMP environment
variables and directives start with "OMP".
Maybe associating the names to
"message passing" vs. "threads" would help?

Djordje:

'which mpif90' etc show everything in /usr/bin.
So, very likely they were installed from packages
(yum, apt-get, rpm ...), right?
Have you tried something like
"yum list |grep mpi"
to see what you have?

As Dave, Jeff and Tom said, this may be a mixup of different
MPI implementations at compilation (mpicc mpif90) and runtime (mpirun).
That is common, you may have different MPI implementations installed.

Other possibilities that may tell what MPI you have:

mpirun --version
mpif90 --show
mpicc --show

Yet another:

locate mpirun
locate mpif90
locate mpicc

The ldd output didn't show any MPI libraries; maybe they are static libraries.
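
For reference, a small compile-and-run probe can complement the checks above. The sketch below is only an illustration, not part of WRF or Open MPI: it assumes the OPEN_MPI/OMPI_*_VERSION and MPICH_VERSION macros that Open MPI's and MPICH's mpi.h are generally known to define, and any other implementation falls through to the last branch. Build it with the same mpicc used for WRF and launch it with the mpirun under test; with a mismatched pair, every copy typically reports "rank 0 of 1", which is exactly the symptom seen with wrf.exe.

/* mpi_which.c -- illustrative diagnostic sketch (assumptions noted above).
 * Compile:  mpicc mpi_which.c -o mpi_which
 * Run:      mpirun -np 4 ./mpi_which
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
#if defined(OPEN_MPI)
        /* Open MPI's mpi.h defines OPEN_MPI and OMPI_*_VERSION */
        printf("compiled against Open MPI %d.%d.%d\n",
               OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION, OMPI_RELEASE_VERSION);
#elif defined(MPICH_VERSION)
        /* MPICH's mpi.h defines MPICH_VERSION as a string */
        printf("compiled against MPICH %s\n", MPICH_VERSION);
#else
        printf("compiled against some other MPI implementation\n");
#endif
    }

    /* With a matching mpirun, "-np 4" prints ranks 0..3 of 4; with a
     * mismatched mpirun, four copies each print "rank 0 of 1". */
    printf("rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}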

An alternative is to install Open MPI from source,
and put it in a non-system directory
(not /usr/bin, not /usr/local/bin, etc).

Is this a single machine or a cluster?
Or perhaps a set of PCs that you have access to?
If it is a cluster, do you have access to a filesystem that is
shared across the cluster?
On clusters typically /home is shared, often via NFS.

Gus Correa

On 04/14/2014 05:15 PM, Jeff Squyres (jsquyres) wrote:

Maybe we should rename OpenMP to be something less confusing --
perhaps something totally unrelated, perhaps even non-sensical.
That'll end lots of confusion!

My vote: OpenMP --> SharkBook

It's got a ring to it, doesn't it?  And it sounds fearsome!



On Apr 14, 2014, at 5:04 PM, "Elken, Tom"  wrote:


That’s OK.  Many of us make that mistake, though often as a typo.
One thing that helps is that the correct spelling of Open MPI has a space in it, but OpenMP does not.

If not aware what OpenMP is, here is a link: http://openmp.org/wp/

What makes it more confusing is that more and more apps. offer the option of running in a hybrid mode, such as WRF, with OpenMP threads running over MPI ranks with the same executable. And sometimes that MPI is Open MPI.


Cheers,
-Tom

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Djordje Romanic
Sent: Monday, April 14, 2014 1:28 PM
To: Open MPI Users
Subject: Re: [OMPI users] mpirun runs in serial even I set np to several 
processors

OK guys... Thanks for all this info. Frankly, I didn't know these differences
between OpenMP and OpenMPI. The commands:
which mpirun
which mpif90
which mpicc
give,
/usr/bin/mpirun
/usr/bin/mpif90
/usr/bin/mpicc
respectively.

A tutorial on how to compile WRF 
(http://www.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php) provides 
a test program to test MPI. I ran the program and it gave me the output of 
successful run, which is:
-
C function called by Fortran
Values are xx = 2.00 and ii = 1
status = 2
SUCCESS test 2 fortran + c + netcdf + mpi
-
It uses mpif90 and mpicc for compiling. Below is the output of 'ldd ./wrf.exe':


 linux-vdso.so.1 =>  (0x7fff584e7000)
 libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x7f4d160ab000)
 libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 
(0x7f4d15d94000)
 libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7f4d15a97000)
 libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x7f4d15881000)
 libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f4d154c1000)
 /lib64/ld-linux-x86-64.so.2 (0x7f4d162e8000)
 libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 
(0x7f4d1528a000)



On Mon, Apr 14, 2014 at 4:09 PM, Gus Correa  wrote:
Djordje

Your WRF configure file seems to use mpif90 and mpicc (line 115 & following).
In addition, it also seems to have DISABLED OpenMP (NO TRAILING "I")
(lines 109-111, where OpenMP stuff is commented out).
So, it looks like to me your intent was to compile with MPI.

Whether it is THIS MPI (OpenMPI) or another MPI (say MPICH, or MVAPICH,
or Intel MPI, or Cray, or ...) only your environment can tell.

What do you get from these commands:

which mpirun
which mpif90
which mpicc

I never built WRF here (but other people here use it).
Which input do you provide to the command that generates the configure
script that you sent before?
Maybe the full command line will shed some light on the problem.


I hope this helps,
Gus Correa


On 04/14/2014 03:11 PM, Djordje Romanic wrote:
to get help :)



On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic mailto:djord...@gmail.com>> wrote:

 Yes, but I was hoping to get. :)


 On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres)
 mailto:jsquy...@cisco.com>> wrote:

 If you didn't use Open MPI, then this is the wrong mailing list
 for you.  :-)

 (this is the Open MPI users' support mailing list)


 On Apr 14, 2014, at 2:58 PM, Djordje Romanic mailto:djord...@gmail.com>> wrote:

  

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Jeff Squyres (jsquyres)
Maybe we should rename OpenMP to be something less confusing -- perhaps 
something totally unrelated, perhaps even non-sensical.  That'll end lots of 
confusion!

My vote: OpenMP --> SharkBook

It's got a ring to it, doesn't it?  And it sounds fearsome!



On Apr 14, 2014, at 5:04 PM, "Elken, Tom"  wrote:

> That’s OK.  Many of us make that mistake, though often as a typo.
> One thing that helps is that the correct spelling of Open MPI has a space in 
> it, but OpenMP does not.
> If not aware what OpenMP is, here is a link: http://openmp.org/wp/
>  
> What makes it more confusing is that more and more apps. offer the option of 
> running in a hybrid mode, such as WRF, with OpenMP threads running over MPI 
> ranks with the same executable.  And sometimes that MPI is Open MPI.
>  
> Cheers,
> -Tom
>  
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Djordje Romanic
> Sent: Monday, April 14, 2014 1:28 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] mpirun runs in serial even I set np to several 
> processors
>  
> OK guys... Thanks for all this info. Frankly, I didn't know these differences 
> between OpenMP and OpenMPI. The commands: 
> which mpirun
> which mpif90
> which mpicc
> give,
> /usr/bin/mpirun
> /usr/bin/mpif90
> /usr/bin/mpicc
> respectively.
> 
> A tutorial on how to compile WRF 
> (http://www.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php) 
> provides a test program to test MPI. I ran the program and it gave me the 
> output of successful run, which is: 
> -
> C function called by Fortran
> Values are xx = 2.00 and ii = 1
> status = 2
> SUCCESS test 2 fortran + c + netcdf + mpi
> -
> It uses mpif90 and mpicc for compiling. Below is the output of 'ldd 
> ./wrf.exe': 
> 
> 
> linux-vdso.so.1 =>  (0x7fff584e7000)
> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
> (0x7f4d160ab000)
> libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 
> (0x7f4d15d94000)
> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7f4d15a97000)
> libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x7f4d15881000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f4d154c1000)
> /lib64/ld-linux-x86-64.so.2 (0x7f4d162e8000)
> libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 
> (0x7f4d1528a000)
> 
>  
> 
> On Mon, Apr 14, 2014 at 4:09 PM, Gus Correa  wrote:
> Djordje
> 
> Your WRF configure file seems to use mpif90 and mpicc (line 115 & following).
> In addition, it also seems to have DISABLED OpenMP (NO TRAILING "I")
> (lines 109-111, where OpenMP stuff is commented out).
> So, it looks like to me your intent was to compile with MPI.
> 
> Whether it is THIS MPI (OpenMPI) or another MPI (say MPICH, or MVAPICH,
> or Intel MPI, or Cray, or ...) only your environment can tell.
> 
> What do you get from these commands:
> 
> which mpirun
> which mpif90
> which mpicc
> 
> I never built WRF here (but other people here use it).
> Which input do you provide to the command that generates the configure
> script that you sent before?
> Maybe the full command line will shed some light on the problem.
> 
> 
> I hope this helps,
> Gus Correa
> 
> 
> On 04/14/2014 03:11 PM, Djordje Romanic wrote:
> to get help :)
> 
> 
> 
> On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic  > wrote:
> 
> Yes, but I was hoping to get. :)
> 
> 
> On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres)
> mailto:jsquy...@cisco.com>> wrote:
> 
> If you didn't use Open MPI, then this is the wrong mailing list
> for you.  :-)
> 
> (this is the Open MPI users' support mailing list)
> 
> 
> On Apr 14, 2014, at 2:58 PM, Djordje Romanic  > wrote:
> 
>  > I didn't use OpenMPI.
>  >
>  >
>  > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)
> mailto:jsquy...@cisco.com>> wrote:
>  > This can also happen when you compile your application with
> one MPI implementation (e.g., Open MPI), but then mistakenly use
> the "mpirun" (or "mpiexec") from a different MPI implementation
> (e.g., MPICH).
>  >
>  >
>  > On Apr 14, 2014, at 2:32 PM, Djordje Romanic
> mailto:djord...@gmail.com>> wrote:
>  >
>  > > I compiled it with: x86_64 Linux, gfortran compiler with
> gcc   (dmpar). dmpar - distributed memory option.
>  > >
>  > > Attached is the self-generated configuration file. The
> architecture specification settings start at line 107. I didn't
> use Open MPI (shared memory option).
>  > >
>  > >
>  > > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)
> mailto:dgood...@cisco.com>> wrote:
>  > > On Apr 14, 2014, at 12:15 PM, Djordje Romanic
> mailto:d

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Elken, Tom
That’s OK.  Many of us make that mistake, though often as a typo.
One thing that helps is that the correct spelling of Open MPI has a space in 
it, but OpenMP does not.
If not aware what OpenMP is, here is a link: http://openmp.org/wp/

What makes it more confusing is that more and more apps. offer the option of 
running in a hybrid mode, such as WRF, with OpenMP threads running over MPI 
ranks with the same executable.  And sometimes that MPI is Open MPI.

Cheers,
-Tom
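
To make the naming distinction concrete, here is a minimal hybrid sketch, purely an illustration and not taken from WRF: the ranks come from whatever MPI implementation mpirun belongs to (possibly Open MPI), while the threads inside each rank come from OpenMP. The -fopenmp flag is a GNU-compiler assumption; other compilers use different flags.

/* hybrid.c -- illustrative sketch of "OpenMP threads over MPI ranks".
 * Build (assuming GNU compilers):  mpicc -fopenmp hybrid.c -o hybrid
 * Run:  OMP_NUM_THREADS=2 mpirun -np 2 ./hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);              /* MPI (e.g. Open MPI): processes/ranks */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

#pragma omp parallel                     /* OpenMP: threads within each rank */
    {
        printf("MPI rank %d of %d, OpenMP thread %d of %d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}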

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Djordje Romanic
Sent: Monday, April 14, 2014 1:28 PM
To: Open MPI Users
Subject: Re: [OMPI users] mpirun runs in serial even I set np to several 
processors

OK guys... Thanks for all this info. Frankly, I didn't know these differences 
between OpenMP and OpenMPI. The commands:
which mpirun
which mpif90
which mpicc
give,
/usr/bin/mpirun
/usr/bin/mpif90
/usr/bin/mpicc
respectively.
A tutorial on how to compile WRF 
(http://www.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php) provides 
a test program to test MPI. I ran the program and it gave me the output of 
successful run, which is:
-
C function called by Fortran
Values are xx = 2.00 and ii = 1
status = 2
SUCCESS test 2 fortran + c + netcdf + mpi
-
It uses mpif90 and mpicc for compiling. Below is the output of 'ldd ./wrf.exe':

linux-vdso.so.1 =>  (0x7fff584e7000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 
(0x7f4d160ab000)
libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 
(0x7f4d15d94000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7f4d15a97000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x7f4d15881000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f4d154c1000)
/lib64/ld-linux-x86-64.so.2 (0x7f4d162e8000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 
(0x7f4d1528a000)

On Mon, Apr 14, 2014 at 4:09 PM, Gus Correa 
mailto:g...@ldeo.columbia.edu>> wrote:
Djordje

Your WRF configure file seems to use mpif90 and mpicc (line 115 & following).
In addition, it also seems to have DISABLED OpenMP (NO TRAILING "I")
(lines 109-111, where OpenMP stuff is commented out).
So, it looks like to me your intent was to compile with MPI.

Whether it is THIS MPI (OpenMPI) or another MPI (say MPICH, or MVAPICH,
or Intel MPI, or Cray, or ...) only your environment can tell.

What do you get from these commands:

which mpirun
which mpif90
which mpicc

I never built WRF here (but other people here use it).
Which input do you provide to the command that generates the configure
script that you sent before?
Maybe the full command line will shed some light on the problem.


I hope this helps,
Gus Correa

On 04/14/2014 03:11 PM, Djordje Romanic wrote:
to get help :)



On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic 
mailto:djord...@gmail.com>
>> wrote:

Yes, but I was hoping to get. :)


On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres)
mailto:jsquy...@cisco.com> 
>> wrote:

If you didn't use Open MPI, then this is the wrong mailing list
for you.  :-)

(this is the Open MPI users' support mailing list)


On Apr 14, 2014, at 2:58 PM, Djordje Romanic 
mailto:djord...@gmail.com>
>> wrote:

 > I didn't use OpenMPI.
 >
 >
 > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)
mailto:jsquy...@cisco.com> 
>> wrote:
 > This can also happen when you compile your application with
one MPI implementation (e.g., Open MPI), but then mistakenly use
the "mpirun" (or "mpiexec") from a different MPI implementation
(e.g., MPICH).
 >
 >
 > On Apr 14, 2014, at 2:32 PM, Djordje Romanic
mailto:djord...@gmail.com> 
>> wrote:
 >
 > > I compiled it with: x86_64 Linux, gfortran compiler with
gcc   (dmpar). dmpar - distributed memory option.
 > >
 > > Attached is the self-generated configuration file. The
architecture specification settings start at line 107. I didn't
use Open MPI (shared memory option).
 > >
 > >
 > > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)
mailto:dgood...@cisco.com> 
>> wrote:
 > > On Apr 14, 2014, at 12:15 PM, Djordje Romanic
mailto:djord...@gmail.com> 
>> wrote:
 > >
 > > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
 > > > 

Re: [OMPI users] openmpi-1.7.4/1.8 .0 problem with intel/mpi_sizeof

2014-04-14 Thread Jeff Squyres (jsquyres)
Yes, this is a bug.  Doh!  

Looks like we fixed it for one case, but missed another case.  :-(  

I've filed https://svn.open-mpi.org/trac/ompi/ticket/4519, and will fix this 
shortly.


On Apr 14, 2014, at 4:11 AM, Luis Kornblueh  
wrote:

> Dear all,
> 
> the attached mympi_test.f90 does not compile with intel and OpenMPI Version 
> 1.7.4, apparently it also does not compile with 1.8.0.
> 
> The Intel Compiler version is 14.0.2.
> 
> tmp/ifortjKG1cP.o: In function `MAIN__':
> mympi_test.f90:(.text+0x90): undefined reference to `mpi_sizeof0di4_'
> 
> This is very similar to an error reported for older versions 1.4.x and 1.5x 
> for the Portland Group compiler:
> 
> https://www.open-mpi.org/community/lists/devel/2010/09/8443.php
> 
> Obviously this got fixed with version 1.6.*, and this version is working with 
> the intel compiler as well.
> 
> Cheerio,
> Luis
> -- 
> \\
> (-0^0-)
> --oOO--(_)--OOo-
> 
> Luis Kornblueh   Tel. : +49-40-41173289
> Max-Planck-Institute for Meteorology Fax. : +49-40-41173298
> Bundesstr. 53
> D-20146 Hamburg   Email: luis.kornbl...@zmaw.de
> Federal Republic of Germany
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
OK guys... Thanks for all this info. Frankly, I didn't know these
differences between OpenMP and OpenMPI. The commands:
which mpirun
which mpif90
which mpicc
give,
/usr/bin/mpirun
/usr/bin/mpif90
/usr/bin/mpicc
respectively.

A tutorial on how to compile WRF (
http://www.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php)
provides a test program to test MPI. I ran the program and it gave me the
output of successful run, which is:
-
C function called by Fortran
Values are xx = 2.00 and ii = 1
status = 2
SUCCESS test 2 fortran + c + netcdf + mpi
-
It uses mpif90 and mpicc for compiling. Below is the output of 'ldd
./wrf.exe':


linux-vdso.so.1 =>  (0x7fff584e7000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
(0x7f4d160ab000)
libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3
(0x7f4d15d94000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x7f4d15a97000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
(0x7f4d15881000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f4d154c1000)
/lib64/ld-linux-x86-64.so.2 (0x7f4d162e8000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0
(0x7f4d1528a000)



On Mon, Apr 14, 2014 at 4:09 PM, Gus Correa  wrote:

> Djordje
>
> Your WRF configure file seems to use mpif90 and mpicc (line 115 &
> following).
> In addition, it also seems to have DISABLED OpenMP (NO TRAILING "I")
> (lines 109-111, where OpenMP stuff is commented out).
> So, it looks like to me your intent was to compile with MPI.
>
> Whether it is THIS MPI (OpenMPI) or another MPI (say MPICH, or MVAPICH,
> or Intel MPI, or Cray, or ...) only your environment can tell.
>
> What do you get from these commands:
>
> which mpirun
> which mpif90
> which mpicc
>
> I never built WRF here (but other people here use it).
> Which input do you provide to the command that generates the configure
> script that you sent before?
> Maybe the full command line will shed some light on the problem.
>
>
> I hope this helps,
> Gus Correa
>
>
> On 04/14/2014 03:11 PM, Djordje Romanic wrote:
>
>> to get help :)
>>
>>
>>
>> On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic > > wrote:
>>
>> Yes, but I was hoping to get. :)
>>
>>
>> On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres)
>> mailto:jsquy...@cisco.com>> wrote:
>>
>> If you didn't use Open MPI, then this is the wrong mailing list
>> for you.  :-)
>>
>> (this is the Open MPI users' support mailing list)
>>
>>
>> On Apr 14, 2014, at 2:58 PM, Djordje Romanic > > wrote:
>>
>>  > I didn't use OpenMPI.
>>  >
>>  >
>>  > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)
>> mailto:jsquy...@cisco.com>> wrote:
>>  > This can also happen when you compile your application with
>> one MPI implementation (e.g., Open MPI), but then mistakenly use
>> the "mpirun" (or "mpiexec") from a different MPI implementation
>> (e.g., MPICH).
>>  >
>>  >
>>  > On Apr 14, 2014, at 2:32 PM, Djordje Romanic
>> mailto:djord...@gmail.com>> wrote:
>>  >
>>  > > I compiled it with: x86_64 Linux, gfortran compiler with
>> gcc   (dmpar). dmpar - distributed memory option.
>>  > >
>>  > > Attached is the self-generated configuration file. The
>> architecture specification settings start at line 107. I didn't
>> use Open MPI (shared memory option).
>>  > >
>>  > >
>>  > > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)
>> mailto:dgood...@cisco.com>> wrote:
>>  > > On Apr 14, 2014, at 12:15 PM, Djordje Romanic
>> mailto:djord...@gmail.com>> wrote:
>>  > >
>>  > > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
>>  > > > -
>>  > > >  starting wrf task0  of1
>>  > > >  starting wrf task0  of1
>>  > > >  starting wrf task0  of1
>>  > > >  starting wrf task0  of1
>>  > > > -
>>  > > > This indicates that it is not using 4 processors, but 1.
>>  > > >
>>  > > > Any idea what might be the problem?
>>  > >
>>  > > It could be that you compiled WRF with a different MPI
>> implementation than you are using to run it (e.g., MPICH vs.
>> Open MPI).
>>  > >
>>  > > -Dave
>>  > >
>>  > > ___
>>  > > users mailing list
>>  > > us...@open-mpi.org 
>>
>>  > > http://www.open

Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-14 Thread Rob Latham



On 04/08/2014 05:49 PM, Daniel Milroy wrote:

Hello,

The file system in question is indeed Lustre, and mounting with flock
isn’t possible in our environment.  I recommended the following changes
to the users’ code:


Hi.  I'm the ROMIO guy, though I do rely on the community to help me 
keep the lustre driver up to snuff.



MPI_Info_set(info, "collective_buffering", "true");
MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
MPI_Info_set(info, "romio_ds_read", "disable");
MPI_Info_set(info, "romio_ds_write", "disable");

Which results in the same error as before.  Are there any other MPI
options I can set?


I'd like to hear more about the workload generating these lock messages, 
but I can tell you the situations in which ADIOI_SetLock gets called:
- everywhere in NFS.  If you have a Lustre file system exported to some 
clients as NFS, you'll get NFS (er, that might not be true unless you 
pick up a recent patch)
- when writing a non-contiguous region in file, unless you disable data 
sieving, as you did above.
- note: you don't need to disable data sieving for reads, though you 
might want to if the data sieving algorithm is wasting a lot of data.
- if atomic mode was set on the file (i.e. you called 
MPI_File_set_atomicity)

- if you use any of the shared file pointer operations
- if you use any of the ordered mode collective operations

you've turned off data sieving writes, which is what I would have first 
guessed would trigger this lock message.  So I guess you are hitting one 
of the other cases.


==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
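
For anyone following the hint discussion above, here is a rough sketch of how such hints are normally attached to a file handle at open time. It is only an illustration under assumptions: the file name and access mode are made up, and which keys a particular ROMIO build honors varies; MPI_File_get_info can be used afterwards to see which hints were actually accepted.

/* hints.c -- illustrative sketch of passing ROMIO hints via an MPI_Info. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    /* Hints go on the info object passed to MPI_File_open; keys the
     * implementation does not understand are silently ignored. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "collective_buffering", "true");
    MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
    MPI_Info_set(info, "romio_ds_read", "disable");
    MPI_Info_set(info, "romio_ds_write", "disable");

    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* ... collective I/O here, e.g. MPI_File_write_at_all() ... */

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}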


Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa

Djordje

Your WRF configure file seems to use mpif90 and mpicc (line 115 & 
following).

In addition, it also seems to have DISABLED OpenMP (NO TRAILING "I")
(lines 109-111, where OpenMP stuff is commented out).
So, it looks like to me your intent was to compile with MPI.

Whether it is THIS MPI (OpenMPI) or another MPI (say MPICH, or MVAPICH,
or Intel MPI, or Cray, or ...) only your environment can tell.

What do you get from these commands:

which mpirun
which mpif90
which mpicc

I never built WRF here (but other people here use it).
Which input do you provide to the command that generates the configure
script that you sent before?
Maybe the full command line will shed some light on the problem.

I hope this helps,
Gus Correa


On 04/14/2014 03:11 PM, Djordje Romanic wrote:

to get help :)



On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic mailto:djord...@gmail.com>> wrote:

Yes, but I was hoping to get. :)


On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres)
mailto:jsquy...@cisco.com>> wrote:

If you didn't use Open MPI, then this is the wrong mailing list
for you.  :-)

(this is the Open MPI users' support mailing list)


On Apr 14, 2014, at 2:58 PM, Djordje Romanic mailto:djord...@gmail.com>> wrote:

 > I didn't use OpenMPI.
 >
 >
 > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)
mailto:jsquy...@cisco.com>> wrote:
 > This can also happen when you compile your application with
one MPI implementation (e.g., Open MPI), but then mistakenly use
the "mpirun" (or "mpiexec") from a different MPI implementation
(e.g., MPICH).
 >
 >
 > On Apr 14, 2014, at 2:32 PM, Djordje Romanic
mailto:djord...@gmail.com>> wrote:
 >
 > > I compiled it with: x86_64 Linux, gfortran compiler with
gcc   (dmpar). dmpar - distributed memory option.
 > >
 > > Attached is the self-generated configuration file. The
architecture specification settings start at line 107. I didn't
use Open MPI (shared memory option).
 > >
 > >
 > > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)
mailto:dgood...@cisco.com>> wrote:
 > > On Apr 14, 2014, at 12:15 PM, Djordje Romanic
mailto:djord...@gmail.com>> wrote:
 > >
 > > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
 > > > -
 > > >  starting wrf task0  of1
 > > >  starting wrf task0  of1
 > > >  starting wrf task0  of1
 > > >  starting wrf task0  of1
 > > > -
 > > > This indicates that it is not using 4 processors, but 1.
 > > >
 > > > Any idea what might be the problem?
 > >
 > > It could be that you compiled WRF with a different MPI
implementation than you are using to run it (e.g., MPICH vs.
Open MPI).
 > >
 > > -Dave
 > >
 > > ___
 > > users mailing list
 > > us...@open-mpi.org 
 > > http://www.open-mpi.org/mailman/listinfo.cgi/users
 > >
 > > ___
 > > users mailing list
 > > us...@open-mpi.org 
 > > http://www.open-mpi.org/mailman/listinfo.cgi/users
 >
 >
 > --
 > Jeff Squyres
 > jsquy...@cisco.com 
 > For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
 >
 > ___
 > users mailing list
 > us...@open-mpi.org 
 > http://www.open-mpi.org/mailman/listinfo.cgi/users
 >
 > ___
 > users mailing list
 > us...@open-mpi.org 
 > http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
jsquy...@cisco.com 
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
us...@open-mpi.org 
http://www.open-mpi.org/mailman/listinfo.cgi/users





___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa

On 04/14/2014 03:02 PM, Jeff Squyres (jsquyres) wrote:

If you didn't use Open MPI, then this is the wrong mailing list for you.  :-)

(this is the Open MPI users' support mailing list)


On Apr 14, 2014, at 2:58 PM, Djordje Romanic  wrote:


I didn't use OpenMPI.


On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)  
wrote:
This can also happen when you compile your application with one MPI implementation (e.g., Open 
MPI), but then mistakenly use the "mpirun" (or "mpiexec") from a different MPI 
implementation (e.g., MPICH).


On Apr 14, 2014, at 2:32 PM, Djordje Romanic  wrote:


I compiled it with: x86_64 Linux, gfortran compiler with gcc   (dmpar). dmpar - 
distributed memory option.

Attached is the self-generated configuration file.

The architecture specification settings start at line 107.
I didn't use Open MPI (shared memory option).

NoNoNoNoNo!

You are confusing yourself (and even Jeff) by mixing up
OpenMPI and OpenMP.
Well, everybody wants to be Open ... :)

Note that OpenMP (no trailing "I") is the WRF shared memory option,
which is different from OpenMPI (with trailing "I") which
is the  MPI implementation of this mailing list
(distributed memory, so to speak, although intra-node it is shared mem).

My guess is that your intent is to compile with MPI, right?
And actually with OpenMPI, i.e., with this implementation of MPI, right?

What is the output of "ldd ./wrf.exe"?
This may show the MPI libraries.

I hope this helps,
Gus Correa





On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)  
wrote:
On Apr 14, 2014, at 12:15 PM, Djordje Romanic  wrote:


When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
-
  starting wrf task0  of1
  starting wrf task0  of1
  starting wrf task0  of1
  starting wrf task0  of1
-
This indicates that it is not using 4 processors, but 1.

Any idea what might be the problem?


It could be that you compiled WRF with a different MPI implementation than you 
are using to run it (e.g., MPICH vs. Open MPI).

-Dave

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users







Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa

On 04/14/2014 01:15 PM, Djordje Romanic wrote:

Hi,

I am trying to run WRF-ARW in parallel. This is the configuration of my system:
-
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):4
On-line CPU(s) list:   0-3
Thread(s) per core:1
Core(s) per socket:4
Socket(s): 1
NUMA node(s):  1
Vendor ID: AuthenticAMD
CPU family:16
Model: 2
Stepping:  3
CPU MHz:   1150.000
BogoMIPS:  4587.84
Virtualization:AMD-V
L1d cache: 64K
L1i cache: 64K
L2 cache:  512K
L3 cache:  2048K
NUMA node0 CPU(s): 0-3
-

When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
-
  starting wrf task0  of1
  starting wrf task0  of1
  starting wrf task0  of1
  starting wrf task0  of1
-
This indicates that it is not using 4 processors, but 1.

Any idea what might be the problem?

Thanks,
Djordje



Did you compile WRF with MPI enabled, or did you compile it serial or with
OpenMP only (OpenMP, with no trailing "I", is a different beast than OpenMPI)?


If you compiled serial, or OpenMP and did not set OMP_NUM_THREADS,
the mpirun command above will launch 4
separate/independent/repetitive/non-communicating processes
doing the same exact thing (and probably overwriting each other's 
output, etc).


Did you compile and link to the OpenMPI libraries, or perhaps to
another MPI implementation?

I hope this helps,
Gus Correa


Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
to get help :)



On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic  wrote:

> Yes, but I was hoping to get. :)
>
>
> On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> If you didn't use Open MPI, then this is the wrong mailing list for you.
>>  :-)
>>
>> (this is the Open MPI users' support mailing list)
>>
>>
>> On Apr 14, 2014, at 2:58 PM, Djordje Romanic  wrote:
>>
>> > I didn't use OpenMPI.
>> >
>> >
>> > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres) <
>> jsquy...@cisco.com> wrote:
>> > This can also happen when you compile your application with one MPI
>> implementation (e.g., Open MPI), but then mistakenly use the "mpirun" (or
>> "mpiexec") from a different MPI implementation (e.g., MPICH).
>> >
>> >
>> > On Apr 14, 2014, at 2:32 PM, Djordje Romanic 
>> wrote:
>> >
>> > > I compiled it with: x86_64 Linux, gfortran compiler with gcc
>> (dmpar). dmpar - distributed memory option.
>> > >
>> > > Attached is the self-generated configuration file. The architecture
>> specification settings start at line 107. I didn't use Open MPI (shared
>> memory option).
>> > >
>> > >
>> > > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell) <
>> dgood...@cisco.com> wrote:
>> > > On Apr 14, 2014, at 12:15 PM, Djordje Romanic 
>> wrote:
>> > >
>> > > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
>> > > > -
>> > > >  starting wrf task0  of1
>> > > >  starting wrf task0  of1
>> > > >  starting wrf task0  of1
>> > > >  starting wrf task0  of1
>> > > > -
>> > > > This indicates that it is not using 4 processors, but 1.
>> > > >
>> > > > Any idea what might be the problem?
>> > >
>> > > It could be that you compiled WRF with a different MPI implementation
>> than you are using to run it (e.g., MPICH vs. Open MPI).
>> > >
>> > > -Dave
>> > >
>> > > ___
>> > > users mailing list
>> > > us...@open-mpi.org
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> > >
>> > > ___
>> > > users mailing list
>> > > us...@open-mpi.org
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com
>> > For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>


Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
Yes, but I was hoping to get. :)


On Mon, Apr 14, 2014 at 3:02 PM, Jeff Squyres (jsquyres)  wrote:

> If you didn't use Open MPI, then this is the wrong mailing list for you.
>  :-)
>
> (this is the Open MPI users' support mailing list)
>
>
> On Apr 14, 2014, at 2:58 PM, Djordje Romanic  wrote:
>
> > I didn't use OpenMPI.
> >
> >
> > On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > This can also happen when you compile your application with one MPI
> implementation (e.g., Open MPI), but then mistakenly use the "mpirun" (or
> "mpiexec") from a different MPI implementation (e.g., MPICH).
> >
> >
> > On Apr 14, 2014, at 2:32 PM, Djordje Romanic  wrote:
> >
> > > I compiled it with: x86_64 Linux, gfortran compiler with gcc
> (dmpar). dmpar - distributed memory option.
> > >
> > > Attached is the self-generated configuration file. The architecture
> specification settings start at line 107. I didn't use Open MPI (shared
> memory option).
> > >
> > >
> > > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell) <
> dgood...@cisco.com> wrote:
> > > On Apr 14, 2014, at 12:15 PM, Djordje Romanic 
> wrote:
> > >
> > > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
> > > > -
> > > >  starting wrf task0  of1
> > > >  starting wrf task0  of1
> > > >  starting wrf task0  of1
> > > >  starting wrf task0  of1
> > > > -
> > > > This indicates that it is not using 4 processors, but 1.
> > > >
> > > > Any idea what might be the problem?
> > >
> > > It could be that you compiled WRF with a different MPI implementation
> than you are using to run it (e.g., MPICH vs. Open MPI).
> > >
> > > -Dave
> > >
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Jeff Squyres (jsquyres)
If you didn't use Open MPI, then this is the wrong mailing list for you.  :-)

(this is the Open MPI users' support mailing list)


On Apr 14, 2014, at 2:58 PM, Djordje Romanic  wrote:

> I didn't use OpenMPI. 
> 
> 
> On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)  
> wrote:
> This can also happen when you compile your application with one MPI 
> implementation (e.g., Open MPI), but then mistakenly use the "mpirun" (or 
> "mpiexec") from a different MPI implementation (e.g., MPICH).
> 
> 
> On Apr 14, 2014, at 2:32 PM, Djordje Romanic  wrote:
> 
> > I compiled it with: x86_64 Linux, gfortran compiler with gcc   (dmpar). 
> > dmpar - distributed memory option.
> >
> > Attached is the self-generated configuration file. The architecture 
> > specification settings start at line 107. I didn't use Open MPI (shared 
> > memory option).
> >
> >
> > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell) 
> >  wrote:
> > On Apr 14, 2014, at 12:15 PM, Djordje Romanic  wrote:
> >
> > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
> > > -
> > >  starting wrf task0  of1
> > >  starting wrf task0  of1
> > >  starting wrf task0  of1
> > >  starting wrf task0  of1
> > > -
> > > This indicates that it is not using 4 processors, but 1.
> > >
> > > Any idea what might be the problem?
> >
> > It could be that you compiled WRF with a different MPI implementation than 
> > you are using to run it (e.g., MPICH vs. Open MPI).
> >
> > -Dave
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
I didn't use OpenMPI.


On Mon, Apr 14, 2014 at 2:37 PM, Jeff Squyres (jsquyres)  wrote:

> This can also happen when you compile your application with one MPI
> implementation (e.g., Open MPI), but then mistakenly use the "mpirun" (or
> "mpiexec") from a different MPI implementation (e.g., MPICH).
>
>
> On Apr 14, 2014, at 2:32 PM, Djordje Romanic  wrote:
>
> > I compiled it with: x86_64 Linux, gfortran compiler with gcc   (dmpar).
> dmpar - distributed memory option.
> >
> > Attached is the self-generated configuration file. The architecture
> specification settings start at line 107. I didn't use Open MPI (shared
> memory option).
> >
> >
> > On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell) <
> dgood...@cisco.com> wrote:
> > On Apr 14, 2014, at 12:15 PM, Djordje Romanic 
> wrote:
> >
> > > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
> > > -
> > >  starting wrf task0  of1
> > >  starting wrf task0  of1
> > >  starting wrf task0  of1
> > >  starting wrf task0  of1
> > > -
> > > This indicates that it is not using 4 processors, but 1.
> > >
> > > Any idea what might be the problem?
> >
> > It could be that you compiled WRF with a different MPI implementation
> than you are using to run it (e.g., MPICH vs. Open MPI).
> >
> > -Dave
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Jeff Squyres (jsquyres)
This can also happen when you compile your application with one MPI 
implementation (e.g., Open MPI), but then mistakenly use the "mpirun" (or 
"mpiexec") from a different MPI implementation (e.g., MPICH).


On Apr 14, 2014, at 2:32 PM, Djordje Romanic  wrote:

> I compiled it with: x86_64 Linux, gfortran compiler with gcc   (dmpar). dmpar 
> - distributed memory option. 
> 
> Attached is the self-generated configuration file. The architecture 
> specification settings start at line 107. I didn't use Open MPI (shared 
> memory option). 
> 
> 
> On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)  
> wrote:
> On Apr 14, 2014, at 12:15 PM, Djordje Romanic  wrote:
> 
> > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
> > -
> >  starting wrf task0  of1
> >  starting wrf task0  of1
> >  starting wrf task0  of1
> >  starting wrf task0  of1
> > -
> > This indicates that it is not using 4 processors, but 1.
> >
> > Any idea what might be the problem?
> 
> It could be that you compiled WRF with a different MPI implementation than 
> you are using to run it (e.g., MPICH vs. Open MPI).
> 
> -Dave
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
I compiled it with: x86_64 Linux, gfortran compiler with gcc   (dmpar).
dmpar - distributed memory option.

Attached is the self-generated configuration file. The architecture
specification settings start at line 107. I didn't use Open MPI (shared
memory option).


On Mon, Apr 14, 2014 at 1:23 PM, Dave Goodell (dgoodell)  wrote:

> On Apr 14, 2014, at 12:15 PM, Djordje Romanic  wrote:
>
> > When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
> > -
> >  starting wrf task0  of1
> >  starting wrf task0  of1
> >  starting wrf task0  of1
> >  starting wrf task0  of1
> > -
> > This indicates that it is not using 4 processors, but 1.
> >
> > Any idea what might be the problem?
>
> It could be that you compiled WRF with a different MPI implementation than
> you are using to run it (e.g., MPICH vs. Open MPI).
>
> -Dave
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


configure.wrf
Description: Binary data


Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Dave Goodell (dgoodell)
On Apr 14, 2014, at 12:15 PM, Djordje Romanic  wrote:

> When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
> -
>  starting wrf task0  of1
>  starting wrf task0  of1
>  starting wrf task0  of1
>  starting wrf task0  of1
> -
> This indicates that it is not using 4 processors, but 1. 
> 
> Any idea what might be the problem? 

It could be that you compiled WRF with a different MPI implementation than you 
are using to run it (e.g., MPICH vs. Open MPI).

-Dave
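
A related run-time check, offered only as a sketch: MPI-3 added MPI_Get_library_version, which recent Open MPI (1.7/1.8) and MPICH (3.x) releases provide, and it reports the library the launched processes are actually linked against. The #ifdef guard is there because older MPI libraries lack the call.

/* libver.c -- illustrative sketch: ask the MPI library at run time what it is. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

#ifdef MPI_MAX_LIBRARY_VERSION_STRING
    if (rank == 0) {
        char version[MPI_MAX_LIBRARY_VERSION_STRING];
        int len;
        MPI_Get_library_version(version, &len);   /* MPI-3 and newer */
        printf("%s\n", version);
    }
#else
    if (rank == 0)
        printf("MPI library predates MPI-3; no MPI_Get_library_version\n");
#endif

    MPI_Finalize();
    return 0;
}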



[OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Djordje Romanic
Hi,

I am trying to run WRF-ARW in parallel. This is the configuration of my system:
-
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):4
On-line CPU(s) list:   0-3
Thread(s) per core:1
Core(s) per socket:4
Socket(s): 1
NUMA node(s):  1
Vendor ID: AuthenticAMD
CPU family:16
Model: 2
Stepping:  3
CPU MHz:   1150.000
BogoMIPS:  4587.84
Virtualization:AMD-V
L1d cache: 64K
L1i cache: 64K
L2 cache:  512K
L3 cache:  2048K
NUMA node0 CPU(s): 0-3
-

When I start wrf with mpirun -np 4 ./wrf.exe, I get this:
-
 starting wrf task0  of1
 starting wrf task0  of1
 starting wrf task0  of1
 starting wrf task0  of1
-
This indicates that it is not using 4 processors, but 1.

Any idea what might be the problem?

Thanks,
Djordje


Re: [OMPI users] Performance issue of mpirun/mpi_init

2014-04-14 Thread Ralph Castain
I'm still poking around, but would appreciate a little more info to ensure I'm 
looking in the right places. How many nodes are you running your application 
across for your verification suite? I suspect it isn't just one :-)


On Apr 10, 2014, at 9:19 PM, Ralph Castain  wrote:

> I shaved about 30% off the time - the patch is waiting for 1.8.1, but you can 
> try it now (see the ticket for the changeset):
> 
> https://svn.open-mpi.org/trac/ompi/ticket/4510#comment:1
> 
> I've added you to the ticket so you can follow what I'm doing. Getting any 
> further improvement will take a little longer due to travel and vacation, but 
> I'll keep poking at it.
> 
> Ralph
> 
> On Apr 10, 2014, at 10:25 AM, Victor Vysotskiy 
>  wrote:
> 
>> Hi again,
>> 
>> > Okay, I'll try to do a little poking around. Meantime, please send along 
>> > the output from >"ompi_info" so we can see how this was configured and 
>> > what built.
>> 
>> enclosed please find the requested information. It would be great to have an 
>> workaround for 1.8 because with 1.8 our  verification suite takes (6.2 hrs) 
>> 2x times longer to complete compared to 1.6.5  (3 hrs).
>> 
>> With best regards,
>> Victor.
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 



Re: [OMPI users] mpirun problem when running on more than three hosts with OpenMPI 1.8

2014-04-14 Thread Ralph Castain

On Apr 13, 2014, at 11:42 AM, Allan Wu  wrote:

> Thanks, Ralph!
> 
> Adding the MCA parameter 'plm_rsh_no_tree_spawn' solves the problem. 
> 
> If I understand correctly, the first layer of daemons covers three nodes, and
> when there are more than three nodes a second layer of daemons is spawned.
> So my problem happens when MPI processes are launched by the second layer
> of daemons, is that correct?

Yes, that is correct

> I think that is very likely; the second layer of daemons may be missing some
> environmental settings.
> It would be really helpful if I could solve the problem, though. Are there any
> documents I can find on the way the daemons work? Do you have any suggestions
> on how I can debug the issue?

Easiest way to debug the issue is to add "-mca plm_base_verbose 5 
--debug-daemons" to your command line. This will show the commands being used 
in the launch, and allow ssh errors to reach the screen.


> 
> Thanks,
> Allan 
> 
> On Sat, Apr 12, 2014 at 9:00 AM,  wrote:
> 
> The problem is with the tree-spawn nature of the rsh/ssh launcher. For 
> scalability, mpirun only launches a first "layer" of daemons. Each of those 
> daemons then launches another layer in a tree-like fanout. The default 
> pattern is such that you first notice it when you have four nodes in your 
> allocation.
> 
> You have two choices:
> 
> * you can just add the MCA param plm_rsh_no_tree_spawn=1 to your 
> environment/cmd line
> 
> * you can resolve the tree spawn issue so that a daemon on one of your nodes 
> is capable of ssh-ing a daemon on another node
> 
> Either way will work.
> Ralph
> 
> 
> On Apr 11, 2014, at 11:17 AM, Allan Wu  wrote:
> 
> > Hello everyone,
> >
> > I am running a simple helloworld program on several nodes using OpenMPI 
> > 1.8. Running commands on single node or small number of nodes are 
> > successful, but when I tried to run the same binary on four different 
> > nodes, problems occurred.
> >
> > I am using 'mpirun' command line like the following:
> > # mpirun --prefix /mnt/embedded_root/openmpi -np 4 --map-by node -hostfile 
> > hostfile ./helloworld
> > And my hostfile looks something like these:
> > 10.0.0.16
> > 10.0.0.17
> > 10.0.0.18
> > 10.0.0.19
> >
> > When executing this command, it will result in an error message "sh: syntax 
> > error: unexpected word", and the program will deadlock. When I added 
> > "--debug-devel" the output is in the attachment "err_msg_0.txt". In the 
> > log, "fpga0" is the hostname of "10.0.0.16" and "fpga1" is for "10.0.0.17" 
> > and so on.
> >
> > However, the weird part is that after I remove one line in the hostfile, 
> > the problem goes away. It does not matter which host I remove; as long as
> > there are fewer than four hosts, the program executes without any problem.
> >
> > I also tried using hostname in the hostfile, as:
> > fpga0
> > fpga1
> > fpga2
> > fpga3
> > And the same problem occurs, and the error message becomes "Host key 
> > verification failed.". I have setup public/private key pairs on all nodes, 
> > and each node can ssh to any node without problems. I also attached the 
> > message of --debug-devel as "err_msg_1.txt".
> >
> > I'm running MPI programs on embedded ARM processors. I have previously 
> > posted questions on cross-compilation on the develop mailing list, which 
> > contains the setup I used. If you need the information please refer to 
> > http://www.open-mpi.org/community/lists/devel/2014/04/14440.php, and the 
> > output of 'ompi_info --all' is also attached with this email.
> >
> > Please let me know if I need to provide more information. Thanks in advance!
> >
> > Regards,
> > --
> > Di Wu (Allan)
> > PhD student, VAST Laboratory,
> > Department of Computer Science, UC Los Angeles
> > Email: al...@cs.ucla.edu
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] can't run mpi-jobs on remote host

2014-04-14 Thread Ralph Castain
I'm confused - how are you building OMPI?? You normally have to do:

1. ./configure --prefix=   This is where you would add --enable-debug

2. make clean all install

You then run your mpirun command as you've done. 


On Apr 14, 2014, at 12:52 AM, Lubrano Francesco 
 wrote:

> I can't set --enable-debug (command not found; I only have --enable-recovery
> in the help output), but the other commands work properly. The output is:
> 
> francesco@linux-hldu:~> mpirun -mca plm_base_verbose 10 --debug-daemons 
> --host Frank@158.110.39.110 hostname
> [linux-hldu.site:02234] mca: base: components_register: registering plm 
> components
> [linux-hldu.site:02234] mca: base: components_register: found loaded 
> component isolated
> [linux-hldu.site:02234] mca: base: components_register: component isolated 
> has no register or open function
> [linux-hldu.site:02234] mca: base: components_register: found loaded 
> component rsh
> [linux-hldu.site:02234] mca: base: components_register: component rsh 
> register function successful
> [linux-hldu.site:02234] mca: base: components_register: found loaded 
> component slurm
> [linux-hldu.site:02234] mca: base: components_register: component slurm 
> register function successful
> [linux-hldu.site:02234] mca: base: components_open: opening plm components
> [linux-hldu.site:02234] mca: base: components_open: found loaded component 
> isolated
> [linux-hldu.site:02234] mca: base: components_open: component isolated open 
> function successful
> [linux-hldu.site:02234] mca: base: components_open: found loaded component rsh
> [linux-hldu.site:02234] mca: base: components_open: component rsh open 
> function successful
> [linux-hldu.site:02234] mca: base: components_open: found loaded component 
> slurm
> [linux-hldu.site:02234] mca: base: components_open: component slurm open 
> function successful
> [linux-hldu.site:02234] mca:base:select: Auto-selecting plm components
> [linux-hldu.site:02234] mca:base:select:(  plm) Querying component [isolated]
> [linux-hldu.site:02234] mca:base:select:(  plm) Query of component [isolated] 
> set priority to 0
> [linux-hldu.site:02234] mca:base:select:(  plm) Querying component [rsh]
> [linux-hldu.site:02234] mca:base:select:(  plm) Query of component [rsh] set 
> priority to 10
> [linux-hldu.site:02234] mca:base:select:(  plm) Querying component [slurm]
> [linux-hldu.site:02234] mca:base:select:(  plm) Skipping component [slurm]. 
> Query failed to return a module
> [linux-hldu.site:02234] mca:base:select:(  plm) Selected component [rsh]
> [linux-hldu.site:02234] mca: base: close: component isolated closed
> [linux-hldu.site:02234] mca: base: close: unloading component isolated
> [linux-hldu.site:02234] mca: base: close: component slurm closed
> [linux-hldu.site:02234] mca: base: close: unloading component slurm
> Daemon was launched on linux-o5sl.site - beginning to initialize
> [linux-o5sl.site:02271] mca: base: components_register: registering plm 
> components
> [linux-o5sl.site:02271] mca: base: components_register: found loaded 
> component rsh
> [linux-o5sl.site:02271] mca: base: components_register: component rsh 
> register function successful
> [linux-o5sl.site:02271] mca: base: components_open: opening plm components
> [linux-o5sl.site:02271] mca: base: components_open: found loaded component rsh
> [linux-o5sl.site:02271] mca: base: components_open: component rsh open 
> function successful
> [linux-o5sl.site:02271] mca:base:select: Auto-selecting plm components
> [linux-o5sl.site:02271] mca:base:select:(  plm) Querying component [rsh]
> [linux-o5sl.site:02271] mca:base:select:(  plm) Query of component [rsh] set 
> priority to 10
> [linux-o5sl.site:02271] mca:base:select:(  plm) Selected component [rsh]
> Daemon [[33734,0],1] checking in as pid 2271 on host linux-o5sl
> [linux-o5sl.site:02271] [[33734,0],1] orted: up and running - waiting for 
> commands!
> [linux-o5sl.site:02271] mca: base: close: component rsh closed
> [linux-o5sl.site:02271] mca: base: close: unloading component rsh
> [linux-hldu.site:02234] [[33734,0],0] orted_cmd: received exit cmd
> [linux-hldu.site:02234] [[33734,0],0] orted_cmd: all routes and children gone 
> - exiting
> [linux-hldu.site:02234] mca: base: close: component rsh closed
> [linux-hldu.site:02234] mca: base: close: unloading component rsh
> 
> Is orted on linux-o5sl receiving any commands?
> Thank you for your cooperation
> 
> (I don't know if it matters, but I have the same problem when I use the first 
> PC as the remote and the second as the local machine.)
> 
> regards
> 
> Francesco
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Different output ubuntu /mac

2014-04-14 Thread Reuti
Am 13.04.2014 um 09:58 schrieb Kamal:

> I have a code which uses both mpicc and mpif90.
> 
> The code reads a file from the directory. It works properly on my desktop 
> (Ubuntu), but when I run the same code on my MacBook I get an fopen failure 
> with errno 2 (file does not exist).

Without more information this looks like an error in the application rather than in 
Open MPI. Worth noting is that home directories on a Mac are under /Users, not 
/home, so a hard-coded /home path from the Ubuntu machine will not exist there.
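As a quick check, something along these lines (a minimal standalone sketch, not taken from the original code; "input.dat" is a placeholder file name) prints the working directory and the exact errno at the point of failure, which usually makes a path problem obvious:

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    FILE *fp = fopen("input.dat", "r");   /* placeholder file name */

    if (fp == NULL) {
        int saved = errno;                /* save errno before making further calls */
        char cwd[PATH_MAX];

        if (getcwd(cwd, sizeof(cwd)) != NULL)
            fprintf(stderr, "working directory: %s\n", cwd);
        fprintf(stderr, "fopen failed: %s (errno %d)\n",
                strerror(saved), saved);
        return 1;
    }

    fclose(fp);
    return 0;
}

Under mpirun a relative file name resolves against each process's working directory, which need not be the same on both machines.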

-- Reuti


> Could someone please tell me what the problem might be?
> 
> 
> Thanks,
> Bow
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] openmpi-1.7.4/1.8.0 problem with intel/mpi_sizeof

2014-04-14 Thread Luis Kornblueh

Dear all,

the attached mympi_test.f90 does not build with the Intel compiler and Open MPI 
1.7.4 (it fails at the link stage); apparently it also fails with 1.8.0.


The Intel Compiler version is 14.0.2.

tmp/ifortjKG1cP.o: In function `MAIN__':
mympi_test.f90:(.text+0x90): undefined reference to `mpi_sizeof0di4_'

This is very similar to an error reported for the older 1.4.x and 1.5.x versions 
with the Portland Group compiler:


https://www.open-mpi.org/community/lists/devel/2010/09/8443.php

Evidently this got fixed in the 1.6.* series, which also works with the Intel 
compiler.


Cheerio,
Luis
--
 \\
 (-0^0-)
--oOO--(_)--OOo-

 Luis Kornblueh   Tel. : +49-40-41173289
 Max-Planck-Institute for Meteorology Fax. : +49-40-41173298
 Bundesstr. 53
 D-20146 Hamburg   Email: luis.kornbl...@zmaw.de
 Federal Republic of Germany
program mympi_test

  use mpi

  implicit none

  integer   :: size, my_pe, comm, ierror, i
  integer   :: status(MPI_STATUS_SIZE)
  INTEGER   :: iig = 0
  INTEGER   :: p_int_byte = 0

  call mpi_init(ierror)

  comm = MPI_COMM_WORLD

  call mpi_comm_size(comm,size,ierror)
  call mpi_comm_rank(comm,my_pe,ierror)
  call mpi_sizeof(iig, p_int_byte, ierror)

  write(*,*) 'MPI_COMM_WORLD: ', comm, MPI_COMM_WORLD
  write(*,*) 'I am PE ', my_pe, p_int_byte

  call mpi_finalize(ierror)

end program mympi_test



[OMPI users] Different output ubuntu /mac

2014-04-14 Thread Kamal

Hi,

I have a code which uses both mpicc and mpif90.

The code reads a file from the directory. It works properly on my desktop 
(Ubuntu), but when I run the same code on my MacBook I get an fopen failure 
with errno 2 (file does not exist).



Could someone please tell me what the problem might be?


Thanks,
Bow


Re: [OMPI users] can't run mpi-jobs on remote host

2014-04-14 Thread Lubrano Francesco
I can't set --enable-debug (command not found: I only have --enable-recovery in 
the help output), but the other commands work properly. The output is:

francesco@linux-hldu:~> mpirun -mca plm_base_verbose 10 --debug-daemons --host 
Frank@158.110.39.110 hostname
[linux-hldu.site:02234] mca: base: components_register: registering plm 
components
[linux-hldu.site:02234] mca: base: components_register: found loaded component 
isolated
[linux-hldu.site:02234] mca: base: components_register: component isolated has 
no register or open function
[linux-hldu.site:02234] mca: base: components_register: found loaded component 
rsh
[linux-hldu.site:02234] mca: base: components_register: component rsh register 
function successful
[linux-hldu.site:02234] mca: base: components_register: found loaded component 
slurm
[linux-hldu.site:02234] mca: base: components_register: component slurm 
register function successful
[linux-hldu.site:02234] mca: base: components_open: opening plm components
[linux-hldu.site:02234] mca: base: components_open: found loaded component 
isolated
[linux-hldu.site:02234] mca: base: components_open: component isolated open 
function successful
[linux-hldu.site:02234] mca: base: components_open: found loaded component rsh
[linux-hldu.site:02234] mca: base: components_open: component rsh open function 
successful
[linux-hldu.site:02234] mca: base: components_open: found loaded component slurm
[linux-hldu.site:02234] mca: base: components_open: component slurm open 
function successful
[linux-hldu.site:02234] mca:base:select: Auto-selecting plm components
[linux-hldu.site:02234] mca:base:select:(  plm) Querying component [isolated]
[linux-hldu.site:02234] mca:base:select:(  plm) Query of component [isolated] 
set priority to 0
[linux-hldu.site:02234] mca:base:select:(  plm) Querying component [rsh]
[linux-hldu.site:02234] mca:base:select:(  plm) Query of component [rsh] set 
priority to 10
[linux-hldu.site:02234] mca:base:select:(  plm) Querying component [slurm]
[linux-hldu.site:02234] mca:base:select:(  plm) Skipping component [slurm]. 
Query failed to return a module
[linux-hldu.site:02234] mca:base:select:(  plm) Selected component [rsh]
[linux-hldu.site:02234] mca: base: close: component isolated closed
[linux-hldu.site:02234] mca: base: close: unloading component isolated
[linux-hldu.site:02234] mca: base: close: component slurm closed
[linux-hldu.site:02234] mca: base: close: unloading component slurm
Daemon was launched on linux-o5sl.site - beginning to initialize
[linux-o5sl.site:02271] mca: base: components_register: registering plm 
components
[linux-o5sl.site:02271] mca: base: components_register: found loaded component 
rsh
[linux-o5sl.site:02271] mca: base: components_register: component rsh register 
function successful
[linux-o5sl.site:02271] mca: base: components_open: opening plm components
[linux-o5sl.site:02271] mca: base: components_open: found loaded component rsh
[linux-o5sl.site:02271] mca: base: components_open: component rsh open function 
successful
[linux-o5sl.site:02271] mca:base:select: Auto-selecting plm components
[linux-o5sl.site:02271] mca:base:select:(  plm) Querying component [rsh]
[linux-o5sl.site:02271] mca:base:select:(  plm) Query of component [rsh] set 
priority to 10
[linux-o5sl.site:02271] mca:base:select:(  plm) Selected component [rsh]
Daemon [[33734,0],1] checking in as pid 2271 on host linux-o5sl
[linux-o5sl.site:02271] [[33734,0],1] orted: up and running - waiting for 
commands!
[linux-o5sl.site:02271] mca: base: close: component rsh closed
[linux-o5sl.site:02271] mca: base: close: unloading component rsh
[linux-hldu.site:02234] [[33734,0],0] orted_cmd: received exit cmd
[linux-hldu.site:02234] [[33734,0],0] orted_cmd: all routes and children gone - 
exiting
[linux-hldu.site:02234] mca: base: close: component rsh closed
[linux-hldu.site:02234] mca: base: close: unloading component rsh

Is orted on linux-o5sl receiving any commands?
Thank you for your cooperation

(I don't know if it matters, but I have the same problem when I use the first 
PC as the remote and the second as the local machine.)

regards

Francesco



Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-14 Thread Daniel Milroy
Hello Jeff,

I will pass your recommendation to the users and apprise you when I receive a 
response.


Thank you,

Dan Milroy

-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres 
(jsquyres)
Sent: Friday, April 11, 2014 6:45 AM
To: Open MPI Users
Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

Sorry for the delay in replying.

Can you try upgrading to Open MPI 1.8, which was released last week?  We 
refreshed the version of ROMIO that is included in OMPI 1.8 vs. 1.6.


On Apr 8, 2014, at 6:49 PM, Daniel Milroy  wrote:

> Hello,
>  
> Recently a couple of our users have experienced difficulties with compute 
> jobs failing with OpenMPI 1.6.4 compiled against GCC 4.7.2, with the nodes 
> running kernel 2.6.32-279.5.2.el6.x86_64.  The error is:
>  
> File locking failed in ADIOI_Set_lock(fd 7,cmd F_SETLKW/7,type 
> F_WRLCK/1,whence 0) with return value  and errno 26.
> - If the file system is NFS, you need to use NFS version 3, ensure that the 
> lockd daemon is running on all the machines, and mount the directory with the 
> 'noac' option (no attribute caching).
> - If the file system is LUSTRE, ensure that the directory is mounted with the 
> 'flock' option.
> ADIOI_Set_lock:: Function not implemented
> ADIOI_Set_lock:offset 0, length 8
>  
> The file system in question is indeed Lustre, and mounting with flock isn't 
> possible in our environment.  I recommended the following changes to the 
> users' code:
>  
> MPI_Info_set(info, "collective_buffering", "true"); MPI_Info_set(info, 
> "romio_lustre_ds_in_coll", "disable"); MPI_Info_set(info, 
> "romio_ds_read", "disable"); MPI_Info_set(info, "romio_ds_write", 
> "disable");
>  
> Which results in the same error as before.  Are there any other MPI options I 
> can set?
>  
>  
> Thank you in advance for any advice,
>  
> Dan Milroy
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
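
For context, a minimal sketch of how hints like the ones quoted above are attached to a file handle; this is illustrative only, not the users' code, and the file name and open flags are placeholders:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    /* Same ROMIO hints as quoted above, one call per hint. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "collective_buffering", "true");
    MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
    MPI_Info_set(info, "romio_ds_read", "disable");
    MPI_Info_set(info, "romio_ds_write", "disable");

    /* The hints only take effect if the info object is passed when the
     * file is opened (or later via MPI_File_set_info). */
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* ... collective I/O would go here ... */

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}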


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users