Re: [OMPI users] UC EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread David Turner

Indeed, my terminology is inexact.  I believe you are correct; our
diskless nodes use tmpfs, not ramdisk.  Thanks for the clarification!

On 11/4/11 11:00 AM, Rushton Martin wrote:

There appears to be some confusion about ramdisks and tmpfs.  A ramdisk
sets aside a fixed amount of memory for its exclusive use, so that a
file being written to ramdisk goes first to the cache, then to ramdisk,
and may exist in both for some time.  tmpfs however opens up the cache
to programs so that a file being written goes to cache and stays there.
The "size" of a tmpfs pseudo-disk is the maximum it can grow to (which
according to the mount man page defaults to 50% of memory).  Hence only
enough memory to hold the data is actually used which ties up with David
Turner's figures.

You can easily tell which method is in use from df.  A traditional
ramdisk will appear as /dev/ramN (N = 0, 1 ...) whereas a tmpfs device
will be a simple name, often tmpfs.  I would guess that the single "-"
in David's df command is precisely this.  On our diskless nodes root
shows as device compute_x86_64, whilst /tmp, /dev/shm and /var/tmp show
as "none".

HTH,

Martin Rushton
HPC System Manager, Weapons Technologies
Tel: 01959 514777, Mobile: 07939 219057
email: jmrush...@qinetiq.com
www.QinetiQ.com
QinetiQ - Delivering customer-focused solutions

Please consider the environment before printing this email.
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Blosch, Edwin L
Sent: 04 November 2011 16:19
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage

OK, I wouldn't have guessed that the space for /tmp isn't actually in
RAM until it's needed.  That's the key piece of knowledge I was missing;
I really appreciate it.  So you can allow /tmp to be reasonably sized,
but if you aren't actually using it, then it doesn't take up 11 GB of
RAM.  And you prevent users from crashing the node by setting mem limit
to 4 GB less than the available memory. Got it.

I agree with your earlier comment:  these are fairly common systems now.
We have program- and owner-specific disks where I work, and after the
program ends, the disks are archived or destroyed.  Before the stateless
configuration option, the entire computer, nodes and switches as well as
disks, were archived or destroyed after each program.  Not too
cost-effective.

Is this a reasonable final summary? :  OpenMPI uses temporary files in
such a way that it is performance-critical that these so-called session
files, used for shared-memory communications, must be "local".  For
state-less clusters, this means the node image must include a /tmp or
/wrk partition, intelligently sized so as not to enable an application
to exhaust the physical memory of the node, and care must be taken not
to mask this in-memory /tmp with an NFS mounted filesystem.  It is not
uncommon for cluster enablers to exclude /tmp from a typical base Linux
filesystem image or mount it over NFS, as a means of providing users
with a larger-sized /tmp that is not limited to a fraction of the node's
physical memory, or to avoid garbage accumulation in /tmp taking up the
physical RAM.  But not having /tmp or mounting it over NFS is not a
viable stateless-node configuration option if you intend to run OpenMPI.
Instead you could have a /bigtmp which is NFS-mounted and a /tmp which
is local, for example. Starting in OpenMPI 1.7.x, shared-memory
communication will no longer go through memory-mapped files, and
vendors/users will no longer need to be vigilant concerning this OpenMPI
performance requirement on stateless node configuration.


Is that a reasonable summary?

If so, would it be helpful to include this as an FAQ entry under General
category?  Or the "shared memory" category?  Or the "troubleshooting"
category?


Thanks



-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Friday, November 04, 2011 1:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage

% df /tmp
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /
% df /
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is consumed by
the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:

Re: [OMPI users] MPI on MacOS Lion help

2011-11-04 Thread Charles Shelor
I just checked my laptop (also running Lion) and I do have gcc at /usr/bin and 
it is linked to /usr/bin/gcc-4.2.  I just checked again on my Mac Pro and there 
is no gcc in /usr/bin although there is a /usr/bin/gcc-3.3, probably left over 
from an earlier OS or Xcode.  I downloaded and installed Xcode 4.2 and that 
returned gcc to /usr/bin.  I successfully ran

Now Xcode 4.2 does not recognize:  #include <omp.h>

I know this is an MPI forum rather than OpenMP, but I was wondering if anyone 
knew how to get OpenMP working in Xcode 4.2.  The statement worked fine in 
Xcode 3.6.x

Thanks again for the help!

Charles


On Nov 4, 2011, at 4:41 PM, Barrett, Brian W wrote:

> I think you have something wrong with your Xcode install; on my Lion
> machine, gcc is installed in /usr/bin as always.  Also, on OS X, you
> should never have to set LD_LIBRARY_PATH.
> 
> Brian
> 
> On 11/4/11 3:36 PM, "Ralph Castain"  wrote:
> 
>> Just glancing at the output, it appears to be finding a different gcc
>> that isn't Lion compatible. I know people have been forgetting to clear
>> out all their old installed software, and so you can pick old things up.
>> 
>> Try setting your path and ld_library_path variables to point at the Xcode
>> gcc.
>> 
>> 
>> On Nov 4, 2011, at 3:08 PM, Charles Shelor wrote:
>> 
>>> I had downloaded and installed OpenMPI on my Mac OS-X 10.6 machine a
>>> few months ago.  I ran the configure and install commands from the FAQ
>>> with no problems.  I recently upgraded to Mac OS-X 10.7 (Lion) and now
>>> when I run mpicc it cannot find the standard C library headers (stdio.h,
>>> stdlib.h…)  I had noticed that I had to modify my path to point to the
>>> Xcode gcc executable, /Developer/usr/bin/gcc, (I believe 10.6 included
>>> gcc in /usr/bin, but 10.7 does not appear to include it there now).  So
>>> I figured that I could just reinstall OpenMPI and it would be able to
>>> locate the new libraries.  However, during the "configure" operation I
>>> received the messages below.  I normally use the Apple Xcode IDE for my
>>> initial code development and then compile using mpicc from a terminal
>>> window.  gcc also fails to find the standard libraries from the command
>>> line.  Is there an environment variable that I should set that tells gcc
>>> where the libraries are located?
>>> 
>>> Thank you!
>>> 
>>> Charles
>>> 
>>> 
>>> 
>>> ============================================================
>>> == Compiler and preprocessor tests
>>> ============================================================
>>> 
>>> *** C compiler and preprocessor
>>> checking for style of include used by make... GNU
>>> checking for gcc... gcc
>>> checking for C compiler default output file name...
>>> configure: error: in `/Downloads/openmpi-1.4.3':
>>> configure: error: C compiler cannot create executables
>>> See `config.log' for more details.
>>> 
>>> 
>>> 
>>> Here is what I think is the relevant output from 'config.log'
>>> 
>>> 
>>> 
>>> configure:6362: $? = 0
>>> configure:6369: gcc -v >&5
>>> Using built-in specs.
>>> Target: i686-apple-darwin10
>>> Configured with: /var/tmp/gcc/gcc-5666.3~6/src/configure
>>> --disable-checking --enable-werror --prefix=/usr --mandir=/share/man
>>> --enable-languages=c,objc,c++,obj-c++
>>> --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib
>>> --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10-
>>> --host=x86_64-apple-darwin10 --target=i686-apple-darwin10
>>> --with-gxx-include-dir=/include/c++/4.2.1
>>> Thread model: posix
>>> gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)
>>> configure:6373: $? = 0
>>> configure:6380: gcc -V >&5
>>> gcc-4.2: argument to `-V' is missing
>>> configure:6384: $? = 1
>>> configure:6407: checking for C compiler default output file name
>>> configure:6429: gcc -DNDEBUG conftest.c  >&5
>>> ld: library not found for -lcrt1.10.6.o
>>> collect2: ld returned 1 exit status
>>> configure:6433: $? = 1
>>> configure:6471: result:
>>> configure: failed program was:
>>> | /* confdefs.h.  */
>>> | #define PACKAGE_NAME "Open MPI"
>>> | #define PACKAGE_TARNAME "openmpi"
>>> | #define PACKAGE_VERSION "1.4.3"
>>> | #define PACKAGE_STRING "Open MPI 1.4.3"
>>> | #define PACKAGE_BUGREPORT "http://www.open-mpi.org/community/help/"
>>> | #define OMPI_MAJOR_VERSION 1
>>> | #define OMPI_MINOR_VERSION 4
>>> | #define OMPI_RELEASE_VERSION 3
>>> | #define OMPI_GREEK_VERSION ""
>>> | #define OMPI_VERSION "3"
>>> | #define OMPI_RELEASE_DATE "Oct 05, 2010"
>>> | #define ORTE_MAJOR_VERSION 1
>>> | #define ORTE_MINOR_VERSION 4
>>> | #define ORTE_RELEASE_VERSION 3
>>> | #define ORTE_GREEK_VERSION ""
>>> | #define ORTE_VERSION "3"
>>> | #define ORTE_RELEASE_DATE "Oct 05, 2010"
>>> | #define OPAL_MAJOR_VERSION 1
>>> | #define OPAL_MINOR_VERSION 4
>>> | #define OPAL_RELEASE_VERSION 3
>>> | #define OPAL_GREEK_VERSION ""
>>> | #define OPAL_VERSION "3"
>>> | #define OPAL_RELEASE_DATE "Oct 05, 2010"
>>> | #define OMPI_EN

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread Blosch, Edwin L
Thanks, Ralph, 

> Having a local /tmp is typically required by Linux for proper operation as 
> the OS itself needs to ensure its usage is protected, as was previously 
> stated and is reiterated in numerous books on managing Linux systems. 

There is a /tmp, but it's not local.  I don't know if that passes muster as a 
proper setup or not.  I'll gift a Linux book for Christmas to the two reputable 
vendors who have configured diskless clusters for us where /tmp was not local, 
and both /usr/tmp and /var/tmp were linked to /tmp. :)

> IMO, discussions of how to handle /tmp on diskless systems goes beyond the 
> bounds of OMPI - it is a Linux system management issue that is covered in 
> depth by material on that subject. Explaining how the session directory is 
> used, and why we now include a test and warning if the session directory is 
> going to land on a networked file system (pretty sure this is now in the 1.5 
> series, but certainly is in the trunk for future releases), would be 
> reasonable.

I know where you're coming from, and I probably didn't title the post correctly 
because I wasn't sure what to ask.  But I definitely saw it, and still see it, 
as an OpenMPI issue.  Having /tmp mounted over NFS on a stateless cluster is 
not a broken configuration, broadly speaking. The vendors made those decisions 
and presumably that's how they do it for other customers as well. There are two 
other (Platform/HP) MPI applications that apparently work normally. But OpenMPI 
doesn't work normally. So it's deficient.

I'll ask the vendor to rebuild the stateless image with a /usr/tmp partition so 
that the end-user application in question can then set orte_tmpdir_base to 
/usr/tmp and all will then work beautifully...
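
(For reference, orte_tmpdir_base can be given on the mpirun command line as
"-mca orte_tmpdir_base /usr/tmp" or through the OMPI_MCA_orte_tmpdir_base
environment variable.  A minimal sketch of the environment-variable form is
below; the /usr/tmp path is just the example from above, and in practice the
variable is exported from the job script rather than set in code, since
directories created by mpirun itself will not see a value set inside the
application.)

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;

    /* Any MCA parameter can also be supplied as an OMPI_MCA_<name>
       environment variable; it must be in the environment before MPI_Init
       reads it.  Setting it here only affects what this process sees --
       the job-script/mpirun form is the reliable one. */
    setenv("OMPI_MCA_orte_tmpdir_base", "/usr/tmp", 1);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d up; OMPI_MCA_orte_tmpdir_base=%s\n",
           rank, getenv("OMPI_MCA_orte_tmpdir_base"));
    MPI_Finalize();
    return 0;
}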

Thanks again,

Ed





Re: [OMPI users] MPI on MacOS Lion help

2011-11-04 Thread Barrett, Brian W
I think you have something wrong with your Xcode install; on my Lion
machine, gcc is installed in /usr/bin as always.  Also, on OS X, you
should never have to set LD_LIBRARY_PATH.

Brian

On 11/4/11 3:36 PM, "Ralph Castain"  wrote:

>Just glancing at the output, it appears to be finding a different gcc
>that isn't Lion compatible. I know people have been forgetting to clear
>out all their old installed software, and so you can pick old things up.
>
>Try setting your path and ld_library_path variables to point at the Xcode
>gcc.
>
>
>On Nov 4, 2011, at 3:08 PM, Charles Shelor wrote:
>
>> I had downloaded and installed OpenMPI on my Mac OS-X 10.6 machine a
>>few months ago.  I ran the configure and install commands from the FAQ
>>with no problems.  I recently upgraded to Mac OS-X 10.7 (Lion) and now
>>when I run mpicc it cannot find the standard C library headers (stdio.h,
>>stdlib.h…)  I had noticed that I had to modify my path to point to the
>>Xcode gcc executable, /Developer/usr/bin/gcc, (I believe 10.6 included
>>gcc in /usr/bin, but 10.7 does not appear to include it there now).  So
>>I figured that I could just reinstall OpenMPI and it would be able to
>>locate the new libraries.  However, during the "configure" operation I
>>received the messages below.  I normally use the Apple Xcode IDE for my
>>initial code development and then compile using mpicc from a terminal
>>window.  gcc also fails to find the standard libraries from the command
>>line.  Is there an environment variable that I should set that tells gcc
>>where the libraries are located?
>> 
>> Thank you!
>> 
>> Charles
>> 
>> 
>> 
>> ============================================================
>> == Compiler and preprocessor tests
>> ============================================================
>> 
>> *** C compiler and preprocessor
>> checking for style of include used by make... GNU
>> checking for gcc... gcc
>> checking for C compiler default output file name...
>> configure: error: in `/Downloads/openmpi-1.4.3':
>> configure: error: C compiler cannot create executables
>> See `config.log' for more details.
>> 
>> 
>> 
>> Here is what I think is the relevant output from 'config.log'
>> 
>> 
>> 
>> configure:6362: $? = 0
>> configure:6369: gcc -v >&5
>> Using built-in specs.
>> Target: i686-apple-darwin10
>> Configured with: /var/tmp/gcc/gcc-5666.3~6/src/configure
>>--disable-checking --enable-werror --prefix=/usr --mandir=/share/man
>>--enable-languages=c,objc,c++,obj-c++
>>--program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib
>>--build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10-
>>--host=x86_64-apple-darwin10 --target=i686-apple-darwin10
>>--with-gxx-include-dir=/include/c++/4.2.1
>> Thread model: posix
>> gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)
>> configure:6373: $? = 0
>> configure:6380: gcc -V >&5
>> gcc-4.2: argument to `-V' is missing
>> configure:6384: $? = 1
>> configure:6407: checking for C compiler default output file name
>> configure:6429: gcc -DNDEBUG conftest.c  >&5
>> ld: library not found for -lcrt1.10.6.o
>> collect2: ld returned 1 exit status
>> configure:6433: $? = 1
>> configure:6471: result:
>> configure: failed program was:
>> | /* confdefs.h.  */
>> | #define PACKAGE_NAME "Open MPI"
>> | #define PACKAGE_TARNAME "openmpi"
>> | #define PACKAGE_VERSION "1.4.3"
>> | #define PACKAGE_STRING "Open MPI 1.4.3"
>> | #define PACKAGE_BUGREPORT "http://www.open-mpi.org/community/help/"
>> | #define OMPI_MAJOR_VERSION 1
>> | #define OMPI_MINOR_VERSION 4
>> | #define OMPI_RELEASE_VERSION 3
>> | #define OMPI_GREEK_VERSION ""
>> | #define OMPI_VERSION "3"
>> | #define OMPI_RELEASE_DATE "Oct 05, 2010"
>> | #define ORTE_MAJOR_VERSION 1
>> | #define ORTE_MINOR_VERSION 4
>> | #define ORTE_RELEASE_VERSION 3
>> | #define ORTE_GREEK_VERSION ""
>> | #define ORTE_VERSION "3"
>> | #define ORTE_RELEASE_DATE "Oct 05, 2010"
>> | #define OPAL_MAJOR_VERSION 1
>> | #define OPAL_MINOR_VERSION 4
>> | #define OPAL_RELEASE_VERSION 3
>> | #define OPAL_GREEK_VERSION ""
>> | #define OPAL_VERSION "3"
>> | #define OPAL_RELEASE_DATE "Oct 05, 2010"
>> | #define OMPI_ENABLE_PROGRESS_THREADS 0
>> | #define OMPI_ARCH "x86_64-apple-darwin11.2.0"
>> | #define OMPI_ENABLE_MEM_DEBUG 0
>> | #define OMPI_ENABLE_MEM_PROFILE 0
>> | #define OMPI_ENABLE_DEBUG 0
>> | #define OMPI_GROUP_SPARSE 0
>> | #define OMPI_WANT_MPI_CXX_SEEK 1
>> | #define MPI_PARAM_CHECK ompi_mpi_param_check
>> | #define OMPI_WANT_PRETTY_PRINT_STACKTRACE 1
>> | #define OMPI_WANT_PERUSE 0
>> | #define OMPI_ENABLE_PTY_SUPPORT 1
>> | #define OMPI_ENABLE_HETEROGENEOUS_SUPPORT 0
>> | #define OPAL_ENABLE_TRACE 0
>> | #define ORTE_DISABLE_FULL_SUPPORT 0
>> | #define OPAL_ENABLE_FT 0
>> | #define OPAL_ENABLE_FT_CR 0
>> | #define OMPI_WANT_HOME_CONFIG_FILES 1
>> | #define OPAL_ENABLE_IPV6 1
>> | #define ORTE_WANT_ORTERUN_PREFIX_BY_DEFAULT 0
>> | #define OPAL_PACKAGE_STRING "Open MPI charles@Charles.local
>>Distribution"
>>

Re: [OMPI users] MPI on MacOS Lion help

2011-11-04 Thread Ralph Castain
Just glancing at the output, it appears to be finding a different gcc that 
isn't Lion compatible. I know people have been forgetting to clear out all 
their old installed software, and so you can pick old things up.

Try setting your path and ld_library_path variables to point at the Xcode gcc.


On Nov 4, 2011, at 3:08 PM, Charles Shelor wrote:

> I had downloaded and installed OpenMPI on my Mac OS-X 10.6 machine a few 
> months ago.  I ran the configure and install commands from the FAQ with no 
> problems.  I recently upgraded to Mac OS-X 10.7 (Lion) and now when I run 
> mpicc it cannot find the standard C library headers (stdio.h, stdlib.h…)  I 
> had noticed that I had to modify my path to point to the Xcode gcc 
> executable, /Developer/usr/bin/gcc, (I believe 10.6 included gcc in /usr/bin, 
> but 10.7 does not appear to include it there now).  So I figured that I could 
> just reinstall OpenMPI and it would be able to locate the new libraries.  
> However, during the "configure" operation I received the messages below.  I 
> normally use the Apple Xcode IDE for my initial code development and then 
> compile using mpicc from a terminal window.  gcc also fails to find the 
> standard libraries from the command line.  Is there an environment variable 
> that I should set that tells gcc where the libraries are located?
> 
> Thank you!
> 
> Charles
> 
> 
> 
> == Compiler and preprocessor tests
> 
> 
> *** C compiler and preprocessor
> checking for style of include used by make... GNU
> checking for gcc... gcc
> checking for C compiler default output file name... 
> configure: error: in `/Downloads/openmpi-1.4.3':
> configure: error: C compiler cannot create executables
> See `config.log' for more details.
> 
> 
> 
> Here is what I think is the relevant output from 'config.log'
> 
> 
> 
> configure:6362: $? = 0
> configure:6369: gcc -v >&5
> Using built-in specs.
> Target: i686-apple-darwin10
> Configured with: /var/tmp/gcc/gcc-5666.3~6/src/configure --disable-checking 
> --enable-werror --prefix=/usr --mandir=/share/man 
> --enable-languages=c,objc,c++,obj-c++ 
> --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib 
> --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10- 
> --host=x86_64-apple-darwin10 --target=i686-apple-darwin10 
> --with-gxx-include-dir=/include/c++/4.2.1
> Thread model: posix
> gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)
> configure:6373: $? = 0
> configure:6380: gcc -V >&5
> gcc-4.2: argument to `-V' is missing
> configure:6384: $? = 1
> configure:6407: checking for C compiler default output file name
> configure:6429: gcc -DNDEBUG conftest.c  >&5
> ld: library not found for -lcrt1.10.6.o
> collect2: ld returned 1 exit status
> configure:6433: $? = 1
> configure:6471: result: 
> configure: failed program was:
> | /* confdefs.h.  */
> | #define PACKAGE_NAME "Open MPI"
> | #define PACKAGE_TARNAME "openmpi"
> | #define PACKAGE_VERSION "1.4.3"
> | #define PACKAGE_STRING "Open MPI 1.4.3"
> | #define PACKAGE_BUGREPORT "http://www.open-mpi.org/community/help/"
> | #define OMPI_MAJOR_VERSION 1
> | #define OMPI_MINOR_VERSION 4
> | #define OMPI_RELEASE_VERSION 3
> | #define OMPI_GREEK_VERSION ""
> | #define OMPI_VERSION "3"
> | #define OMPI_RELEASE_DATE "Oct 05, 2010"
> | #define ORTE_MAJOR_VERSION 1
> | #define ORTE_MINOR_VERSION 4
> | #define ORTE_RELEASE_VERSION 3
> | #define ORTE_GREEK_VERSION ""
> | #define ORTE_VERSION "3"
> | #define ORTE_RELEASE_DATE "Oct 05, 2010"
> | #define OPAL_MAJOR_VERSION 1
> | #define OPAL_MINOR_VERSION 4
> | #define OPAL_RELEASE_VERSION 3
> | #define OPAL_GREEK_VERSION ""
> | #define OPAL_VERSION "3"
> | #define OPAL_RELEASE_DATE "Oct 05, 2010"
> | #define OMPI_ENABLE_PROGRESS_THREADS 0
> | #define OMPI_ARCH "x86_64-apple-darwin11.2.0"
> | #define OMPI_ENABLE_MEM_DEBUG 0
> | #define OMPI_ENABLE_MEM_PROFILE 0
> | #define OMPI_ENABLE_DEBUG 0
> | #define OMPI_GROUP_SPARSE 0
> | #define OMPI_WANT_MPI_CXX_SEEK 1
> | #define MPI_PARAM_CHECK ompi_mpi_param_check
> | #define OMPI_WANT_PRETTY_PRINT_STACKTRACE 1
> | #define OMPI_WANT_PERUSE 0
> | #define OMPI_ENABLE_PTY_SUPPORT 1
> | #define OMPI_ENABLE_HETEROGENEOUS_SUPPORT 0
> | #define OPAL_ENABLE_TRACE 0
> | #define ORTE_DISABLE_FULL_SUPPORT 0
> | #define OPAL_ENABLE_FT 0
> | #define OPAL_ENABLE_FT_CR 0
> | #define OMPI_WANT_HOME_CONFIG_FILES 1
> | #define OPAL_ENABLE_IPV6 1
> | #define ORTE_WANT_ORTERUN_PREFIX_BY_DEFAULT 0
> | #define OPAL_PACKAGE_STRING "Open MPI charles@Charles.local Distribution"
> | #define OPAL_IDENT_STRING "1.4.3"
> | #define OMPI_OPENIB_PAD_HDR 0
> | /* end confdefs.h.  */
> | 
> | int
> | main ()
> | {
> | 
> |   ;
> |   return 0;
> | }
> configure:6477: error: in `/Downloads/openmpi-1.4.3':
> configure:6480: error: C compiler cannot create executables
> See `config.log' for more details.
> 
> 

[OMPI users] MPI on MacOS Lion help

2011-11-04 Thread Charles Shelor
I had downloaded and installed OpenMPI on my Mac OS-X 10.6 machine a few months 
ago.  I ran the configure and install commands from the FAQ with no problems.  
I recently upgraded to Mac OS-X 10.7 (Lion) and now when I run mpicc it cannot 
find the standard C library headers (stdio.h, stdlib.h…)  I had noticed that I 
had to modify my path to point to the Xcode gcc executable, 
/Developer/usr/bin/gcc, (I believe 10.6 included gcc in /usr/bin, but 10.7 does 
not appear to include it there now).  So I figured that I could just reinstall 
OpenMPI and it would be able to locate the new libraries.  However, during the 
"configure" operation I received the messages below.  I normally use the Apple 
Xcode IDE for my initial code development and then compile using mpicc from a 
terminal window.  gcc also fails to find the standard libraries from the 
command line.  Is there an environment variable that I should set that tells 
gcc where the libraries are located?

Thank you!

Charles



== Compiler and preprocessor tests


*** C compiler and preprocessor
checking for style of include used by make... GNU
checking for gcc... gcc
checking for C compiler default output file name... 
configure: error: in `/Downloads/openmpi-1.4.3':
configure: error: C compiler cannot create executables
See `config.log' for more details.



Here is what I think is the relevant output from 'config.log'



configure:6362: $? = 0
configure:6369: gcc -v >&5
Using built-in specs.
Target: i686-apple-darwin10
Configured with: /var/tmp/gcc/gcc-5666.3~6/src/configure --disable-checking 
--enable-werror --prefix=/usr --mandir=/share/man 
--enable-languages=c,objc,c++,obj-c++ 
--program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib 
--build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10- 
--host=x86_64-apple-darwin10 --target=i686-apple-darwin10 
--with-gxx-include-dir=/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)
configure:6373: $? = 0
configure:6380: gcc -V >&5
gcc-4.2: argument to `-V' is missing
configure:6384: $? = 1
configure:6407: checking for C compiler default output file name
configure:6429: gcc -DNDEBUG conftest.c  >&5
ld: library not found for -lcrt1.10.6.o
collect2: ld returned 1 exit status
configure:6433: $? = 1
configure:6471: result: 
configure: failed program was:
| /* confdefs.h.  */
| #define PACKAGE_NAME "Open MPI"
| #define PACKAGE_TARNAME "openmpi"
| #define PACKAGE_VERSION "1.4.3"
| #define PACKAGE_STRING "Open MPI 1.4.3"
| #define PACKAGE_BUGREPORT "http://www.open-mpi.org/community/help/"
| #define OMPI_MAJOR_VERSION 1
| #define OMPI_MINOR_VERSION 4
| #define OMPI_RELEASE_VERSION 3
| #define OMPI_GREEK_VERSION ""
| #define OMPI_VERSION "3"
| #define OMPI_RELEASE_DATE "Oct 05, 2010"
| #define ORTE_MAJOR_VERSION 1
| #define ORTE_MINOR_VERSION 4
| #define ORTE_RELEASE_VERSION 3
| #define ORTE_GREEK_VERSION ""
| #define ORTE_VERSION "3"
| #define ORTE_RELEASE_DATE "Oct 05, 2010"
| #define OPAL_MAJOR_VERSION 1
| #define OPAL_MINOR_VERSION 4
| #define OPAL_RELEASE_VERSION 3
| #define OPAL_GREEK_VERSION ""
| #define OPAL_VERSION "3"
| #define OPAL_RELEASE_DATE "Oct 05, 2010"
| #define OMPI_ENABLE_PROGRESS_THREADS 0
| #define OMPI_ARCH "x86_64-apple-darwin11.2.0"
| #define OMPI_ENABLE_MEM_DEBUG 0
| #define OMPI_ENABLE_MEM_PROFILE 0
| #define OMPI_ENABLE_DEBUG 0
| #define OMPI_GROUP_SPARSE 0
| #define OMPI_WANT_MPI_CXX_SEEK 1
| #define MPI_PARAM_CHECK ompi_mpi_param_check
| #define OMPI_WANT_PRETTY_PRINT_STACKTRACE 1
| #define OMPI_WANT_PERUSE 0
| #define OMPI_ENABLE_PTY_SUPPORT 1
| #define OMPI_ENABLE_HETEROGENEOUS_SUPPORT 0
| #define OPAL_ENABLE_TRACE 0
| #define ORTE_DISABLE_FULL_SUPPORT 0
| #define OPAL_ENABLE_FT 0
| #define OPAL_ENABLE_FT_CR 0
| #define OMPI_WANT_HOME_CONFIG_FILES 1
| #define OPAL_ENABLE_IPV6 1
| #define ORTE_WANT_ORTERUN_PREFIX_BY_DEFAULT 0
| #define OPAL_PACKAGE_STRING "Open MPI charles@Charles.local Distribution"
| #define OPAL_IDENT_STRING "1.4.3"
| #define OMPI_OPENIB_PAD_HDR 0
| /* end confdefs.h.  */
| 
| int
| main ()
| {
| 
|   ;
|   return 0;
| }
configure:6477: error: in `/Downloads/openmpi-1.4.3':
configure:6480: error: C compiler cannot create executables
See `config.log' for more details.




Re: [OMPI users] UC EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread Rushton Martin
There appears to be some confusion about ramdisks and tmpfs.  A ramdisk
sets aside a fixed amount of memory for its exclusive use, so that a
file being written to ramdisk goes first to the cache, then to ramdisk,
and may exist in both for some time.  tmpfs however opens up the cache
to programs so that a file being written goes to cache and stays there.
The "size" of a tmpfs pseudo-disk is the maximum it can grow to (which
according to the mount man page defaults to 50% of memory).  Hence only
enough memory to hold the data is actually used which ties up with David
Turner's figures.

You can easily tell which method is in use from df.  A traditional
ramdisk will appear as /dev/ramN (N = 0, 1 ...) whereas a tmpfs device
will be a simple name, often tmpfs.  I would guess that the single "-"
in David's df command is precisely this.  On our diskless nodes root
shows as device compute_x86_64, whilst /tmp, /dev/shm and /var/tmp show
as "none".

HTH,

Martin Rushton
HPC System Manager, Weapons Technologies
Tel: 01959 514777, Mobile: 07939 219057
email: jmrush...@qinetiq.com
www.QinetiQ.com
QinetiQ - Delivering customer-focused solutions

Please consider the environment before printing this email.
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Blosch, Edwin L
Sent: 04 November 2011 16:19
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage

OK, I wouldn't have guessed that the space for /tmp isn't actually in
RAM until it's needed.  That's the key piece of knowledge I was missing;
I really appreciate it.  So you can allow /tmp to be reasonably sized,
but if you aren't actually using it, then it doesn't take up 11 GB of
RAM.  And you prevent users from crashing the node by setting mem limit
to 4 GB less than the available memory. Got it.

I agree with your earlier comment:  these are fairly common systems now.
We have program- and owner-specific disks where I work, and after the
program ends, the disks are archived or destroyed.  Before the stateless
configuration option, the entire computer, nodes and switches as well as
disks, were archived or destroyed after each program.  Not too
cost-effective.

Is this a reasonable final summary? :  OpenMPI uses temporary files in
such a way that it is performance-critical that these so-called session
files, used for shared-memory communications, must be "local".  For
state-less clusters, this means the node image must include a /tmp or
/wrk partition, intelligently sized so as not to enable an application
to exhaust the physical memory of the node, and care must be taken not
to mask this in-memory /tmp with an NFS mounted filesystem.  It is not
uncommon for cluster enablers to exclude /tmp from a typical base Linux
filesystem image or mount it over NFS, as a means of providing users
with a larger-sized /tmp that is not limited to a fraction of the node's
physical memory, or to avoid garbage accumulation in /tmp taking up the
physical RAM.  But not having /tmp or mounting it over NFS is not a
viable stateless-node configuration option if you intend to run OpenMPI.
Instead you could have a /bigtmp which is NFS-mounted and a /tmp which
is local, for example. Starting in OpenMPI 1.7.x, shared-memory
communication will no longer go through memory-mapped files, and
vendors/users will no longer need to be vigilant concerning this OpenMPI
performance requirement on stateless node configuration. 


Is that a reasonable summary?

If so, would it be helpful to include this as an FAQ entry under General
category?  Or the "shared memory" category?  Or the "troubleshooting"
category?


Thanks



-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Friday, November 04, 2011 1:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp for OpenMPI usage

% df /tmp
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /
% df /
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is consumed by
the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:
> Thanks very much, exactly what I wanted to hear. How big is /tmp?
>
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread David Turner

I should have been more careful.  When we first started using OpenMPI,
version 1.4.1, there was a bug that caused session directories to be
left behind.  This was fixed in subsequent releases (and via a patch
for 1.4.1).

Our batch epilogue still removes everything in /tmp that belongs to the
owner of the batch job.  It is invoked after the user's application has
terminated, so the session directories are already gone by that time.

Sorry for the confusion!

On 11/4/11 3:43 AM, TERRY DONTJE wrote:

David, are you saying your jobs consistently leave behind session files
after the job exits? It really shouldn't even in the case when a job
aborts, I thought, mpirun took great pains to cleanup after itself. Can
you tell us what version of OMPI you are running with? I think I could
see kill -9 of mpirun and processes below would cause turds to be left
behind.

--td

On 11/4/2011 2:37 AM, David Turner wrote:

% df /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
- 12330084 822848 11507236 7% /
% df /
Filesystem 1K-blocks Used Available Use% Mounted on
- 12330084 822848 11507236 7% /

That works out to 11GB. But...

The compute nodes have 24GB. Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that. I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:

Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node
/tmp
for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway. On our cluster,
all the compute nodes are completely diskless. The root file system,
including /tmp, resides in memory (ramdisk). OpenMPI puts these
session directories therein. All our jobs run through a batch
system (torque). At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:

Thanks for the help. A couple follow-up-questions, maybe this starts to
go outside OpenMPI:

What's wrong with using /dev/shm? I think you said earlier in this
thread that this was not a safe place.

If the NFS-mount point is moved from /tmp to /work, would a /tmp magically
appear in the filesystem for a stateless node? How big would it be, given
that there is no local disk, right? That may be something I have to ask the
vendor, which I've tried, but they don't quite seem to get the question.


Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Ralph Castain

Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less
node /tmp

for OpenMPI usage



On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:


I might be missing something here. Is there a side-effect or performance
loss if you don't use the sm btl? Why would it exist if there is a wholly
equivalent alternative? What happens to traffic that is intended for
another process on the same node?


There is a definite performance impact, and we wouldn't recommend doing
what Eugene suggested if you care about performance.


The correct solution here is get your sys admin to make /tmp local. Making
/tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux
world - it should never be done, for the reasons stated by Jeff.





Thanks


-Original Message-
From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


Right. Actually "--mca btl ^sm". (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a
session file on /tmp, which is NFS-mounted and thus not a good choice.


Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up stat

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread Ralph Castain

On Nov 4, 2011, at 10:19 AM, Blosch, Edwin L wrote:

> OK, I wouldn't have guessed that the space for /tmp isn't actually in RAM 
> until it's needed.  That's the key piece of knowledge I was missing; I really 
> appreciate it.  So you can allow /tmp to be reasonably sized, but if you 
> aren't actually using it, then it doesn't take up 11 GB of RAM.  And you 
> prevent users from crashing the node by setting mem limit to 4 GB less than 
> the available memory. Got it.
> 
> I agree with your earlier comment:  these are fairly common systems now.  We 
> have program- and owner-specific disks where I work, and after the program 
> ends, the disks are archived or destroyed.  Before the stateless 
> configuration option, the entire computer, nodes and switches as well as 
> disks, were archived or destroyed after each program.  Not too cost-effective.
> 
> Is this a reasonable final summary? :  OpenMPI uses temporary files in such a 
> way that it is performance-critical that these so-called session files, used 
> for shared-memory communications, must be "local".  For state-less clusters, 
> this means the node image must include a /tmp or /wrk partition, 
> intelligently sized so as not to enable an application to exhaust the 
> physical memory of the node, and care must be taken not to mask this 
> in-memory /tmp with an NFS mounted filesystem.  


> It is not uncommon for cluster enablers to exclude /tmp from a typical base 
> Linux filesystem image or mount it over NFS, as a means of providing users 
> with a larger-sized /tmp that is not limited to a fraction of the node's 
> physical memory, or to avoid garbage accumulation in /tmp taking up the 
> physical RAM.

Not sure I agree with this statement, but it is irrelevant here.

>  But not having /tmp or mounting it over NFS is not a viable stateless-node 
> configuration option if you intend to run OpenMPI. Instead you could have a 
> /bigtmp which is NFS-mounted and a /tmp which
> is local, for example. Starting in OpenMPI 1.7.x, shared-memory 
> communication will no longer go through memory-mapped files, and 
> vendors/users will no longer need to be vigilant concerning this OpenMPI 
> performance requirement on stateless node configuration. 

Having a local /tmp is typically required by Linux for proper operation as the 
OS itself needs to ensure its usage is protected, as was previously stated and 
is reiterated in numerous books on managing Linux systems. The "usual" way of 
dealing with what you describe is for sys admins to add a /usr/tmp space which 
is solely intended for use by users, with the understanding that they may stomp 
on each other if they don't take care in naming their files. This is why we 
provided the ability to redirect the placement of the session directories.

> 
> 
> Is that a reasonable summary?
> 
> If so, would it be helpful to include this as an FAQ entry under General 
> category?  Or the "shared memory" category?  Or the "troubleshooting" 
> category?

IMO, discussions of how to handle /tmp on diskless systems goes beyond the 
bounds of OMPI - it is a Linux system management issue that is covered in depth 
by material on that subject. Explaining how the session directory is used, and 
why we now include a test and warning if the session directory is going to land 
on a networked file system (pretty sure this is now in the 1.5 series, but 
certainly is in the trunk for future releases), would be reasonable.

> 
> 
> Thanks
> 
> 
> 
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of David Turner
> Sent: Friday, November 04, 2011 1:38 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
> for OpenMPI usage
> 
> % df /tmp
> Filesystem   1K-blocks    Used Available Use% Mounted on
> -             12330084  822848  11507236   7% /
> % df /
> Filesystem   1K-blocks    Used Available Use% Mounted on
> -             12330084  822848  11507236   7% /
> 
> That works out to 11GB.  But...
> 
> The compute nodes have 24GB.  Freshly booted, about 3.2GB is
> consumed by the kernel, various services, and the root file system.
> At this time, usage of /tmp is essentially nil.
> 
> We set user memory limits to 20GB.
> 
> I would imagine that the size of the session directories depends on a
> number of factors; perhaps the developers can comment on that.  I have
> only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.
> 
> As long as they're removed after each job, they don't really compete
> with the application for available memory.
> 
> On 11/3/11 8:40 PM, Ed Blosch wrote:
>> Thanks very much, exactly what I wanted to hear. How big is /tmp?
>> 
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>> Behalf Of David Turner
>> Sent: Thursday, November 03, 2011 6:36 PM
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI use

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread Blosch, Edwin L
OK, I wouldn't have guessed that the space for /tmp isn't actually in RAM until 
it's needed.  That's the key piece of knowledge I was missing; I really 
appreciate it.  So you can allow /tmp to be reasonably sized, but if you aren't 
actually using it, then it doesn't take up 11 GB of RAM.  And you prevent users 
from crashing the node by setting mem limit to 4 GB less than the available 
memory. Got it.

I agree with your earlier comment:  these are fairly common systems now.  We 
have program- and owner-specific disks where I work, and after the program 
ends, the disks are archived or destroyed.  Before the stateless configuration 
option, the entire computer, nodes and switches as well as disks, were archived 
or destroyed after each program.  Not too cost-effective.

Is this a reasonable final summary? :  OpenMPI uses temporary files in such a 
way that it is performance-critical that these so-called session files, used 
for shared-memory communications, must be "local".  For state-less clusters, 
this means the node image must include a /tmp or /wrk partition, intelligently 
sized so as not to enable an application to exhaust the physical memory of the 
node, and care must be taken not to mask this in-memory /tmp with an NFS 
mounted filesystem.  It is not uncommon for cluster enablers to exclude /tmp 
from a typical base Linux filesystem image or mount it over NFS, as a means of 
providing users with a larger-sized /tmp that is not limited to a fraction of 
the node's physical memory, or to avoid garbage accumulation in /tmp taking up 
the physical RAM.  But not having /tmp or mounting it over NFS is not a viable 
stateless-node configuration option if you intend to run OpenMPI. Instead you 
could have a /bigtmp which is NFS-mounted and a /tmp which is local, for 
example. Starting in OpenMPI 1.7.x, shared-memory communication will no longer 
go through memory-mapped files, and vendors/users will no longer need to be 
vigilant concerning this OpenMPI performance requirement on stateless node 
configuration. 


Is that a reasonable summary?

If so, would it be helpful to include this as an FAQ entry under General 
category?  Or the "shared memory" category?  Or the "troubleshooting" category?


Thanks



-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of David Turner
Sent: Friday, November 04, 2011 1:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for 
OpenMPI usage

% df /tmp
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /
% df /
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:
> Thanks very much, exactly what I wanted to hear. How big is /tmp?
>
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of David Turner
> Sent: Thursday, November 03, 2011 6:36 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp
> for OpenMPI usage
>
> I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
> all the compute nodes are completely diskless.  The root file system,
> including /tmp, resides in memory (ramdisk).  OpenMPI puts these
> session directories therein.  All our jobs run through a batch
> system (torque).  At the conclusion of each batch job, an epilogue
> process runs that removes all files belonging to the owner of the
> current batch job from /tmp (and also looks for and kills orphan
> processes belonging to the user).  This epilogue had to be written
> by our systems staff.
>
> I believe this is a fairly common configuration for diskless
> clusters.
>
> On 11/3/11 4:09 PM, Blosch, Edwin L wrote:
>> Thanks for the help.  A couple follow-up-questions, maybe this starts to
> go outside OpenMPI:
>>
>> What's wrong with using /dev/shm?  I think you said earlier in this thread
> that this was not a safe place.
>>
>> If the NFS-mount point is moved from /tmp to /work, would a /tmp magically
> appear in the filesystem for a stateless node?  How big would it be, given
> that there is no local disk, right?  That may be something I have to ask the
> vendor, which I've tried, but they don't quite seem to get the question.
>>
>> Thanks
>>

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread TERRY DONTJE
I wasn't advocating against having the epilogue per se, but was more 
curious if there was some issue going on that we did not know about.  If 
there isn't an issue then great.


--td

On 11/4/2011 9:59 AM, Ralph Castain wrote:
That isn't the situation, Terry. We had problems with early OMPI 
releases, particularly the 1.2 series. In response, the labs wrote an 
epilogue to ensure that the session directories were removed. 
Executing the epilogue is now standard operating procedure, even 
though our more recent releases do a much better job of cleanup.


Frankly, it's a good idea anyway. It hurts nothing, takes milliseconds 
to do, and guarantees nothing got left behind (e.g., if someone was 
using a debug version of OMPI and directed opal_output to a file).


On Nov 4, 2011, at 4:43 AM, TERRY DONTJE wrote:

David, are you saying your jobs consistently leave behind session 
files after the job exits?  It really shouldn't even in the case when 
a job aborts, I thought, mpirun took great pains to cleanup after 
itself.  Can you tell us what version of OMPI you are running 
with?  I think I could see kill -9 of mpirun and processes below 
would cause turds to be left behind.


--td

On 11/4/2011 2:37 AM, David Turner wrote:

% df /tmp
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /
% df /
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:

Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org 
[mailto:users-boun...@open-mpi.org] On

Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less 
node /tmp

for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:
Thanks for the help.  A couple follow-up-questions, maybe this starts to
go outside OpenMPI:

What's wrong with using /dev/shm?  I think you said earlier in this thread
that this was not a safe place.

If the NFS-mount point is moved from /tmp to /work, would a /tmp magically
appear in the filesystem for a stateless node?  How big would it be, given
that there is no local disk, right?  That may be something I have to ask the
vendor, which I've tried, but they don't quite seem to get the question.


Thanks




-Original Message-
From: users-boun...@open-mpi.org 
[mailto:users-boun...@open-mpi.org] On

Behalf Of Ralph Castain

Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less 
node /tmp

for OpenMPI usage



On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:

I might be missing something here. Is there a side-effect or performance
loss if you don't use the sm btl?  Why would it exist if there is a wholly
equivalent alternative?  What happens to traffic that is intended for
another process on the same node?


There is a definite performance impact, and we wouldn't recommend doing
what Eugene suggested if you care about performance.


The correct solution here is get your sys admin to make /tmp local. Making
/tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux
world - it should never be done, for the reasons stated by Jeff.





Thanks


-Original Message-
From: users-boun...@open-mpi.org 
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less 
node

/tmp for OpenMPI usage


Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrot

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread Ralph Castain
That isn't the situation, Terry. We had problems with early OMPI releases, 
particularly the 1.2 series. In response, the labs wrote an epilogue to ensure 
that the session directories were removed. Executing the epilogue is now 
standard operating procedure, even though our more recent releases do a much 
better job of cleanup.

Frankly, it's a good idea anyway. It hurts nothing, takes milliseconds to do, 
and guarantees nothing got left behind (e.g., if someone was using a debug 
version of OMPI and directed opal_output to a file).

On Nov 4, 2011, at 4:43 AM, TERRY DONTJE wrote:

> David, are you saying your jobs consistently leave behind session files after 
> the job exits?  It really shouldn't even in the case when a job aborts, I 
> thought, mpirun took great pains to cleanup after itself.  Can you tell us 
> what version of OMPI you are running with?  I think I could see kill -9 of 
> mpirun and processes below would cause turds to be left behind.
> 
> --td
> 
> On 11/4/2011 2:37 AM, David Turner wrote:
>> 
>> % df /tmp
>> Filesystem   1K-blocks    Used Available Use% Mounted on
>> -             12330084  822848  11507236   7% /
>> % df /
>> Filesystem   1K-blocks    Used Available Use% Mounted on
>> -             12330084  822848  11507236   7% /
>> 
>> That works out to 11GB.  But... 
>> 
>> The compute nodes have 24GB.  Freshly booted, about 3.2GB is 
>> consumed by the kernel, various services, and the root file system. 
>> At this time, usage of /tmp is essentially nil. 
>> 
>> We set user memory limits to 20GB. 
>> 
>> I would imagine that the size of the session directories depends on a 
>> number of factors; perhaps the developers can comment on that.  I have 
>> only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes. 
>> 
>> As long as they're removed after each job, they don't really compete 
>> with the application for available memory. 
>> 
>> On 11/3/11 8:40 PM, Ed Blosch wrote: 
>>> Thanks very much, exactly what I wanted to hear. How big is /tmp? 
>>> 
>>> -Original Message- 
>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
>>> Behalf Of David Turner 
>>> Sent: Thursday, November 03, 2011 6:36 PM 
>>> To: us...@open-mpi.org 
>>> Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
>>> for OpenMPI usage 
>>> 
>>> I'm not a systems guy, but I'll pitch in anyway.  On our cluster, 
>>> all the compute nodes are completely diskless.  The root file system, 
>>> including /tmp, resides in memory (ramdisk).  OpenMPI puts these 
>>> session directories therein.  All our jobs run through a batch 
>>> system (torque).  At the conclusion of each batch job, an epilogue 
>>> process runs that removes all files belonging to the owner of the 
>>> current batch job from /tmp (and also looks for and kills orphan 
>>> processes belonging to the user).  This epilogue had to be written 
>>> by our systems staff. 
>>> 
>>> I believe this is a fairly common configuration for diskless 
>>> clusters. 
>>> 
>>> On 11/3/11 4:09 PM, Blosch, Edwin L wrote: 
 Thanks for the help.  A couple follow-up-questions, maybe this starts to 
>>> go outside OpenMPI: 
 
 What's wrong with using /dev/shm?  I think you said earlier in this thread 
>>> that this was not a safe place. 
 
 If the NFS-mount point is moved from /tmp to /work, would a /tmp magically 
>>> appear in the filesystem for a stateless node?  How big would it be, given 
>>> that there is no local disk, right?  That may be something I have to ask 
>>> the 
>>> vendor, which I've tried, but they don't quite seem to get the question. 
 
 Thanks 
 
 
 
 
 -Original Message- 
 From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
>>> Behalf Of Ralph Castain 
 Sent: Thursday, November 03, 2011 5:22 PM 
 To: Open MPI Users 
 Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp 
>>> for OpenMPI usage 
 
 
 On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote: 
 
> I might be missing something here. Is there a side-effect or performance 
>>> loss if you don't use the sm btl?  Why would it exist if there is a wholly 
>>> equivalent alternative?  What happens to traffic that is intended for 
>>> another process on the same node? 
 
 There is a definite performance impact, and we wouldn't recommend doing 
>>> what Eugene suggested if you care about performance. 
 
 The correct solution here is get your sys admin to make /tmp local. Making 
>>> /tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux 
>>> world - it should never be done, for the reasons stated by Jeff. 
 
 
> 
> Thanks 
> 
> 
> -Original Message- 
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
>>> Behalf Of Eugene Loh 
> Sent: Thursday, November 03, 2011 1:23 

Re: [OMPI users] Troubles using MPI_Isend/MPI_Irecv/MPI_Waitany and MPI_Allreduce

2011-11-04 Thread Jeff Squyres
Sorry for the delay in replying.

I think you need to use MPI_INIT_THREAD with a level of MPI_THREAD_MULTIPLE 
instead of MPI_INIT.  This sets up internal locking in Open MPI to protect 
against multiple threads inside the progress engine, etc.

Be aware that only some of Open MPI's transports are THREAD_MULTIPLE safe -- 
see the README for more detail.
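
For concreteness, a minimal sketch of that initialization (just the standard
MPI calls, not anything specific to the poster's code), checking the thread
level actually granted, since a build without --enable-mpi-threads cannot
provide THREAD_MULTIPLE:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* Ask for full multi-threaded support instead of calling MPI_Init. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        /* The library cannot grant THREAD_MULTIPLE; making MPI calls from
           several threads concurrently is then not safe. */
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got %d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... create pthreads that make MPI calls concurrently ... */

    MPI_Finalize();
    return 0;
}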


On Oct 23, 2011, at 1:11 PM, Pedro Gonnet wrote:

> 
> Hi again,
> 
> As promised, I implemented a small program reproducing the error.
> 
> The program's main routine spawns a pthread which calls the function
> "exchange". "exchange" uses MPI_Isend/MPI_Irecv/MPI_Waitany to exchange
> a buffer of double-precision numbers with all other nodes.
> 
> At the same time, the "main" routine exchanges the sum of all the
> buffers using MPI_Allreduce.
> 
> To compile and run the program, do the following:
> 
>mpicc -g -Wall mpitest.c -pthread
>mpirun -np 8 ./a.out
> 
> Timing is, of course, of the essence and you may have to run the program
> a few times or twiddle with the value of "usleep" in line 146 for it to
> hang. To see where things go bad, you can do the following
> 
>mpirun -np 8 xterm -e gdb -ex run ./a.out
> 
> Things go bad when MPI_Allreduce is called while any of the threads are
> in MPI_Waitany. The value of "usleep" in line 146 should be long enough
> for all the nodes to have started exchanging data but small enough so
> that they are not done yet.
> 
> Cheers,
> Pedro
> 
> 
> 
> On Thu, 2011-10-20 at 11:25 +0100, Pedro Gonnet wrote:
>> Short update:
>> 
>> I just installed version 1.4.4 from source (compiled with
>> --enable-mpi-threads), and the problem persists.
>> 
>> I should also point out that if, in thread (ii), I wait for the
>> nonblocking communication in thread (i) to finish, nothing bad happens.
>> But this makes the nonblocking communication somewhat pointless.
>> 
>> Cheers,
>> Pedro
>> 
>> 
>> On Thu, 2011-10-20 at 10:42 +0100, Pedro Gonnet wrote:
>>> Hi all,
>>> 
>>> I am currently working on a multi-threaded hybrid parallel simulation
>>> which uses both pthreads and OpenMPI. The simulation uses several
>>> pthreads per MPI node.
>>> 
>>> My code uses the nonblocking routines MPI_Isend/MPI_Irecv/MPI_Waitany
>>> quite successfully to implement the node-to-node communication. When I
>>> try to interleave other computations during this communication, however,
>>> bad things happen.
>>> 
>>> I have two MPI nodes with two threads each: one thread (i) doing the
>>> nonblocking communication and the other (ii) doing other computations.
>>> At some point, the threads (ii) need to exchange data using
>>> MPI_Allreduce, which fails if the first thread (i) has not completed all
>>> the communication, i.e. if thread (i) is still in MPI_Waitany.
>>> 
>>> Using the in-place MPI_Allreduce, I get a re-run of this bug:
>>> http://www.open-mpi.org/community/lists/users/2011/09/17432.php. If I
>>> don't use in-place, the call to MPI_Waitany (thread ii) on one of the
>>> MPI nodes waits forever. 
>>> 
>>> My guess is that when the thread (ii) calls MPI_Allreduce, it gets
>>> whatever the other node sent with MPI_Isend to thread (i), drops
>>> whatever it should have been getting from the other node's
>>> MPI_Allreduce, and the call to MPI_Waitall hangs.
>>> 
>>> Is this a known issue? Is MPI_Allreduce not designed to work alongside
>>> the nonblocking routines? Is there a "safe" variant of MPI_Allreduce I
>>> should be using instead?
>>> 
>>> I am using OpenMPI version 1.4.3 (version 1.4.3-1ubuntu3 of the package
>>> openmpi-bin in Ubuntu). Both MPI nodes are run on the same dual-core
>>> computer (Lenovo x201 laptop).
>>> 
>>> If you need more information, please do let me know! I'll also try to
>>> cook-up a small program reproducing this problem...
>>> 
>>> Cheers and kind regards,
>>> Pedro
>>> 
>>> 
>>> 
>>> 
>> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Technical details of various MPI APIs

2011-11-04 Thread Jeff Squyres
Sorry for the delay in replying.

We don't have any formal documentation written up on this stuff, in part 
because we keep optimizing and changing the exact makeup of wire protocols, etc.

If you have any specific questions, we can try to answer them for you.

On Oct 21, 2011, at 2:45 PM, ramu wrote:

> Hi,
> I am trying to explore more on technical details of MPI APIs defined in 
> OpenMPI
> (for e.g., MPI_Init(), MPI_Barrier(), MPI_Send(), MPI_Recv(), MPI_Waitall(),
> MPI_Finalize() etc) when the MPI Processes are running on Infiniband cluster
> (OFED).  I mean, what are the messages exchanged between MPI processes over 
> IB,
> how do processes identify each other, what messages they exchange to
> do so, and what is needed to trigger data traffic.  Is there any 
> doc/link
> available which describes these details?  Please advise. 
> 
> Thanks & Regards,
> Ramu 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Problem-Bug with MPI_Intercomm_create()

2011-11-04 Thread Jeff Squyres
After some discussion on the devel list, I opened 
https://svn.open-mpi.org/trac/ompi/ticket/2904 to track the issue.
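
For readers following the scenario orel lays out further down in this thread,
here is a rough sketch of the master (A) side only; the program names, spawn
counts and tag are placeholders, and B and C would make the mirrored
MPI_Intercomm_merge/MPI_Intercomm_create calls shown in the quoted message:

    #include <mpi.h>

    #define NPB 2          /* illustrative spawn counts */
    #define NPC 2
    #define TAG 42

    int main(int argc, char **argv)
    {
        MPI_Comm interAB, interAC, intraAB, intraAC, interABC, intraABC;
        int npa;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &npa);

        /* 1) spawn the two slave codes, yielding intercomms A-B and A-C */
        MPI_Comm_spawn("slave_b", MPI_ARGV_NULL, NPB, MPI_INFO_NULL, 0,
                       MPI_COMM_WORLD, &interAB, MPI_ERRCODES_IGNORE);
        MPI_Comm_spawn("slave_c", MPI_ARGV_NULL, NPC, MPI_INFO_NULL, 0,
                       MPI_COMM_WORLD, &interAC, MPI_ERRCODES_IGNORE);

        /* 2) merge into intracomms AB and AC; A passes low=0 so its
         * ranks come first when the slaves pass high=1 */
        MPI_Intercomm_merge(interAB, 0, &intraAB);
        MPI_Intercomm_merge(interAC, 0, &intraAC);

        /* 3) build intercomm AB-C with AC as bridge: A0 is the local
         * leader of AB, and C0 sits at rank npa of intraAC (remote leader) */
        MPI_Intercomm_create(intraAB, 0, intraAC, npa, TAG, &interABC);

        MPI_Barrier(interABC);

        /* 4)-5) merge into intracomm ABC; this barrier is where the
         * reported MPI_ERR_INTERN abort shows up */
        MPI_Intercomm_merge(interABC, 0, &intraABC);
        MPI_Barrier(intraABC);

        MPI_Finalize();
        return 0;
    }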


On Oct 25, 2011, at 12:08 PM, Ralph Castain wrote:

> FWIW: I have tracked this problem down. The fix is a little more complicated 
> than I'd like, so I'm going to have to ping some other folks to ensure we 
> concur on the approach before doing something.
> 
> On Oct 25, 2011, at 8:20 AM, Ralph Castain wrote:
> 
>> I still see it failing the test George provided on the trunk. I'm unaware of 
>> anyone looking further into it, though, as the prior discussion seemed to 
>> just end.
>> 
>> On Oct 25, 2011, at 7:01 AM, orel wrote:
>> 
>>> Dears,
>>> 
>>> I have been trying for several days to use advanced MPI2 features in the 
>>> following scenario:
>>> 
>>> 1) a master code A (of size NPA) spawns (MPI_Comm_spawn()) two slave
>>>   codes B (of size NPB) and C (of size NPC), providing intercomms A-B and 
>>> A-C ;
>>> 2) i create intracomm AB and AC by merging intercomms ;
>>> 3) then i create intercomm AB-C by calling MPI_Intercomm_create() by using 
>>> AC as bridge...
>>> 
>>>  MPI_Comm intercommABC;
>>> A: MPI_Intercomm_create(intracommAB, 0, intracommAC, NPA, TAG, &intercommABC);
>>> B: MPI_Intercomm_create(intracommAB, 0, MPI_COMM_NULL, 0,TAG,&intercommABC);
>>> C: MPI_Intercomm_create(intracommC, 0, intracommAC, 0, TAG,&intercommABC);
>>> 
>>>In these calls, A0 and C0 play the role of local leader for AB and C 
>>> respectively.
>>>C0 and A0 play the roles of remote leader in bridge intracomm AC.
>>> 
>>> 3)  MPI_Barrier(intercommABC);
>>> 4)  i merge intercomm AB-C into intracomm ABC
>>> 5)  MPI_Barrier(intracommABC);
>>> 
>>> My BUG: These calls succeed, but when i try to use intracommABC for a 
>>> collective communication like MPI_Barrier(),
>>> i got the following error :
>>> 
>>> *** An error occurred in MPI_Barrier
>>> *** on communicator
>>> *** MPI_ERR_INTERN: internal error
>>> *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>> 
>>> 
>>> I try with OpenMPI trunk, 1.5.3, 1.5.4 and Mpich2-1.4.1p1
>>> 
>>> My code works perfectly if intracomm A, B and C are obtained by 
>>> MPI_Comm_split() instead of MPI_Comm_spawn() 
>>> 
>>> 
>>> I found same problem in a previous thread of the OMPI Users mailing list :
>>> 
>>> => http://www.open-mpi.org/community/lists/users/2011/06/16711.php
>>> 
>>> Is that bug/problem currently under investigation? :-)
>>> 
>>> i can give detailed code, but the one provided by George Bosilca in this 
>>> previous thread produces the same error...
>>> 
>>> Thank you for helping me...
>>> 
>>> -- 
>>> Aurélien Esnard
>>> University Bordeaux 1 / LaBRI / INRIA (France)
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] problem with mpirun

2011-11-04 Thread Jeff Squyres
We really need more information in order to help you.  Please see:

http://www.open-mpi.org/community/help/
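
For what it's worth, on a machine with no usable InfiniBand hardware the
libibverbs/OpenIB warning quoted below can usually be silenced by excluding
the openib BTL at run time, independent of the upgrade suggested later in
the thread:

   mpirun --mca btl ^openib -np 2 pl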


On Nov 3, 2011, at 7:37 PM, amine mrabet wrote:

> i installed the last version of openmpi, now i have this error
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
> 
> :)
> 
> 
> 2011/11/3 amine mrabet 
> yes i have an old version, i will install 1.4.4 and see 
> thanks 
> 
> 
> 2011/11/3 Jeff Squyres 
> It sounds like you have an old version of Open MPI that is not ignoring your 
> unconfigured OpenFabrics devices in your Linux install.  This is a guess 
> because you didn't provide any information about your Open MPI installation.  
> :-)
> 
> Try upgrading to a newer version of Open MPI.
> 
> 
> On Nov 3, 2011, at 12:52 PM, amine mrabet wrote:
> 
> > i use openmpi in my computer
> >
> > 2011/11/3 Ralph Castain 
> > Couple of things:
> >
> > 1. Check the configure cmd line you gave - OMPI thinks your local computer 
> > should have an openib support that isn't correct.
> >
> > 2. did you recompile your app on your local computer, using the version of 
> > OMPI built/installed there?
> >
> >
> > On Nov 3, 2011, at 10:10 AM, amine mrabet wrote:
> >
> > > hey ,
> > > i use mpirun to run a program using mpi; this program worked well on the 
> > > university computer
> > >
> > > but with mine i have this error
> > >  i run with
> > >
> > > amine@dellam:~/Bureau$ mpirun  -np 2 pl
> > > and i have this error
> > >
> > > libibverbs: Fatal: couldn't read uverbs ABI version.
> > > --
> > > [0,0,0]: OpenIB on host dellam was unable to find any HCAs.
> > > Another transport will be used instead, although this may result in
> > > lower performance.
> > >
> > >
> > >
> > >
> > >
> > > any help?!
> > > --
> > > amine mrabet
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> >
> > --
> > amine mrabet
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> -- 
> amine mrabet 
> 
> 
> 
> -- 
> amine mrabet 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread TERRY DONTJE
David, are you saying your jobs consistently leave behind session files 
after the job exits?  They really shouldn't, even in the case when a job 
aborts; I thought mpirun took great pains to clean up after itself.
Can you tell us what version of OMPI you are running with?  I could 
see a kill -9 of mpirun and the processes below it leaving turds 
behind.


--td

On 11/4/2011 2:37 AM, David Turner wrote:

% df /tmp
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /
% df /
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.
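
If the session directories ever need to live somewhere other than /tmp, the
base directory can also be pointed at another node-local path on the mpirun
command line (parameter name as in the 1.4/1.5 series; the path here is just
an example, and ompi_info can confirm the parameter on a given install):

   mpirun --mca orte_tmpdir_base /local/scratch -np 8 ./a.out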

On 11/3/11 8:40 PM, Ed Blosch wrote:

Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node 
/tmp

for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written 
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.
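
A rough sketch of such a torque epilogue, assuming torque's convention of
passing the job owner's username as the script's second argument
(illustrative only, not the actual site script):

   #!/bin/sh
   # Epilogue: scrub the job owner's files out of the in-memory /tmp and
   # kill any processes the user left running on this node.
   JOB_USER="$2"
   find /tmp -mindepth 1 -user "$JOB_USER" -exec rm -rf {} + 2>/dev/null
   pkill -9 -u "$JOB_USER" 2>/dev/null
   exit 0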

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:
Thanks for the help.  A couple of follow-up questions, maybe this 
starts to

go outside OpenMPI:


What's wrong with using /dev/shm?  I think you said earlier in this 
thread

that this was not a safe place.


If the NFS-mount point is moved from /tmp to /work, would a /tmp 
magically
appear in the filesystem for a stateless node?  How big would it be, 
given
that there is no local disk, right?  That may be something I have to 
ask the

vendor, which I've tried, but they don't quite seem to get the question.


Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Ralph Castain

Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less 
node /tmp

for OpenMPI usage



On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:

I might be missing something here. Is there a side-effect or 
performance
loss if you don't use the sm btl?  Why would it exist if there is a 
wholly

equivalent alternative?  What happens to traffic that is intended for
another process on the same node?


There is a definite performance impact, and we wouldn't recommend doing

what Eugene suggested if you care about performance.


The correct solution here is get your sys admin to make /tmp local. 
Making
/tmp NFS mounted across multiple nodes is a major "faux pas" in the 
Linux

world - it should never be done, for the reasons stated by Jeff.
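
On stateless nodes a local /tmp usually means a tmpfs mount; a minimal
/etc/fstab entry along these lines (the size cap is illustrative and should
stay well below physical RAM) gives each node its own memory-backed /tmp:

   tmpfs   /tmp   tmpfs   defaults,size=4g   0 0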





Thanks


-Original Message-
From: users-boun...@open-mpi.org 
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a

session file on /tmp, which is NFS-mounted and thus not a good choice.


Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org 
[mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


I've not been following closely.  Why must one use shared-memory
communications?  How about using other BTLs in a "loopback" fashion?
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@o

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-04 Thread David Turner

% df /tmp
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /
% df /
Filesystem   1K-blocks    Used Available Use% Mounted on
-             12330084  822848  11507236   7% /

That works out to 11GB.  But...

The compute nodes have 24GB.  Freshly booted, about 3.2GB is
consumed by the kernel, various services, and the root file system.
At this time, usage of /tmp is essentially nil.

We set user memory limits to 20GB.

I would imagine that the size of the session directories depends on a
number of factors; perhaps the developers can comment on that.  I have
only seen total sizes in the 10s of MBs on our 8-node, 24GB nodes.

As long as they're removed after each job, they don't really compete
with the application for available memory.

On 11/3/11 8:40 PM, Ed Blosch wrote:

Thanks very much, exactly what I wanted to hear. How big is /tmp?

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of David Turner
Sent: Thursday, November 03, 2011 6:36 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp
for OpenMPI usage

I'm not a systems guy, but I'll pitch in anyway.  On our cluster,
all the compute nodes are completely diskless.  The root file system,
including /tmp, resides in memory (ramdisk).  OpenMPI puts these
session directories therein.  All our jobs run through a batch
system (torque).  At the conclusion of each batch job, an epilogue
process runs that removes all files belonging to the owner of the
current batch job from /tmp (and also looks for and kills orphan
processes belonging to the user).  This epilogue had to be written
by our systems staff.

I believe this is a fairly common configuration for diskless
clusters.

On 11/3/11 4:09 PM, Blosch, Edwin L wrote:

Thanks for the help.  A couple of follow-up questions, maybe this starts to

go outside OpenMPI:


What's wrong with using /dev/shm?  I think you said earlier in this thread

that this was not a safe place.


If the NFS-mount point is moved from /tmp to /work, would a /tmp magically

appear in the filesystem for a stateless node?  How big would it be, given
that there is no local disk, right?  That may be something I have to ask the
vendor, which I've tried, but they don't quite seem to get the question.


Thanks




-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Ralph Castain

Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp

for OpenMPI usage



On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:


I might be missing something here. Is there a side-effect or performance

loss if you don't use the sm btl?  Why would it exist if there is a wholly
equivalent alternative?  What happens to traffic that is intended for
another process on the same node?


There is a definite performance impact, and we wouldn't recommend doing

what Eugene suggested if you care about performance.


The correct solution here is get your sys admin to make /tmp local. Making

/tmp NFS mounted across multiple nodes is a major "faux pas" in the Linux
world - it should never be done, for the reasons stated by Jeff.





Thanks


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 1:23 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


Right.  Actually "--mca btl ^sm".  (Was missing "btl".)

On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:

I don't tell OpenMPI what BTLs to use. The default uses sm and puts a

session file on /tmp, which is NFS-mounted and thus not a good choice.


Are you suggesting something like --mca ^sm?


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Behalf Of Eugene Loh

Sent: Thursday, November 03, 2011 12:54 PM
To: us...@open-mpi.org
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node

/tmp for OpenMPI usage


I've not been following closely.  Why must one use shared-memory
communications?  How about using other BTLs in a "loopback" fashion?
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://w