Re: [OMPI users] Memory manager

2008-05-19 Thread Terry Frankcombe
To tell you all what no one wanted to tell me: yes, it does seem to be
the memory manager.  Compiling everything with
--with-memory-manager=none returns the vmem use to the more reasonable
~100MB per process (down from >8GB).
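
For anyone wanting to reproduce the comparison between virtual and resident memory, here is a minimal Python sketch. It assumes a Linux system where /proc/<pid>/status reports VmSize and VmRSS in kB; the sample text and its numbers are made up to mirror the figures quoted in this thread, not measured output.

```python
import re

def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def read_vm_fields(pid="self"):
    """Read the same two fields for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vm_fields(fh.read())

# Illustrative sample only: roughly 8.5GB virtual, 40MB resident.
sample = "Name:\tapp\nVmSize:\t 8912896 kB\nVmRSS:\t   40960 kB\n"
usage = parse_vm_fields(sample)
ratio = usage["VmSize"] / usage["VmRSS"]
```

A large VmSize next to a small VmRSS, as in the sample, means address space has been reserved but not touched, which matches the symptom described above.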

I take it this may affect my peak bandwidth over InfiniBand.  What's the
general feeling about how bad this is?


On Tue, 2008-05-13 at 13:12 +1000, Terry Frankcombe wrote:
> Hi folks
> 
> I'm trying to run an MPI app on an infiniband cluster with OpenMPI
> 1.2.6.
> 
> When run on a single node, this app is grabbing large chunks of memory
> (total per process ~8.5GB, including strace showing a single 4GB grab)
> but not using it.  The resident memory use is ~40MB per process.  When
> this app is compiled in serial mode (with conditionals to remove the MPI
> calls) the memory use is more like what you'd expect, 40MB res and
> ~100MB vmem.
> 
> Now I didn't write it so I'm not sure what extra stuff the MPI version
> does, and we haven't tracked down the large memory grabs.
> 
> Could it be that this vmem is being grabbed by the OpenMPI memory
> manager rather than directly by the app?
> 
> Ciao
> Terry
> 
> 



Re: [OMPI users] openmpi 32-bit g++ compilation issue

2008-05-19 Thread Doug Reeder

Arif,

It looks like your system is 64-bit by default and it therefore
doesn't pick up the 32-bit libraries automatically at the link step
(note the -L/.../x86_64-suse-linux/lib entries prior to the
corresponding entries pointing to the 32-bit library versions). I don't
use SUSE Linux so I don't know if this is something you can control
in the configure step for Open MPI.
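
One way to confirm which variant of a library the linker is being handed is to look at the ELF header. The sketch below is a minimal Python check, assuming the file is a real ELF object; on some systems a .so path is a text linker script instead, in which case the function returns None. The function names are illustrative.

```python
ELF_MAGIC = b"\x7fELF"

def elf_class_from_header(header):
    """Classify the first bytes of a file using the ELF EI_CLASS byte."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None  # not an ELF object (for example a text linker script)
    # EI_CLASS is byte 4 of e_ident: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: "32-bit", 2: "64-bit"}.get(header[4])

def elf_class(path):
    """Open a file and report whether it is a 32-bit or 64-bit ELF object."""
    with open(path, "rb") as fh:
        return elf_class_from_header(fh.read(5))
```

Running elf_class on the libstdc++.so named in the error message and on the copy under the 32/ directory should show whether the "File in wrong format" error comes from mixing the two classes.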


Doug Reeder
On May 19, 2008, at 2:48 PM, Arif Ali wrote:


Hi,

OS: SLES10 SP1
OFED: 1.3
openmpi: 1.2 1.2.5 1.2.6
compilers: gcc g++ gfortran

I am creating a 32-bit build of openmpi on an InfiniBand cluster,
and the compilation gets stuck. If I use the
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32/libstdc++.so library manually, it
compiles that piece of code. I was wondering if anyone else has had this
problem, or whether there is any other way of getting this to work. I feel that
there may be something very silly here that I have missed, but I can't seem
to spot it.


I have also tried this on a fresh install of OFED 1.3 with openmpi  
1.2.6



libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../../../opal/include
-I../../../orte/include -I../../../ompi/include
-DOMPI_BUILDING_CXX_BINDINGS_LIBRARY=1 -DOMPI_SKIP_MPICXX=1 -I../../.. -O3
-DNDEBUG -m32 -finline-functions -pthread -MT file.lo -MD -MP -MF
.deps/file.Tpo -c file.cc  -fPIC -DPIC -o .libs/file.o
depbase=`echo win.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
/bin/sh ../../../libtool --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I.
-I../../../opal/include -I../../../orte/include -I../../../ompi/include
-DOMPI_BUILDING_CXX_BINDINGS_LIBRARY=1 -DOMPI_SKIP_MPICXX=1 -I../../.. -O3
-DNDEBUG -m32 -finline-functions -pthread -MT win.lo -MD -MP -MF $depbase.Tpo
-c -o win.lo win.cc &&\
mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../../../opal/include
-I../../../orte/include -I../../../ompi/include
-DOMPI_BUILDING_CXX_BINDINGS_LIBRARY=1 -DOMPI_SKIP_MPICXX=1 -I../../.. -O3
-DNDEBUG -m32 -finline-functions -pthread -MT win.lo -MD -MP -MF .deps/win.Tpo
-c win.cc  -fPIC -DPIC -o .libs/win.o
/bin/sh ../../../libtool --tag=CXX   --mode=link g++  -O3 -DNDEBUG -m32
-finline-functions -pthread  -export-dynamic -m32  -o libmpi_cxx.la -rpath
/opt/openmpi/1.2.6/gnu_4.1.2/32/lib mpicxx.lo intercepts.lo comm.lo datatype.lo
file.lo win.lo  -lnsl -lutil  -lm
libtool: link: g++ -shared -nostdlib
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../lib/crti.o
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32/crtbeginS.o  .libs/mpicxx.o
.libs/intercepts.o .libs/comm.o .libs/datatype.o .libs/file.o .libs/win.o
-Wl,-rpath -Wl,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath
-Wl,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -lnsl -lutil
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/lib/../lib
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../lib -L/lib/../lib
-L/usr/lib/../lib -L/usr/lib64/gcc/x86_64-suse-linux/4.1.2
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/lib
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../..
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/libstdc++.so -lm -lpthread -lc -lgcc_s
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32/crtendS.o
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../lib/crtn.o  -m32 -pthread
-m32   -pthread -Wl,-soname -Wl,libmpi_cxx.so.0 -o .libs/libmpi_cxx.so.0.0.0
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/libstdc++.so: could not read symbols:
File in wrong format
collect2: ld returned 1 exit status
--
Arif Ali
Software Engineer
OCF plc

Mobile: +44 (0)7970 148 122
DDI:    +44 (0)114 257 2240
Office: +44 (0)114 257 2200
Fax:    +44 (0)114 257 0022
Email:  a...@ocf.co.uk
Web:    http://www.ocf.co.uk

Support Phone:   +44 (0)845 702 3829
Support E-mail:  supp...@ocf.co.uk

Skype:  arif_ali80
MSN:    a...@ocf.co.uk

This email is confidential in that it is intended for the exclusive
attention of the addressee(s) indicated. If you are not the intended
recipient, this email should not be read or disclosed to any other
person. Please notify the sender immediately and delete this email from
your computer system. Any opinions expressed are not necessarily those
of the company from which this email was sent and, whilst to the best of
our knowledge no viruses or defects exist, no responsibility can be
accepted for any loss or damage arising from its receipt or subsequent
use of this email.
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] openmpi 32-bit g++ compilation issue

2008-05-19 Thread Arif Ali
Hi,

OS: SLES10 SP1
OFED: 1.3
openmpi: 1.2 1.2.5 1.2.6
compilers: gcc g++ gfortran

I am creating a 32-bit build of openmpi on an InfiniBand cluster, and the
compilation gets stuck. If I use the
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32/libstdc++.so library manually, it
compiles that piece of code. I was wondering if anyone else has had this
problem, or whether there is any other way of getting this to work. I feel that
there may be something very silly here that I have missed, but I can't seem to
spot it.

I have also tried this on a fresh install of OFED 1.3 with openmpi 1.2.6


libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../../../opal/include 
-I../../../orte/include -I../../../ompi/include 
-DOMPI_BUILDING_CXX_BINDINGS_LIBRARY=1 -DOMPI_SKIP_MPICXX=1 -I../../.. -O3 
-DNDEBUG -m32 -finline-functions -pthread -MT file.lo -MD -MP -MF 
.deps/file.Tpo -c file.cc  -fPIC -DPIC -o .libs/file.o
depbase=`echo win.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`;\
/bin/sh ../../../libtool --tag=CXX   --mode=compile g++ -DHAVE_CONFIG_H -I. 
-I../../../opal/include -I../../../orte/include -I../../../ompi/include  
-DOMPI_BUILDING_CXX_BINDINGS_LIBRARY=1 -DOMPI_SKIP_MPICXX=1 -I../../.. -O3
-DNDEBUG -m32 -finline-functions -pthread -MT win.lo -MD -MP -MF $depbase.Tpo 
-c -o win.lo win.cc &&\
mv -f $depbase.Tpo $depbase.Plo
libtool: compile:  g++ -DHAVE_CONFIG_H -I. -I../../../opal/include 
-I../../../orte/include -I../../../ompi/include 
-DOMPI_BUILDING_CXX_BINDINGS_LIBRARY=1 -DOMPI_SKIP_MPICXX=1 -I../../.. -O3 
-DNDEBUG -m32 -finline-functions -pthread -MT win.lo -MD -MP -MF .deps/win.Tpo 
-c win.cc  -fPIC -DPIC -o .libs/win.o
/bin/sh ../../../libtool --tag=CXX   --mode=link g++  -O3 -DNDEBUG -m32 
-finline-functions -pthread  -export-dynamic -m32  -o libmpi_cxx.la -rpath 
/opt/openmpi/1.2.6/gnu_4.1.2/32/lib mpicxx.lo intercepts.lo comm.lo datatype.lo 
file.lo win.lo  -lnsl -lutil  -lm 
libtool: link: g++ -shared -nostdlib 
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../lib/crti.o 
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32/crtbeginS.o  .libs/mpicxx.o 
.libs/intercepts.o .libs/comm.o .libs/datatype.o .libs/file.o .libs/win.o   
-Wl,-rpath -Wl,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath 
-Wl,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -lnsl -lutil 
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32 
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/lib/../lib
 -L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../lib -L/lib/../lib 
-L/usr/lib/../lib -L/usr/lib64/gcc/x86_64-suse-linux/4.1.2 
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../x86_64-suse-linux/lib 
-L/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../.. 
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/libstdc++.so -lm -lpthread -lc -lgcc_s 
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/32/crtendS.o 
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/../../../../lib/crtn.o  -m32 -pthread 
-m32   -pthread -Wl,-soname -Wl,libmpi_cxx.so.0 -o .libs/libmpi_cxx.so.0.0.0
/usr/lib64/gcc/x86_64-suse-linux/4.1.2/libstdc++.so: could not read symbols: 
File in wrong format
collect2: ld returned 1 exit status

-- 
Arif Ali
Software Engineer
OCF plc

Mobile: +44 (0)7970 148 122
DDI:    +44 (0)114 257 2240
Office: +44 (0)114 257 2200
Fax:    +44 (0)114 257 0022
Email:  a...@ocf.co.uk
Web:    http://www.ocf.co.uk

Support Phone:   +44 (0)845 702 3829
Support E-mail:  supp...@ocf.co.uk

Skype:  arif_ali80
MSN:    a...@ocf.co.uk



Re: [OMPI users] "Sorry! You were supposed to get help about..."

2008-05-19 Thread Jeff Squyres
It feels like OMPI is somehow looking for the help files in the wrong  
place.  Were they moved after OMPI was installed?  How did you install  
OMPI?



On May 16, 2008, at 10:30 AM, Alex L. wrote:



Hello Everybody,

I have a slightly annoying situation with OMPI error messages
on a RHEL 4-something box. Every time I should see an error
message, I receive something like:

-
Sorry!  You were supposed to get help about:
   orterun:init-failure
from the file:
   help-orterun.txt
But I couldn't find any file matching that name.  Sorry!
-

I know that the help files (and only the help files, not
the whole installation) are located in:
 /usr/share/openmpi/1.2.3-gcc/help64/openmpi/*

Is it possible to tell OMPI to look for the help files
in this directory? Is there some ENV variable or a --option?

Thank you in advance, Alex




--
Jeff Squyres
Cisco Systems



Re: [MTT users] MTT server side problem

2008-05-19 Thread Pavel Shamis (Pasha)

Hello,
Did you have a chance to review this patch?

Regards,
Pasha

Josh Hursey wrote:
Sorry for the delay on this. I probably will not have a chance to look 
at it until later this week or early next. Thank you for the work on 
the patch.


Cheers,
Josh

On May 12, 2008, at 8:08 AM, Pavel Shamis (Pasha) wrote:


Hi Josh,
I ported the error handling mechanism from submit/index.php to
database.inc. Please review.


Thanks,
Pasha

Josh Hursey wrote:

Pasha,

I'm looking at the patch a bit more closely. Even though, at a high
level, the do_pg_connect, do_pg_query, simple_select, and select
functions do the same thing, the versions in submit/index.php have
some additional error handling mechanisms that the ones in
database.inc do not have. Specifically, they send email when the
functions fail, with messages indicating what failed so corrections
can be made.
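
The pattern being described, reporting a failure before passing it on, can be sketched in a few lines. This is a hypothetical Python illustration, not the PHP code in submit/index.php or database.inc: report_failures, do_query and the list standing in for an email sender are all invented for the example.

```python
import functools

def report_failures(notify):
    """Wrap a function so that any exception is reported, then re-raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Say which function failed and why, then let the error propagate.
                notify(f"{func.__name__} failed: {exc}")
                raise
        return wrapper
    return decorator

sent = []  # stand-in for an email sender (hypothetical)

@report_failures(sent.append)
def do_query(sql):
    # Hypothetical query function that always fails, to exercise the wrapper.
    raise RuntimeError(f"query failed: {sql}")

try:
    do_query("SELECT 1")
except RuntimeError:
    pass
```

Keeping the notification in one wrapper is one way a unified set of database functions could retain the error reporting without duplicating it in every function.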


So though I agree that we should unify the functionality I cannot 
recommend this patch since it will result in losing useful error 
handling functionality. Maybe there is another way to clean this up 
to preserve the error reporting.


-- Josh

On May 7, 2008, at 11:56 AM, Pavel Shamis (Pasha) wrote:


Hi Josh,
I had the original problem with some old revision from trunk.
Today I updated the server to the latest revision from trunk plus the
patch, and everything looks good.


Can I commit the patch?

Pasha


Ethan Mallove wrote:

On Wed, May/07/2008 06:04:07PM, Pavel Shamis (Pasha) wrote:


Hi Josh.

Looking at the patch I'm a little bit concerned. The
"get_table_fields()" function is, as you mentioned, no longer used so
it should be removed. However, the other functions are critical to
the submission script, particularly 'do_pg_connect', which opens
the connection to the backend database.


All the functions are implemented in the $topdir/database.inc file.
And the "database.inc" implementation is better because it uses the
password and username from config.ini. The original
implementation from submit/index uses hardcoded values defined in the file.

Are you using the current development trunk (mtt/trunk) or the 
stable release branch (mtt/branches/ompi-core-testers)?



trunk


Can you send us the error messages that you were receiving?

1. On the client side I see "*** WARNING: MTTDatabase client did not
get a serial".
As a result of this error, some of the MTT results are not visible via the
web reporter.

2. On the server side I found the following error message:
[client 10.4.3.214] PHP Fatal error:  Allowed memory size of 33554432 bytes exhausted (tried to allocate 23592960 bytes) in /.autodirect/swgwork/MTT/mtt/submit/index.php(79) : eval()'d code on line 77515
[Mon May 05 19:26:05 2008] [notice] caught SIGTERM, shutting down
[Mon May 05 19:30:54 2008] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon May 05 19:30:54 2008] [notice] Digest: generating secret for digest authentication ...
[Mon May 05 19:30:54 2008] [notice] Digest: done
[Mon May 05 19:30:54 2008] [notice] LDAP: Built with OpenLDAP LDAP SDK
[Mon May 05 19:30:54 2008] [notice] LDAP: SSL support unavailable
My memory limit in the php.ini file was set to 256MB!




Looks like PHP is actually using a 32MB limit ("Allowed
memory size of 33554432 ..."). Does an (Apache?) daemon need
to be restarted for the php.ini file to take effect? To
check your settings, this little PHP script will print an
HTML page of all the active system settings (search on
"memory_limit").


-Ethan
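
The 32MB figure comes straight from the byte count in the log line. As a quick check, this small Python sketch converts the numbers from the "Allowed memory size" message quoted earlier in the thread; the function name is illustrative.

```python
import re

MIB = 1024 * 1024

def allowed_limit_mib(log_line):
    """Return the PHP memory limit in MiB from a fatal-error line, or None."""
    m = re.search(r"Allowed memory size of (\d+) bytes", log_line)
    return int(m.group(1)) / MIB if m else None

line = ("PHP Fatal error:  Allowed memory size of 33554432 bytes exhausted "
        "(tried to allocate 23592960 bytes)")
limit = allowed_limit_mib(line)   # 32.0
requested = 23592960 / MIB        # 22.5
```

So the limit in effect was 32MB rather than the 256MB set in php.ini, and the allocation that failed was 22.5MB on top of memory already in use.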




Regards,
Pasha



Cheers,
Josh

On May 7, 2008, at 4:49 AM, Pavel Shamis (Pasha) wrote:



Hi,
I upgraded the server side (MTT is still running, so I don't
know yet whether the problem is resolved).
During the upgrade I had some problems with the submit/index.php
script: it had some duplicated functions, and some of them were
broken.

Please review the attached patch.

Pasha

Ethan Mallove wrote:


On Tue, May/06/2008 06:29:33PM, Pavel Shamis (Pasha) wrote:



I'm not sure which cron jobs you're referring to. Do you
mean these?

https://svn.open-mpi.org/trac/mtt/browser/trunk/server/php/cron


I was talking about this one:
https://svn.open-mpi.org/trac/mtt/wiki/ServerMaintenance




I'm guessing you would only be concerned with the below
periodic-maintenance.pl script, which just runs
ANALYZE/VACUUM queries. I think you can start that up
whenever you want (and it should optimize the Reporter).

https://svn.open-mpi.org/trac/mtt/browser/trunk/server/sql/cron/periodic-maintenance.pl 



-Ethan





The only thing there are the regular
mtt-resu...@open-mpi.org email alerts and some out-of-date
DB monitoring junk. You can ignore that stuff.

Josh, are there some nightly (DB
pruning/cleaning/vacuuming?) cron jobs that Pasha should be
running?

-Ethan




Thanks.

Ethan Mallove wrote:



Hi Pasha,

I thought this issue was solved in r1119 (see below). Do you
have the latest mtt/server scripts?

https://svn.open-mpi.org/trac/mtt/changeset/1119/trunk/server/php/submit 



-Ethan

On Tue, May/06/2008 03:26:43PM, Pavel Shamis (Pasha) wrote:



About the