Re: [OMPI devel] More memory troubles with vapi

2007-08-27 Thread Jeff Squyres

On Aug 24, 2007, at 11:05 PM, Josh Aune wrote:


Hmm.  If you compile Open MPI with no memory manager, then it
*shouldn't* be Open MPI's fault (unless there's a leak in the mvapi
BTL...?).  Verify that you did not actually compile Open MPI with a
memory manager by running "ompi_info | grep ptmalloc2" -- it should
come up empty.
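
As a concrete check -- a sketch, assuming a default build -- a build that
*does* include the memory manager shows an "MCA memory: ptmalloc2 ..." line
in its ompi_info output, whereas a --without-memory-manager build prints
nothing at all:

  shell$ ompi_info | grep ptmalloc2
  shell$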


I am sure.  I have multiple builds that I switch between.  One of the
apps doesn't work unless I --without-memory-manager (see post to
-users about realloc(), with sample code).


Ok.


I noticed that there are a few ./configure --debug type switches, even
some dealing with memory.  Could those be useful for gathering further
data?  What features do those provide and how do I use them?


If you use --enable-mem-debug, it forces all internal calls to malloc(),
free(), and calloc() to go through our own internal functions, but those
mainly just check that we don't pass bad parameters such as NULL, etc.
I suppose you could put in some memory profiling or something, but that
would probably get pretty sticky.  :-(
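
To make that concrete, here is a hypothetical sketch of the kind of
parameter-checking wrapper that --enable-mem-debug routes allocations
through (the function names here are made up; they are not the actual
OMPI internals):

  /* hypothetical sketch -- not the actual OMPI mem-debug code */
  #include <assert.h>
  #include <stdlib.h>

  void *debug_malloc(size_t size, const char *file, int line)
  {
      (void)file; (void)line;  /* the real thing would report these on error */
      assert(size > 0);        /* flag suspicious zero-byte requests */
      return malloc(size);
  }

  void debug_free(void *ptr, const char *file, int line)
  {
      (void)file; (void)line;
      assert(ptr != NULL);     /* flag free(NULL) and similar misuse */
      free(ptr);
  }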



The fact that you can run this under TCP without memory leaking would
seem to indicate that it's not the app that's leaking memory, but
rather either the MPI or the network stack.


I should clarify here: this is effectively true.  The app crashes with a
segfault after running over TCP for several hours, but it gets much
farther into the run than it does with the vapi BTL.


Yuck.  :-(  I assume there's no easy way to track this down -- do you  
get a corefile?  Can you see where the app died -- are there any  
obvious indexes going out of range of array bounds, etc.?  Is it in  
MPI or in the application?


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] thread model

2007-08-27 Thread Jeff Squyres

On Aug 27, 2007, at 2:50 PM, Greg Watson wrote:


Until now I haven't had to worry about the opal/orte thread model.
However, there are now people who would like to use ompi that has
been configured with --with-threads=posix and --enable-mpi-threads.
Can someone give me some pointers as to what I need to do in
order to make sure I don't violate any threading model?


Note that this is *NOT* well tested.  There is work going on right
now to make the OMPI layer able to support MPI_THREAD_MULTIPLE
(support was designed in from the beginning, but we have never done
any kind of comprehensive testing/stressing of multi-thread support,
so it is pretty much guaranteed not to work), but that work is
occurring on the trunk (i.e., what will eventually become v1.3) --
not the v1.2 branch.
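
For reference, a minimal sketch (plain MPI, nothing Open MPI-specific) of
how an application requests MPI_THREAD_MULTIPLE and checks what it was
actually given -- with the caveat above that this level is not well
exercised on the v1.2 branch:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char *argv[])
  {
      int provided;

      /* ask for full multi-threaded support; the library may downgrade it */
      MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
      if (provided < MPI_THREAD_MULTIPLE) {
          printf("only got thread level %d; serialize MPI calls yourself\n",
                 provided);
      }

      MPI_Finalize();
      return 0;
  }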



The interfaces I'm calling are:

opal_event_loop()


Brian or George will have to answer about that one...


opal_path_findv()


This guy should be multi-thread safe (disclaimer: haven't tested it  
myself); it doesn't rely on any global state.



orte_init()
orte_ns.create_process_name()
orte_iof.iof_subscribe()
orte_iof.iof_unsubscribe()
orte_schema.get_job_segment_name()
orte_gpr.get()
orte_dss.get()
orte_rml.send_buffer()
orte_rmgr.spawn_job()
orte_pls.terminate_job()
orte_rds.query()
orte_smr.job_stage_gate_subscribe()
orte_rmgr.get_vpid_range()


Note that ORTE as a whole is *NOT* thread safe, nor is it planned to be
(it just seemed way more trouble than it was worth).  You need to
serialize access to it.
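
Concretely, that means something like this -- a minimal sketch assuming a
single global lock is acceptable for your usage, with the actual ORTE calls
elided since their signatures don't matter for the pattern:

  #include <pthread.h>

  static pthread_mutex_t orte_lock = PTHREAD_MUTEX_INITIALIZER;

  /* Every thread that needs ORTE takes the same lock first, so at most
   * one thread is ever inside the ORTE library at a time. */
  void call_orte_safely(void)
  {
      pthread_mutex_lock(&orte_lock);
      /* ... orte_gpr.get(), orte_rml.send_buffer(), etc. go here ... */
      pthread_mutex_unlock(&orte_lock);
  }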


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] &find() broken?

2007-08-27 Thread Jeff Squyres
Whoops -- wrong list; meant to send this to mtt-devel... sorry  
folks... nothing to see here...


On Aug 27, 2007, at 7:38 PM, Jeff Squyres wrote:


Ethan --

You said to me in IM:

"i'm getting stuck trying to use MTT::Functions::find. it's returning
EVERY file under the directory i give it."

Can you cite a specific example?  Is this on the jms-new-parser  
branch?


Keep in mind that you need to supply a *perl* regexp (not a shell
regexp).  For example:

 argv = -i &find("coll_.+.ski", "input_files")

--
Jeff Squyres
Cisco Systems




--
Jeff Squyres
Cisco Systems



[OMPI devel] &find() broken?

2007-08-27 Thread Jeff Squyres

Ethan --

You said to me in IM:

"i'm getting stuck trying to use MTT::Functions::find. it's returning  
EVERY file under the directory i give it."


Can you cite a specific example?  Is this on the jms-new-parser branch?

Keep in mind that you need to supply a *perl* regexp (not a shell  
regexp).  For example:


argv = -i &find("coll_.+.ski", "input_files")
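
For contrast, a quick sketch of the difference (with a hypothetical loose
pattern shown for comparison):

  shell glob:    coll_*.ski
  perl regexp:   coll_.+\.ski      (escape "." to match a literal dot)
  too loose:     .                 (matches every filename)

If the pattern you pass effectively matches anything, &find() will dutifully
return every file under the directory.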

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Trunk issue?

2007-08-27 Thread Ralf Wildenhues
Hello,

* Jeff Squyres wrote on Mon, Aug 27, 2007 at 04:07:22PM CEST:
> On Aug 27, 2007, at 9:23 AM, Ralph H Castain wrote:
> >
> > Making all in mca/timer/darwin
> > make[2]: Nothing to be done for `all'.
> > Making all in .
> > make[2]: *** No rule to make target `../opal/libltdl/libltdlc.la',  
> > needed by
> > `libopen-pal.la'.  Stop.

> Yes, if you're using --disable-dlopen, then libltdlc should not be  
> linked in (because it [rightfully] won't exist).

FWIW, I can reproduce the error.  I don't yet know who's at fault
(but if it turns out to be Libtool, I'd appreciate a report).  While
looking, I noticed this unrelated nit in the configury.  I guess you
could try setting LIBLTDL to '' in the case where you don't want to
build it.
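
Something along these lines, perhaps -- a hypothetical sketch only, since
the variable and test names in the real configury may differ:

  # if dlopen support is disabled, don't link the libltdl convenience lib
  AS_IF([test "$enable_dlopen" = "no"],
        [LIBLTDL=''],
        [LIBLTDL='$(top_builddir)/opal/libltdl/libltdlc.la'])
  AC_SUBST([LIBLTDL])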

Cheers,
Ralf

Index: configure.ac
===================================================================
--- configure.ac        (revision 15970)
+++ configure.ac        (working copy)
@@ -1052,7 +1052,7 @@
 AC_EGREP_HEADER([lt_dladvise_init], [opal/libltdl/ltdl.h],
 [OPAL_HAVE_LTDL_ADVISE=1],
 [OPAL_HAVE_LTDL_ADVISE=0])
-CPPFLAGS="$CPPFLAGS"
+CPPFLAGS="$CPPFLAGS_save"

 # Arrgh.  This is gross.  But I can't think of any other way to do
 # it.  :-(


Re: [OMPI devel] Maximum Shared Memory Segment - OK to increase?

2007-08-27 Thread Richard Graham
Rolf,
  Would it be better to put this parameter in the system configuration file,
rather than change the compile-time option?
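For example -- a sketch, assuming the default install layout -- the same
value could live in <prefix>/etc/openmpi-mca-params.conf (or in a user's
~/.openmpi/mca-params.conf):

  mpool_sm_max_size = 2147483647

so every job on the machine picks it up without an extra -mca flag on each
mpirun command line.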

Rich



On 8/27/07 3:10 PM, "Rolf vandeVaart"  wrote:

> We are running into a problem when running on one of our larger SMPs
> using the latest Open MPI v1.2 branch.  We are trying to run a job
> with np=128 within a single node.  We are seeing the following error:
> 
> "SM failed to send message due to shortage of shared memory."
> 
> We then increased the allowable maximum size of the shared segment to
> 2 Gigabytes-1, which is the maximum allowed for a 32-bit application.  We
> used the mca parameter to increase it, as shown here:
> 
> -mca mpool_sm_max_size 2147483647
> 
> This allowed the program to run to completion.  Therefore, we would
> like to increase the default maximum from 512 Mbytes to 2 Gbytes-1.
> Does anyone have an objection to this change?  Soon we are going to
> have larger CPU counts and would like to increase the odds that things
> work "out of the box" on these large SMPs.
> 
> On a side note, I did a quick comparison of the shared memory needs of
> the old Sun ClusterTools to Open MPI and came up with this table.
>  
>                               Open MPI
>  np   Sun ClusterTools 6    current   suggested
> ------------------------------------------------
>   2          20M              128M        128M
>   4          20M              128M        128M
>   8          22M              256M        256M
>  16          27M              512M        512M
>  32          48M              512M          1G
>  64         133M              512M        2G-1
> 128         476M              512M        2G-1
> 



[OMPI devel] Maximum Shared Memory Segment - OK to increase?

2007-08-27 Thread Rolf vandeVaart

We are running into a problem when running on one of our larger SMPs
using the latest Open MPI v1.2 branch.  We are trying to run a job
with np=128 within a single node.  We are seeing the following error:

"SM failed to send message due to shortage of shared memory."

We then increased the allowable maximum size of the shared segment to
2 Gigabytes-1, which is the maximum allowed for a 32-bit application.  We
used the mca parameter to increase it, as shown here:

-mca mpool_sm_max_size 2147483647

This allowed the program to run to completion.  Therefore, we would
like to increase the default maximum from 512 Mbytes to 2 Gbytes-1.
Does anyone have an objection to this change?  Soon we are going to
have larger CPU counts and would like to increase the odds that things
work "out of the box" on these large SMPs.

On a side note, I did a quick comparison of the shared memory needs of
the old Sun ClusterTools to Open MPI and came up with this table.

                              Open MPI
 np   Sun ClusterTools 6    current   suggested
------------------------------------------------
  2          20M              128M        128M
  4          20M              128M        128M
  8          22M              256M        256M
 16          27M              512M        512M
 32          48M              512M          1G
 64         133M              512M        2G-1
128         476M              512M        2G-1



[OMPI devel] thread model

2007-08-27 Thread Greg Watson

Hi,

Until now I haven't had to worry about the opal/orte thread model.  
However, there are now people who would like to use ompi that has  
been configured with --with-threads=posix and --enable-mpi-threads.
Can someone give me some pointers as to what I need to do in
order to make sure I don't violate any threading model?


The interfaces I'm calling are:

opal_event_loop()
opal_path_findv()
orte_init()
orte_ns.create_process_name()
orte_iof.iof_subscribe()
orte_iof.iof_unsubscribe()
orte_schema.get_job_segment_name()
orte_gpr.get()
orte_dss.get()
orte_rml.send_buffer()
orte_rmgr.spawn_job()
orte_pls.terminate_job()
orte_rds.query()
orte_smr.job_stage_gate_subscribe()
orte_rmgr.get_vpid_range()

Thanks,

Greg



Re: [OMPI devel] MTT Database and Reporter Upgrade **Action Required**

2007-08-27 Thread Josh Hursey
Just wanted to let everyone know that the server upgrade went well.  
It is currently up and running. Feel free to submit your MTT tests as  
usual.


Cheers,
Josh

On Aug 24, 2007, at 1:45 PM, Jeff Squyres wrote:


FYI.  The MTT database will be down for a few hours on Monday
morning.  It'll be replaced with a much mo'better version -- [much]
faster than it was before.  Details below.


Begin forwarded message:


From: Josh Hursey 
Date: August 24, 2007 1:37:18 PM EDT
To: General user list for the MPI Testing Tool 
Subject: [MTT users] MTT Database and Reporter Upgrade **Action
Required**
Reply-To: General user list for the MPI Testing Tool 

Short Version:
--
The MTT development group is rolling out a newly optimized web frontend
and backend database. As a result, we will be taking down the MTT site
at IU on Monday, August 27 from 8 am to Noon US Eastern time.

During this time you will not be able to submit data to the MTT
database. Therefore you need to disable any runs that would report
during this time, or your client will fail with "unable to connect to
server" messages.

This change does not affect the client configurations, so MTT users
do *not* need to update their clients at this time.


Longer Version:
---
The MTT development team has been working diligently on server side
optimizations over the past few months. This work involved major
changes to the database schema, web reporter, and web submit
components of the server.

We want to roll out the new server side optimizations on Monday, Aug.
27. Given the extensive nature of the improvements, the MTT server
will need to be taken down for a few hours for this upgrade to take
place. We are planning on taking down the MTT server at 8 am, and
we hope to have it back by Noon US Eastern time.

MTT users that would normally submit results during this time range
will need to disable their runs, or they will see server error
messages during this outage.

This upgrade does not require any client changes, so outside of the
down time contributors need not change or upgrade their MTT
installations.

Below are a few rough performance numbers illustrating the difference
between the old and new server versions as seen by the reporter.

Summary report                                Old        New
-------------------------------------------------------------
24 hours, all orgs                          87 sec      6 sec
24 hours, org = 'iu'                        37 sec      4 sec
Past 3 days, all orgs                      138 sec      9 sec
Past 3 days, org = 'iu'                     49 sec     11 sec
Past 2 weeks, all orgs                     863 sec     34 sec
Past 2 weeks, org = 'iu'                   878 sec     12 sec
Past 1 month, all orgs                    1395 sec    158 sec
Past 1 month, org = 'iu'                  1069 sec     39 sec
2007-06-18 - 2007-06-19, all orgs          484 sec      5 sec
2007-06-18 - 2007-06-19, org = 'iu'        479 sec      2 sec




--
Jeff Squyres
Cisco Systems





Re: [OMPI devel] Trunk issue?

2007-08-27 Thread Jeff Squyres
Yes, if you're using --disable-dlopen, then libltdlc should not be  
linked in (because it [rightfully] won't exist).


I can reproduce the problem on my MBP.

Brian -- did something change here recently?


On Aug 27, 2007, at 9:23 AM, Ralph H Castain wrote:


Yo folks

Just checked out a fresh copy of the trunk and tried to build it using my
usual configure:

./configure --prefix=/Users/rhc/openmpi --with-devel-headers
--disable-shared --enable-static --disable-mpi-f77 --disable-mpi-f90
--enable-mem-debug --without-memory-manager --enable-debug
--disable-progress-threads --disable-mpi-threads --disable-io-romio
--without-threads --disable-dlopen


Got this error:

Making all in mca/timer/darwin
make[2]: Nothing to be done for `all'.
Making all in .
make[2]: *** No rule to make target `../opal/libltdl/libltdlc.la', needed by
`libopen-pal.la'.  Stop.
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1


It looks like some change may have broken one of these options?

Ralph






--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-27 Thread Tim Prins

Ralph,

Ralph H Castain wrote:

Just returned from vacation...sorry for delayed response

No problem -- hope you had a good vacation :)  And sorry for my super
delayed response; I have been pondering this a bit.


In the past, I have expressed three concerns about the RSL. 



My bottom line recommendation: I have no philosophical issue with the RSL
concept. However, I recommend holding off until the next version of ORTE is
completed and then re-evaluating to see how valuable the RSL might be, as
that next version will include memory footprint reduction and framework
consolidation that may yield much of the RSL's value without the extra work.


Long version:

1. What problem are we really trying to solve?
If the RSL is intended to solve the Cray support problem (where the Cray OS
really just wants to see OMPI, not ORTE), then it may have some value. The
issue to date has revolved around the difficulty of maintaining the Cray
port in the face of changes to ORTE - as new frameworks are added, special
components for Cray also need to be created to provide a "do-nothing"
capability. In addition, the Cray is memory constrained, and the ORTE
library occupies considerable space while providing very little
functionality.

This is definitely a motivation, but not the only one.


The degree of value provided by the RSL will therefore depend somewhat on the
efficacy of the changes in development within ORTE. Those changes will,
among other things, significantly consolidate and reduce the number of
frameworks, and reduce the memory footprint. The expectation is that the
result will require only a single CNOS component in one framework. It isn't
clear, therefore, that the RSL will provide a significant value in that
environment.

But won't there still be a lot of ORTE code linked in that will never be
used?

Also, an RSL would simplify ORTE in that there would be no need to do
anything special for CNOS in it.



If the RSL is intended to aid in ORTE development, as hinted at in the RFC,
then I believe that is questionable. Developing ORTE in a tmp branch has
proven reasonably effective as changes to the MPI layer are largely
invisible to ORTE. Creating another layer to the system that would also have
to be maintained seems like a non-productive way of addressing any problems
in that area.

Whether or not it would help in ORTE development remains to be seen; I
just say that it might.  I would argue, though, that developing in tmp
branches has caused a lot of problems with merging, etc.


If the RSL is intended as a means of "freezing" the MPI-RTE interface, then
I believe we could better attain that objective by simply defining a set of
requirements for the RTE. As I'll note below, freezing the interface at an
API level could negatively impact other Open MPI objectives.

It is intended to easily allow the development and use of other runtime
systems, so simply defining requirements is not enough.


2. Who is going to maintain old RTE versions, and why?
It isn't clear to me why anyone would want to do this - are we seriously
proposing that we maintain support for the ORTE layer that shipped with Open
MPI 1.0?? Can someone explain why we would want to do that?

I highly doubt anyone would, and see no reason to include support for
older runtime versions. Again, the purpose is to be able to run
different runtimes. The ability to run different versions of the same
runtime is just a side-effect.




3. Are we constraining ourselves from further improvements in startup
performance?
This is my biggest area of concern. The RSL has been proposed as an
API-level definition. However, the MPI-RTE interaction really is defined in
terms of a flow-of-control - although each point of interaction is
instantiated as an API, the fact is that what happens at that point is not
independent of all prior interactions.

As an example of my concern, consider what we are currently doing with ORTE.
The latest change in requirements involves the need to significantly improve
startup time, reduce memory footprint, and reduce ORTE complexity. What we
are doing to meet that requirement is to review the delineation of
responsibilities between the MPI and RTE layers. The current delineation
evolved over time, with many of the decisions made at a very early point in
the program. For example, we instituted RTE-level stage gates in the MPI
layer because, at the time they were needed, the MPI developers didn't want
to deal with them on their side (e.g., ensuring that failure of one proc
wouldn't hang the system). Given today's level of maturity in the MPI layer,
we are now planning on moving the stage gates to the MPI layer, implemented
as an "all-to-all" - this will remove several thousand lines of code from
ORTE and make it easier for the MPI layer to operate on non-ORTE
environments.
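
Conceptually, the replacement looks something like the following -- a
sketch only, written in plain MPI to illustrate the idea rather than the
actual internal implementation. Every process contributes a token to an
all-to-all style exchange; since the exchange cannot complete until all
processes have entered it, it doubles as the synchronization point the
stage gate used to provide:

  #include <mpi.h>
  #include <stdlib.h>

  /* conceptual sketch of a stage gate expressed as an all-to-all */
  static void stage_gate(MPI_Comm comm)
  {
      int token = 0, size;
      int *all_tokens;

      MPI_Comm_size(comm, &size);
      all_tokens = malloc(size * sizeof(int));

      /* returns only after every rank in comm has reached this point */
      MPI_Allgather(&token, 1, MPI_INT, all_tokens, 1, MPI_INT, comm);

      free(all_tokens);
  }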

Similar efforts are underway to reduce ORTE involvement in the modex
operation and other parts of the MPI application lifecycle. We are able to
do these things because we are now

[OMPI devel] Trunk issue?

2007-08-27 Thread Ralph H Castain
Yo folks

Just checked out a fresh copy of the trunk and tried to build it using my
usual configure:

./configure --prefix=/Users/rhc/openmpi --with-devel-headers
--disable-shared --enable-static --disable-mpi-f77 --disable-mpi-f90
--enable-mem-debug --without-memory-manager --enable-debug
--disable-progress-threads --disable-mpi-threads --disable-io-romio
--without-threads --disable-dlopen


Got this error:

Making all in mca/timer/darwin
make[2]: Nothing to be done for `all'.
Making all in .
make[2]: *** No rule to make target `../opal/libltdl/libltdlc.la', needed by
`libopen-pal.la'.  Stop.
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1


It looks like some change may have broken one of these options?

Ralph





Re: [OMPI devel] Minor bug: sattach gives bad advice

2007-08-27 Thread Manuel Prinz
On Monday, 2007-08-27 at 08:07 -0400, Jeff Squyres wrote:
> Did you mean to send this to the SLURM list?
> :-)

Yes, I did. Sorry! It's one of those days... :-/

Best regards
Manuel



Re: [OMPI devel] Minor bug: sattach gives bad advice

2007-08-27 Thread Jeff Squyres

Did you mean to send this to the SLURM list?

:-)


On Aug 27, 2007, at 4:46 AM, Manuel Prinz wrote:


Hi everyone,

I noticed a very minor issue with sattach: if you pass an option it
doesn't understand, it asks you to look at "sbatch --help", which is a
little confusing:

$ sattach -X
sattach: invalid option -- X
Try "sbatch --help" for more information

I didn't find the right place in the source to provide a patch, sorry!
(And I hope this is the right list for bugs.)

Best regards
Manuel




--
Jeff Squyres
Cisco Systems



[OMPI devel] Minor bug: sattach gives bad advice

2007-08-27 Thread Manuel Prinz
Hi everyone,

I noticed a very minor issue with sattach: if you pass an option it
doesn't understand, it asks you to look at "sbatch --help", which is a
little confusing:

$ sattach -X
sattach: invalid option -- X
Try "sbatch --help" for more information

I didn't find the right place in the source to provide a patch, sorry!
(And I hope this is the right list for bugs.)

Best regards
Manuel