Re: [OMPI devel] More memory troubles with vapi
On Aug 24, 2007, at 11:05 PM, Josh Aune wrote:

>> Hmm. If you compile Open MPI with no memory manager, then it
>> *shouldn't* be Open MPI's fault (unless there's a leak in the mvapi
>> BTL...?). Verify that you did not actually compile Open MPI with a
>> memory manager by running "ompi_info | grep ptmalloc2" -- it should
>> come up empty.
>
> I am sure. I have multiple builds that I switch between. One of the
> apps doesn't work unless I --without-memory-manager (see post to
> -users about realloc(), with sample code).

Ok.

> I noticed that there are a few ./configure --debug type switches,
> even some dealing with memory. Could those be useful for gathering
> further data? What features do those provide and how do I use them?

If you use --enable-mem-debug, it forces all internal calls to malloc(), free(), and calloc() to go through our own internal functions, but those mainly just check that we don't pass bad parameters such as NULL, etc. I suppose you could put in some memory profiling or something, but that would probably get pretty sticky. :-(

>> The fact that you can run this under TCP without memory leaking
>> would seem to indicate that it's not the app that's leaking memory,
>> but rather either the MPI or the network stack.
>
> I should clarify here: this is effectively true. The app crashes from
> a segfault after running over tcp for several hours, but it gets much
> farther into the run than the vapi btl does.

Yuck. :-( I assume there's no easy way to track this down -- do you get a corefile? Can you see where the app died -- are there any obvious indexes going out of range of array bounds, etc.? Is it in MPI or in the application?

--
Jeff Squyres
Cisco Systems
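The parameter checking Jeff describes (routing allocation calls through internal wrappers that reject obviously bad arguments) can be sketched as follows. This is a Python analogy of the wrapper idea only, not Open MPI's actual --enable-mem-debug code; all names here are hypothetical.

```python
# Hypothetical sketch of the parameter-checking wrapper idea behind
# --enable-mem-debug: every allocation/free call goes through a wrapper
# that validates its arguments first. Python analogy, not OMPI code.

def checked_alloc(size, file="?", line=0):
    """Refuse an obviously bad size instead of passing it through."""
    if size <= 0:
        raise ValueError(f"bad alloc size {size} at {file}:{line}")
    return bytearray(size)

def checked_free(buf, file="?", line=0):
    """Report a free(NULL)-style call; a real wrapper would log it."""
    if buf is None:
        return f"WARNING: free(NULL) at {file}:{line}"
    return "ok"

print(checked_free(checked_alloc(16), "app.c", 42))  # ok
print(checked_free(None, "app.c", 99))               # WARNING: free(NULL) at app.c:99
```

As Jeff notes, this catches bad parameters but is not a leak profiler; a leak would sail through these checks untouched.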
Re: [OMPI devel] thread model
On Aug 27, 2007, at 2:50 PM, Greg Watson wrote:

> Until now I haven't had to worry about the opal/orte thread model.
> However, there are now people who would like to use ompi that has
> been configured with --with-threads=posix and
> --with-enable-mpi-threads. Can someone give me some pointers as to
> what I need to do in order to make sure I don't violate any
> threading model?

Note that this is *NOT* well tested. There is work going on right now to make the OMPI layer be able to support MPI_THREAD_MULTIPLE (support was designed in from the beginning, but we haven't ever done any kind of comprehensive testing/stressing of multi-thread support, such that it is pretty much guaranteed not to work), but it is occurring on the trunk (i.e., what will eventually become v1.3) -- not the v1.2 branch.

> The interfaces I'm calling are:
>
> opal_event_loop()

Brian or George will have to answer about that one...

> opal_path_findv()

This guy should be multi-thread safe (disclaimer: haven't tested it myself); it doesn't rely on any global state.

> orte_init()
> orte_ns.create_process_name()
> orte_iof.iof_subscribe()
> orte_iof.iof_unsubscribe()
> orte_schema.get_job_segment_name()
> orte_gpr.get()
> orte_dss.get()
> orte_rml.send_buffer()
> orte_rmgr.spawn_job()
> orte_pls.terminate_job()
> orte_rds.query()
> orte_smr.job_stage_gate_subscribe()
> orte_rmgr.get_vpid_range()

Note that all of ORTE is *NOT* thread safe, nor is it planned to be (it just seemed way more trouble than it was worth). You need to serialize access to it.

--
Jeff Squyres
Cisco Systems
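The serialization Jeff recommends (one lock guarding every call into the non-thread-safe layer) can be sketched like this. This is a hedged Python analogy with stand-in names; ORTE itself is C and the real interfaces are the orte_* calls listed above.

```python
# Minimal sketch of serializing access to a non-thread-safe layer, as
# recommended for ORTE. FakeORTE is a stand-in; the lock is the point.
import threading

class SerializedRTE:
    """Wrap a non-thread-safe runtime so every call holds one lock."""
    def __init__(self, rte):
        self._rte = rte
        self._lock = threading.Lock()

    def call(self, name, *args):
        with self._lock:   # only one thread inside the runtime at a time
            return getattr(self._rte, name)(*args)

class FakeORTE:
    """Stand-in for the real interfaces (not thread safe itself)."""
    def __init__(self):
        self.counter = 0
    def create_process_name(self):
        self.counter += 1  # unsynchronized read-modify-write
        return self.counter

rte = SerializedRTE(FakeORTE())
threads = [threading.Thread(target=lambda: [rte.call("create_process_name")
                                            for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(rte._rte.counter)  # 4000 -- no updates lost behind the lock
```

The same shape in C would be a pthread mutex taken around each orte_* call site.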
Re: [OMPI devel] &find() broken?
Whoops -- wrong list; meant to send this to mtt-devel... sorry folks... nothing to see here...

On Aug 27, 2007, at 7:38 PM, Jeff Squyres wrote:

> Ethan --
>
> You said to me in IM: "i'm getting stuck trying to use
> MTT::Functions::find. it's returning EVERY file under the directory
> i give it."
>
> Can you cite a specific example? Is this on the jms-new-parser
> branch?
>
> Keep in mind that you need to supply a *perl* regexp (not a shell
> regexp). For example:
>
> argv = -i &find("coll_.+.ski", "input_files")
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Jeff Squyres
Cisco Systems
[OMPI devel] &find() broken?
Ethan --

You said to me in IM: "i'm getting stuck trying to use MTT::Functions::find. it's returning EVERY file under the directory i give it."

Can you cite a specific example? Is this on the jms-new-parser branch?

Keep in mind that you need to supply a *perl* regexp (not a shell regexp). For example:

argv = -i &find("coll_.+.ski", "input_files")

--
Jeff Squyres
Cisco Systems
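The regexp-vs-glob distinction Jeff points out can be illustrated with a short sketch. This is a Python analogy (MTT's &find() itself takes a perl regexp); the filenames are made up, and the regexp escapes the literal dot, which Jeff's unescaped example also tolerates since `.` matches the dot too.

```python
# Sketch of the glob-vs-regexp distinction: 'coll_*.ski' (shell glob)
# and 'coll_.+\.ski' (perl-style regexp) express the same intent in
# different syntaxes, and mixing them up misbehaves silently.
import fnmatch
import re

names = ["coll_bcast.ski", "coll_allreduce.ski", "pt2pt_latency.ski"]

# Shell-style glob: '*' is a wildcard.
glob_hits = fnmatch.filter(names, "coll_*.ski")

# Regexp, as &find() expects: '.+' means "one or more of any character".
regex_hits = [n for n in names if re.search(r"coll_.+\.ski", n)]

# Feeding the glob string in AS a regexp matches nothing here, because
# in a regexp 'coll_*' means "coll" followed by zero or more underscores.
bad = [n for n in names if re.search(r"coll_*.ski", n)]

print(glob_hits)   # ['coll_bcast.ski', 'coll_allreduce.ski']
print(regex_hits)  # the same two files
print(bad)         # []
```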
Re: [OMPI devel] Trunk issue?
Hello,

* Jeff Squyres wrote on Mon, Aug 27, 2007 at 04:07:22PM CEST:
> On Aug 27, 2007, at 9:23 AM, Ralph H Castain wrote:
> >
> > Making all in mca/timer/darwin
> > make[2]: Nothing to be done for `all'.
> > Making all in .
> > make[2]: *** No rule to make target `../opal/libltdl/libltdlc.la',
> > needed by `libopen-pal.la'. Stop.
>
> Yes, if you're using --disable-dlopen, then libltdlc should not be
> linked in (because it [rightfully] won't exist).

FWIW, I can reproduce the error. I don't yet know who's at fault (but if it turns out to be Libtool, I'd appreciate a report), but I noted this unrelated nit in the configury. I guess you could try setting LIBLTDL to '' in the case where you don't want to build it.

Cheers,
Ralf

Index: configure.ac
===================================================================
--- configure.ac	(revision 15970)
+++ configure.ac	(working copy)
@@ -1052,7 +1052,7 @@
 AC_EGREP_HEADER([lt_dladvise_init], [opal/libltdl/ltdl.h],
                 [OPAL_HAVE_LTDL_ADVISE=1], [OPAL_HAVE_LTDL_ADVISE=0])
-CPPFLAGS="$CPPFLAGS"
+CPPFLAGS="$CPPFLAGS_save"
 
 # Arrgh.  This is gross.  But I can't think of any other way to do
 # it.  :-(
Re: [OMPI devel] Maximum Shared Memory Segment - OK to increase?
Rolf,

Would it be better to put this parameter in the system configuration file, rather than change the compile time option?

Rich

On 8/27/07 3:10 PM, "Rolf vandeVaart" wrote:

> We are running into a problem when running on one of our larger SMPs
> using the latest Open MPI v1.2 branch. We are trying to run a job
> with np=128 within a single node. We are seeing the following error:
>
> "SM failed to send message due to shortage of shared memory."
>
> We then increased the allowable maximum size of the shared segment to
> 2 gigabytes - 1, which is the maximum allowed for a 32-bit
> application. We used the mca parameter to increase it as shown here:
>
> -mca mpool_sm_max_size 2147483647
>
> This allowed the program to run to completion. Therefore, we would
> like to increase the default maximum from 512 megabytes to 2G-1.
> Does anyone have an objection to this change? Soon we are going to
> have larger CPU counts and would like to increase the odds that
> things work "out of the box" on these large SMPs.
>
> On a side note, I did a quick comparison of the shared memory needs
> of the old Sun ClusterTools to Open MPI and came up with this table:
>
>                               Open MPI
>  np   Sun ClusterTools 6   current   suggested
> -----------------------------------------------
>   2          20M             128M       128M
>   4          20M             128M       128M
>   8          22M             256M       256M
>  16          27M             512M       512M
>  32          48M             512M         1G
>  64         133M             512M       2G-1
> 128         476M             512M       2G-1
[OMPI devel] Maximum Shared Memory Segment - OK to increase?
We are running into a problem when running on one of our larger SMPs using the latest Open MPI v1.2 branch. We are trying to run a job with np=128 within a single node. We are seeing the following error:

"SM failed to send message due to shortage of shared memory."

We then increased the allowable maximum size of the shared segment to 2 gigabytes - 1, which is the maximum allowed for a 32-bit application. We used the mca parameter to increase it as shown here:

-mca mpool_sm_max_size 2147483647

This allowed the program to run to completion. Therefore, we would like to increase the default maximum from 512 megabytes to 2G-1. Does anyone have an objection to this change? Soon we are going to have larger CPU counts and would like to increase the odds that things work "out of the box" on these large SMPs.

On a side note, I did a quick comparison of the shared memory needs of the old Sun ClusterTools to Open MPI and came up with this table:

                              Open MPI
 np   Sun ClusterTools 6   current   suggested
-----------------------------------------------
  2          20M             128M       128M
  4          20M             128M       128M
  8          22M             256M       256M
 16          27M             512M       512M
 32          48M             512M         1G
 64         133M             512M       2G-1
128         476M             512M       2G-1
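For context on the 2147483647 figure: it is 2^31 - 1, the largest signed 32-bit value, hence "the maximum allowed for a 32-bit application". A quick arithmetic check (the per-process split at the end is illustrative only; Open MPI's actual sm accounting is more involved):

```python
# 2147483647 is 2**31 - 1, the largest signed 32-bit integer, so it is
# the biggest shared-segment size a 32-bit application can request.
max_sm = 2**31 - 1
print(max_sm)      # 2147483647

# Rough even split at np=128 (illustrative arithmetic only, not
# OMPI's actual shared-memory layout):
per_proc = max_sm // 128
print(per_proc)    # just under 16 MiB per process
```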
[OMPI devel] thread model
Hi,

Until now I haven't had to worry about the opal/orte thread model. However, there are now people who would like to use ompi that has been configured with --with-threads=posix and --with-enable-mpi-threads. Can someone give me some pointers as to what I need to do in order to make sure I don't violate any threading model?

The interfaces I'm calling are:

opal_event_loop()
opal_path_findv()
orte_init()
orte_ns.create_process_name()
orte_iof.iof_subscribe()
orte_iof.iof_unsubscribe()
orte_schema.get_job_segment_name()
orte_gpr.get()
orte_dss.get()
orte_rml.send_buffer()
orte_rmgr.spawn_job()
orte_pls.terminate_job()
orte_rds.query()
orte_smr.job_stage_gate_subscribe()
orte_rmgr.get_vpid_range()

Thanks,
Greg
Re: [OMPI devel] MTT Database and Reporter Upgrade **Action Required**
Just wanted to let everyone know that the server upgrade went well. It is currently up and running. Feel free to submit your MTT tests as usual.

Cheers,
Josh

On Aug 24, 2007, at 1:45 PM, Jeff Squyres wrote:

FYI. The MTT database will be down for a few hours on Monday morning. It'll be replaced with a much mo'better version -- [much] faster than it was before. Details below.

Begin forwarded message:

From: Josh Hursey
Date: August 24, 2007 1:37:18 PM EDT
To: General user list for the MPI Testing Tool
Subject: [MTT users] MTT Database and Reporter Upgrade **Action Required**
Reply-To: General user list for the MPI Testing Tool

Short Version:
--------------
The MTT development group is rolling out a newly optimized web frontend and backend database. As a result, we will be taking down the MTT site at IU on Monday, August 27 from 8 am to noon US Eastern time. During this time you will not be able to submit data to the MTT database. Therefore you need to disable any runs that will report during this time, or your client will fail with "unable to connect to server" messages.

This change does not affect the client configurations, so MTT users do *not* need to update their clients at this time.

Longer Version:
---------------
The MTT development team has been working diligently on server-side optimizations over the past few months. This work involved major changes to the database schema, web reporter, and web submit components of the server.

We want to roll out the new server-side optimizations on Monday, Aug. 27. Given the extensive nature of the improvements, the MTT server will need to be taken down for a few hours for this upgrade to take place. We are planning on taking down the MTT server at 8 am and we hope to have it back by noon US Eastern time. MTT users that would normally submit results during this time range will need to disable their runs, or they will see server error messages during this outage.
This upgrade does not require any client changes, so outside of the down time, contributors need not change or upgrade their MTT installations.

Below are a few rough performance numbers illustrating the difference between the old and new server versions as seen by the reporter:

Summary report: 24 hours, all orgs
   87 sec - old version
    6 sec - new version
Summary report: 24 hours, org = 'iu'
   37 sec - old
    4 sec - new
Summary report: Past 3 days, all orgs
  138 sec - old
    9 sec - new
Summary report: Past 3 days, org = 'iu'
   49 sec - old
   11 sec - new
Summary report: Past 2 weeks, all orgs
  863 sec - old
   34 sec - new
Summary report: Past 2 weeks, org = 'iu'
  878 sec - old
   12 sec - new
Summary report: Past 1 month, all orgs
 1395 sec - old
  158 sec - new
Summary report: Past 1 month, org = 'iu'
 1069 sec - old
   39 sec - new
Summary report: (2007-06-18 - 2007-06-19), all orgs
  484 sec - old
    5 sec - new
Summary report: (2007-06-18 - 2007-06-19), org = 'iu'
  479 sec - old
    2 sec - new

___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Trunk issue?
Yes, if you're using --disable-dlopen, then libltdlc should not be linked in (because it [rightfully] won't exist). I can reproduce the problem on my MBP. Brian -- did something change here recently?

On Aug 27, 2007, at 9:23 AM, Ralph H Castain wrote:

> Yo folks
>
> Just checked out a fresh copy of the trunk and tried to build it
> using my usual configure:
>
> ./configure --prefix=/Users/rhc/openmpi --with-devel-headers \
>     --disable-shared --enable-static --disable-mpi-f77 --disable-mpi-f90 \
>     --enable-mem-debug --without-memory-manager --enable-debug \
>     --disable-progress-threads --disable-mpi-threads --disable-io-romio \
>     --without-threads --disable-dlopen
>
> Got this error:
>
> Making all in mca/timer/darwin
> make[2]: Nothing to be done for `all'.
> Making all in .
> make[2]: *** No rule to make target `../opal/libltdl/libltdlc.la',
> needed by `libopen-pal.la'. Stop.
> make[1]: *** [all-recursive] Error 1
> make: *** [all-recursive] Error 1
>
> It looks like some change may have broken one of these options?
>
> Ralph

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer
Ralph,

Ralph H Castain wrote:

> Just returned from vacation...sorry for delayed response

No problem. Hope you had a good vacation :) And sorry for my super delayed response.

> I have been pondering this a bit. In the past, I have expressed three
> concerns about the RSL.
>
> My bottom line recommendation: I have no philosophical issue with the
> RSL concept. However, I recommend holding off until the next version
> of ORTE is completed and then re-evaluating to see how valuable the
> RSL might be, as that next version will include memory footprint
> reduction and framework consolidation that may yield much of the
> RSL's value without the extra work.
>
> Long version:
>
> 1. What problem are we really trying to solve?
>
> If the RSL is intended to solve the Cray support problem (where the
> Cray OS really just wants to see OMPI, not ORTE), then it may have
> some value. The issue to date has revolved around the difficulty of
> maintaining the Cray port in the face of changes to ORTE - as new
> frameworks are added, special components for Cray also need to be
> created to provide a "do-nothing" capability. In addition, the Cray
> is memory constrained, and the ORTE library occupies considerable
> space while providing very little functionality.

This is definitely a motivation, but not the only one.

> The degree of value provided by the RSL will therefore depend
> somewhat on the efficacy of the changes in development within ORTE.
> Those changes will, among other things, significantly consolidate and
> reduce the number of frameworks, and reduce the memory footprint. The
> expectation is that the result will require only a single CNOS
> component in one framework. It isn't clear, therefore, that the RSL
> will provide a significant value in that environment.

But won't there still be a lot of orte code linked in that will never be used? Also, an RSL would simplify ORTE in that there would be no need to do anything special for CNOS in it.
> If the RSL is intended to aid in ORTE development, as hinted at in
> the RFC, then I believe that is questionable. Developing ORTE in a
> tmp branch has proven reasonably effective, as changes to the MPI
> layer are largely invisible to ORTE. Creating another layer to the
> system that would also have to be maintained seems like a
> non-productive way of addressing any problems in that area.

Whether or not it would help in orte development remains to be seen; I just say that it might. Although I would argue that developing in tmp branches has caused a lot of problems with merging, etc.

> If the RSL is intended as a means of "freezing" the MPI-RTE
> interface, then I believe we could better attain that objective by
> simply defining a set of requirements for the RTE. As I'll note
> below, freezing the interface at an API level could negatively
> impact other Open MPI objectives.

It is intended to easily allow the development and use of other runtime systems, so simply defining requirements is not enough.

> 2. Who is going to maintain old RTE versions, and why?
>
> It isn't clear to me why anyone would want to do this - are we
> seriously proposing that we maintain support for the ORTE layer that
> shipped with Open MPI 1.0?? Can someone explain why we would want to
> do that?

I highly doubt anyone would, and see no reason to include support for older runtime versions. Again, the purpose is to be able to run different runtimes. The ability to run different versions of the same runtime is just a side-effect.

> 3. Are we constraining ourselves from further improvements in startup
> performance?
>
> This is my biggest area of concern. The RSL has been proposed as an
> API-level definition. However, the MPI-RTE interaction really is
> defined in terms of a flow-of-control - although each point of
> interaction is instantiated as an API, the fact is that what happens
> at that point is not independent of all prior interactions. As an
> example of my concern, consider what we are currently doing with
> ORTE.
> The latest change in requirements involves the need to significantly
> improve startup time, reduce memory footprint, and reduce ORTE
> complexity. What we are doing to meet that requirement is to review
> the delineation of responsibilities between the MPI and RTE layers.
> The current delineation evolved over time, with many of the decisions
> made at a very early point in the program. For example, we instituted
> RTE-level stage gates in the MPI layer because, at the time they were
> needed, the MPI developers didn't want to deal with them on their
> side (e.g., ensuring that failure of one proc wouldn't hang the
> system). Given today's level of maturity in the MPI layer, we are now
> planning on moving the stage gates to the MPI layer, implemented as
> an "all-to-all" - this will remove several thousand lines of code
> from ORTE and make it easier for the MPI layer to operate in
> non-ORTE environments. Similar efforts are underway to reduce ORTE
> involvement in the modex operation and other parts of the MPI
> application lifecycle. We are able to do these things because we are
> now
[OMPI devel] Trunk issue?
Yo folks

Just checked out a fresh copy of the trunk and tried to build it using my usual configure:

./configure --prefix=/Users/rhc/openmpi --with-devel-headers \
    --disable-shared --enable-static --disable-mpi-f77 --disable-mpi-f90 \
    --enable-mem-debug --without-memory-manager --enable-debug \
    --disable-progress-threads --disable-mpi-threads --disable-io-romio \
    --without-threads --disable-dlopen

Got this error:

Making all in mca/timer/darwin
make[2]: Nothing to be done for `all'.
Making all in .
make[2]: *** No rule to make target `../opal/libltdl/libltdlc.la', needed by
`libopen-pal.la'. Stop.
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

It looks like some change may have broken one of these options?

Ralph
Re: [OMPI devel] Minor bug: sattach gives bad advice
On Monday, 27 Aug 2007 at 08:07 -0400, Jeff Squyres wrote:
> Did you mean to send this to the SLURM list?
> :-)

Yes, I did. Sorry! It's one of those days... :-/

Best regards
Manuel
Re: [OMPI devel] Minor bug: sattach gives bad advice
Did you mean to send this to the SLURM list? :-)

On Aug 27, 2007, at 4:46 AM, Manuel Prinz wrote:

> Hi everyone,
>
> I noticed a very minor issue with sattach: If you pass an option it
> doesn't understand, it asks you to look at "sbatch --help", which is
> a little confusing:
>
> $ sattach -X
> sattach: invalid option -- X
> Try "sbatch --help" for more information
>
> I didn't find the right place in the source to provide a patch,
> sorry! (And I hope this is the right list for bugs.)
>
> Best regards
> Manuel

--
Jeff Squyres
Cisco Systems
[OMPI devel] Minor bug: sattach gives bad advice
Hi everyone,

I noticed a very minor issue with sattach: If you pass an option it doesn't understand, it asks you to look at "sbatch --help", which is a little confusing:

$ sattach -X
sattach: invalid option -- X
Try "sbatch --help" for more information

I didn't find the right place in the source to provide a patch, sorry! (And I hope this is the right list for bugs.)

Best regards
Manuel