This apparently got lost in the shuffle a few months ago. The fix allows one to kick off all of their MPI Installs and Test Builds in parallel. Give it a try when you have a chance.
-Ethan > On Sat, Nov/07/2009 04:15:42PM, Jeff Squyres wrote: > > On Nov 6, 2009, at 5:18 PM, Ethan Mallove wrote: > > > >> I'm running multiple MTT clients out of the same scratch directory > >> using SGE. I'm running into race conditions between the multiple > >> clients, where one client is overwriting another's data in the .dump > >> files - which is a Very Bad Thing(tm). I'm running the > >> client/mtt-lock-server, and I've added the corresponding [Lock] > >> section in my INI file. Will my MTT clients now not interfere with > >> each other's .dump files? I'm skeptical of this because I don't see, > >> e.g., Lock() calls in SaveRuns(). How do I make my .dump files safe? > >> > > > > > > Err... perhaps this part wasn't tested well...? > > > > I'm afraid it's been forever since I've looked at this code and I'm gearing > > up to leave for the Forum on Tuesday and then staying on for SC09, so it's > > quite likely that you'll be able to look at this in more detail before I > > will. Sorry to pass the buck; just trying to be realistic... :-( > > After some digging, I discover that MTT is not designed to execute > multiple INI sections out of a single scratch directory in parallel. > There's a ticket for this: > > https://svn.open-mpi.org/trac/mtt/ticket/167 > > The way around this limitation is to have MTT split up the .dump files > by INI section so that two MTT client running simultaneously never > conflict with each other. (This change did not need to be made for the > Test run .dump files, as MTT already splits them up.) I have attached > a patch, which makes a simple wrapper script for #167 possible. The > changes should not disrupt normal (non-parallel) execution. Anyone > care to give it a try? > > -Ethan > > > > > -- > > Jeff Squyres > > jsquy...@cisco.com > > > > _______________________________________________ > > mtt-users mailing list > > mtt-us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users
diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/MPI.pm --- a/lib/MTT/MPI.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/MPI.pm Wed Nov 18 11:07:37 2009 -0500 @@ -16,6 +16,8 @@ use strict; use MTT::Files; +use MTT::Messages; +use MTT::Util; #-------------------------------------------------------------------------- @@ -28,10 +30,13 @@ #-------------------------------------------------------------------------- # Filename where list of MPI sources is kept -my $sources_data_filename = "mpi_sources.dump"; +my $sources_data_filename = "mpi_sources"; # Filename where list of MPI installs is kept -my $installs_data_filename = "mpi_installs.dump"; +my $installs_data_filename = "mpi_installs"; + +# Filename extension for all the Dumper data files +my $data_filename_extension = "dump"; #-------------------------------------------------------------------------- @@ -42,10 +47,15 @@ # Explicitly delete anything that was there $MTT::MPI::sources = undef; - # If the file exists, read it in - my $data; - MTT::Files::load_dumpfile("$dir/$sources_data_filename", \$data); - $MTT::MPI::sources = $data->{VAR1}; + my @dumpfiles = glob("$dir/$sources_data_filename-*.$data_filename_extension"); + foreach my $dumpfile (@dumpfiles) { + + # If the file exists, read it in + my $data; + MTT::Files::load_dumpfile($dumpfile, \$data); + $MTT::MPI::sources = MTT::Util::merge_hashes($MTT::MPI::sources, $data->{VAR1}); + + } # Rebuild the refcounts foreach my $get_key (keys(%{$MTT::MPI::sources})) { @@ -62,9 +72,14 @@ #-------------------------------------------------------------------------- sub SaveSources { - my ($dir) = @_; + my ($dir, $name) = @_; - MTT::Files::save_dumpfile("$dir/$sources_data_filename", + # We write the entire MPI::sources hash to file, even + # though the filename indicates a single INI section + # MTT::Util::hashes_merge will take care of duplicate + # hash keys. The reason for splitting up the .dump files + # is to keep them read and write safe across INI sections + MTT::Files::save_dumpfile("$dir/$sources_data_filename-$name.$data_filename_extension", $MTT::MPI::sources); } @@ -76,10 +91,14 @@ # Explicitly delete anything that was there $MTT::MPI::installs = undef; - # If the file exists, read it in - my $data; - MTT::Files::load_dumpfile("$dir/$installs_data_filename", \$data); - $MTT::MPI::installs = $data->{VAR1}; + my @dumpfiles = glob("$dir/$installs_data_filename-*.$data_filename_extension"); + foreach my $dumpfile (@dumpfiles) { + + # If the file exists, read it in + my $data; + MTT::Files::load_dumpfile($dumpfile, \$data); + $MTT::MPI::installs = MTT::Util::merge_hashes($MTT::MPI::installs, $data->{VAR1}); + } # Rebuild the refcounts foreach my $get_key (keys(%{$MTT::MPI::installs})) { @@ -106,9 +125,14 @@ #-------------------------------------------------------------------------- sub SaveInstalls { - my ($dir) = @_; + my ($dir, $name) = @_; - MTT::Files::save_dumpfile("$dir/$installs_data_filename", + # We write the entire MPI::installs hash to file, even + # though the filename indicates a single INI section. + # MTT::Util::hashes_merge will take care of duplicate + # hash keys. The reason for splitting up the .dump files + # is to keep them read and write safe across INI sections + MTT::Files::save_dumpfile("$dir/$installs_data_filename-$name.$data_filename_extension", $MTT::MPI::installs); } diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/MPI/Get.pm --- a/lib/MTT/MPI/Get.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/MPI/Get.pm Wed Nov 18 11:07:37 2009 -0500 @@ -187,7 +187,7 @@ $MTT::MPI::sources->{$simple_section}->{$ret->{version}} = $ret; # Save the data file recording all the sources - MTT::MPI::SaveSources($source_dir); + MTT::MPI::SaveSources($source_dir, $MTT::Globals::Internals->{mpi_get_name}); } else { Verbose(" No new MPI sources\n"); } diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/MPI/Install.pm --- a/lib/MTT/MPI/Install.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/MPI/Install.pm Wed Nov 18 11:07:37 2009 -0500 @@ -794,7 +794,9 @@ $ret->{$k} = $serials->{$module}->{$k}; } - MTT::MPI::SaveInstalls($install_base); + MTT::MPI::SaveInstalls($install_base, + $MTT::Globals::Internals->{mpi_get_name} . "." . + $MTT::Globals::Internals->{mpi_install_name}); # Successful build? if (MTT::Values::PASS == $ret->{test_result}) { diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/Test.pm --- a/lib/MTT/Test.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/Test.pm Wed Nov 18 11:07:37 2009 -0500 @@ -17,6 +17,7 @@ use MTT::Files; use MTT::Messages; use MTT::DoCommand; +use MTT::Util; use Data::Dumper; #-------------------------------------------------------------------------- @@ -33,17 +34,20 @@ #-------------------------------------------------------------------------- +# Filename extension for all the Dumper data files +my $data_filename_extension = "dump"; + # Filename where list of test sources information is kept -my $sources_data_filename = "test_sources.dump"; +my $sources_data_filename = "test_sources"; # Filename where list of test build information is kept -my $builds_data_filename = "test_builds.dump"; +my $builds_data_filename = "test_builds"; # Subdir where test runs are kept my $runs_subdir = "test_runs"; # Filename where list of test run information is kept -my $runs_data_filename = "test_runs.dump"; +my $runs_data_filename = "test_runs.$data_filename_extension"; # Helper variable for when we're loading test run data my $load_run_file_start_dir; @@ -56,11 +60,14 @@ # Explicitly delete anything that was there $MTT::Test::sources = undef; - # If the file exists, read it in - my $data; - MTT::Files::load_dumpfile("$dir/$sources_data_filename", \$data); - $MTT::Test::sources = $data->{VAR1}; + my @dumpfiles = glob("$dir/$sources_data_filename-*.$data_filename_extension"); + foreach my $dumpfile (@dumpfiles) { + # If the file exists, read it in + my $data; + MTT::Files::load_dumpfile($dumpfile, \$data); + $MTT::Test::sources = MTT::Util::merge_hashes($MTT::Test::sources, $data->{VAR1}); + } # Rebuild the refcounts foreach my $test_key (keys(%{$MTT::Test::sources})) { @@ -74,9 +81,14 @@ #-------------------------------------------------------------------------- sub SaveSources { - my ($dir) = @_; + my ($dir, $name) = @_; - MTT::Files::save_dumpfile("$dir/$sources_data_filename", + # We write the entire Test::sources hash to file, even + # though the filename indicates a single INI section + # MTT::Util::hashes_merge will take care of duplicate + # hash keys. The reason for splitting up the .dump files + # is to keep them read and write safe across INI sections + MTT::Files::save_dumpfile("$dir/$sources_data_filename-$name.$data_filename_extension", $MTT::Test::sources); } @@ -88,10 +100,14 @@ # Explicitly delete anything that was there $MTT::Test::builds = undef; - # If the file exists, read it in - my $data; - MTT::Files::load_dumpfile("$dir/$builds_data_filename", \$data); - $MTT::Test::builds = $data->{VAR1}; + my @dumpfiles = glob("$dir/$builds_data_filename-*.$data_filename_extension"); + foreach my $dumpfile (@dumpfiles) { + + # If the file exists, read it in + my $data; + MTT::Files::load_dumpfile($dumpfile, \$data); + $MTT::Test::builds = MTT::Util::merge_hashes($MTT::Test::builds, $data->{VAR1}); + } # Rebuild the refcounts foreach my $get_key (keys(%{$MTT::Test::builds})) { @@ -129,9 +145,14 @@ #-------------------------------------------------------------------------- sub SaveBuilds { - my ($dir) = @_; + my ($dir, $name) = @_; - MTT::Files::save_dumpfile("$dir/$builds_data_filename", + # We write the entire Test::builds hash to file, even + # though the filename indicates a single INI section + # MTT::Util::hashes_merge will take care of duplicate + # hash keys. The reason for splitting up the .dump files + # is to keep them read and write safe across INI sections + MTT::Files::save_dumpfile("$dir/$builds_data_filename-$name.$data_filename_extension", $MTT::Test::builds); } diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/Test/Build.pm --- a/lib/MTT/Test/Build.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/Test/Build.pm Wed Nov 18 11:07:37 2009 -0500 @@ -552,10 +552,15 @@ # memory... delete $ret->{result_stdout}; delete $ret->{result_stderr}; - + # Save it $MTT::Test::builds->{$mpi_install->{mpi_get_simple_section_name}}->{$mpi_install->{mpi_version}}->{$mpi_install->{simple_section_name}}->{$simple_section} = $ret; - MTT::Test::SaveBuilds($build_base); + MTT::Test::SaveBuilds($build_base, + $MTT::Globals::Internals->{mpi_get_name} . "." . + $MTT::Globals::Internals->{mpi_install_name} . "." . + $MTT::Globals::Internals->{test_get_name} . "." . + $MTT::Globals::Internals->{test_build_name} + ); # Print if (MTT::Values::PASS == $ret->{test_result}) { diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/Test/Get.pm --- a/lib/MTT/Test/Get.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/Test/Get.pm Wed Nov 18 11:07:37 2009 -0500 @@ -129,7 +129,7 @@ $MTT::Test::sources->{$simple_section} = $ret; # Save the data file recording all the sources - MTT::Test::SaveSources($source_dir); + MTT::Test::SaveSources($source_dir, $MTT::Globals::Internals->{test_get_name}); } else { Verbose(" No new test sources\n"); } diff -r 8760c5d19838 -r 8a8663cb0ac3 lib/MTT/Util.pm --- a/lib/MTT/Util.pm Mon Nov 09 14:38:09 2009 -0500 +++ b/lib/MTT/Util.pm Wed Nov 18 11:07:37 2009 -0500 @@ -27,6 +27,7 @@ delete_matches_from_array parse_time_to_seconds get_array_ref + merge_hashes ); use MTT::Globals; @@ -388,4 +389,20 @@ } } +# Merge two hashes while preserving their key=>value structure +sub merge_hashes { + my ($x, $y) = @_; + + no strict; + foreach my $k (keys %$y) { + if (!defined($x->{$k})) { + $x->{$k} = $y->{$k}; + } else { + $x->{$k} = merge_hashes($x->{$k}, $y->{$k}); + } + } + use strict; + return $x; +} + 1;
_______________________________________________ mtt-users mailing list mtt-us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users