this commit does more harm then good. we experience following: - some child processes still running after timeout and mtt killed the job.
before this commit - it worked fine. please revert and test more. On Sat, Jun 21, 2014 at 3:30 PM, MPI Team <mpit...@crest.iu.edu> wrote: > The branch, master has been updated > via 016088f2a0831b32ab5fd6f60f4cabe67e92e594 (commit) > via 7fb4c6a4c9d71be127ea53bd463178510577f71f (commit) > via 381ba177d835a54c3197d846f5a4edfc314efe27 (commit) > via cfdd29de2012eeb7592706f00dd07a52dd48cf6b (commit) > via 940030ca20eb1eaf256e898b83866c1cb83aca5c (commit) > from c99ed7c7b159a2cab58a251bd7c0dad8972ff901 (commit) > > Those revisions listed above that are new to this repository have > not appeared on any other notification email; so we list those > revisions in full, below. > > - Log ----------------------------------------------------------------- > > https://github.com/open-mpi/mtt/commit/016088f2a0831b32ab5fd6f60f4cabe67e92e594 > > commit 016088f2a0831b32ab5fd6f60f4cabe67e92e594 > Author: Jeff Squyres <jsquy...@cisco.com> > Date: Sat Jun 21 04:58:45 2014 -0700 > > DoCommand: several fixes to kill_proc logic > > 1. Fix the kill(0, $pid) test to see if the process was still alive. > > 2. Rename _kill_proc() to _kill_proc_tree() to indicate that it's > really killing not only the PID in question, but also all of its > descendants. > > 3. In _kill_proc_tree(), change the order to kill the main PID first, > and ''then'' kill all the descendants. > > The main use case is when killing mpirun: if we kill mpirun's > descendants ''first'', mpirun will detect its childrens' deaths and > then cleanup and exit. Later, when MTT finally gets around to killing > mpirun, MTT will detect that mpirun is already dead and therefore emit > a confusing "mpirun died right at end of timeout" message. This is > misleading at best; it doesn't indicate what actually happened. > > However, if we kill mpirun first, it will take care of killing all of > its descendants. MTT will therefore emit the right messages about > killing mpirun. MTT will then redundantly try to kill a bunch of > now-nonexistent descendant processes of mpirun, but that's ok/safe. > We actually ''want'' this try-to-kill-mpirun's-descendants behavior to > handle the case when mpirun is misbehaving / not cleaning up its > descendants. > > 4. DoCommand() is used for more than launching mpirun, so pass down > $argv0 so that we can print the actual command name that is being > killed in various Verbose/Debug messages, not the hard-coded "mpirun" > string (which, in practice, was probably almost always correct, but > still...). > --- > lib/MTT/DoCommand.pm | 78 > ++++++++++++++++++++++++++++++++++++---------------- > 1 file changed, 55 insertions(+), 23 deletions(-) > > diff --git a/lib/MTT/DoCommand.pm b/lib/MTT/DoCommand.pm > index 02cdb94..646ca31 100644 > --- a/lib/MTT/DoCommand.pm > +++ b/lib/MTT/DoCommand.pm > @@ -2,7 +2,7 @@ > # > # Copyright (c) 2005-2006 The Trustees of Indiana University. > # All rights reserved. > -# Copyright (c) 2006-2013 Cisco Systems, Inc. All rights reserved. > +# Copyright (c) 2006-2014 Cisco Systems, Inc. All rights reserved. > # Copyright (c) 2007-2008 Sun Microsystems, Inc. All rights reserved. > # Copyright (c) 2007-2012 High Performance Computing Center Stuttgart, > # University of Stuttgart. All rights reserved. > @@ -57,23 +57,27 @@ sub DoCommand { > ($time_arg, $no_execute) = @_; > } > > +# This function is called for killing both mpirun and each of its > +# descendants. We really only want to see verbose output for when we > +# kill mpirun itself, so only show output when the caller provides us > +# with a $argv0 value. > sub _kill_proc_one { > - my ($pid) = @_; > + my ($pid, $argv0) = @_; > > # How long to wait after each kill() > my $wait_time = 5; > > # See if the proc is alive first > - my $kid; > - kill(0, $pid); > - $kid = waitpid($pid, WNOHANG); > - return "mpirun died right at end of timeout (MTT did not have to kill > it)" > - if (-1 == $kid); > + my $num_alive = kill(0, $pid); > + return "$argv0 died right at end of timeout (MTT did not have to kill > it)" > + if (0 == $num_alive); > > # Try an easy kill > kill("TERM", $pid); > - Verbose("*** Killing mpirun with SIGTERM\n"); > + Verbose("*** Killing $argv0 with SIGTERM\n") > + if (defined($argv0)); > # Give mpirun some time to cleanup before we try to reap it. > + my $kid; > my $i = $wait_time; > while ($i > 0) { > sleep(1); > @@ -85,45 +89,74 @@ sub _kill_proc_one { > # process no longer exists (i.e., we get -1 back from > # waitpid), or we successfully killed it (i.e., we got the PID > # back from waitpid). > - return "MTT killed mpirun via SIGTERM" if (0 != $kid); > + return "MTT killed $argv0 via SIGTERM" if (0 != $kid); > > --$i; > } > - Verbose("** Kill TERM (after $wait_time seconds) didn't work!\n"); > + Verbose("** Kill TERM (after $wait_time seconds) didn't work!\n") > + if (defined($argv0)); > > # That didn't work either. Try SIGINT; > - Verbose("*** Killing mpirun with SIGINT\n"); > + Verbose("*** Killing $argv0 with SIGINT\n") > + if (defined($argv0)); > kill("INT", $pid); > my $i = $wait_time; > while ($i > 0) { > sleep(1); > $kid = waitpid($pid, WNOHANG); > - return "MTT killed mpirun via SIGINT" if (0 != $kid); > + return "MTT killed $argv0 via SIGINT" if (0 != $kid); > --$i; > } > - Verbose("** Kill INT (after $wait_time seconds) didn't work!\n"); > + Verbose("** Kill INT (after $wait_time seconds) didn't work!\n") > + if (defined($argv0)); > > # Ok, now we're mad. Be violent. > - Verbose("*** Now I'm mad. Killing mpirun with SIGKILL\n"); > + Verbose("*** Now I'm mad. Killing $argv0 with SIGKILL\n") > + if (defined($argv0)); > my $count = 0; > while (1) { > kill("KILL", $pid); > ++$count; > $kid = waitpid($pid, WNOHANG); > - return "MTT killed mpirun via $count SIGKILLs" if (0 != $kid); > - Verbose("** Kill KILL didn't work! Sleeping and trying > again...\n"); > + return "MTT killed $argv0 via $count SIGKILLs" if (0 != $kid); > + Verbose("** Kill KILL didn't work! Sleeping and trying > again...\n") > + if (defined($argv0)); > sleep(1); > } > } > > > -sub _kill_proc { > - my ($pid) = @_; > - # kill the group, take the names later > - foreach my $process (descendant_processes($pid)) { > +sub _kill_proc_tree { > + my ($pid, $argv0) = @_; > + > + # Find all descendent processes of the main PID > + my @children = descendant_processes($pid); > + > + # Kill the main PID. Note that _kill_proc_one() will give the > + # process time to react/cleanup, so there's no need for an > + # additional delay after it returns. > + my $ret = _kill_proc_one($pid, $argv0); > + > + # Now kill all the processes that descended from the base PID. > + # > + # Note: when mpirun is working properly (which is one of the > + # biggest use cases for DoCommand), this is redundant -- all the > + # children will already be dead (because killing mpirun will > + # ensure that all descendant processes are also killed). > + # > + # That being said, a) DoCommand() is used to launch more than just > + # mpirun, b) mpirun breaks sometimes and doesn't clean up its > + # descendants, and c) it's safe to call _kill_proc_one() on a PID > + # that is already dead. > + foreach my $process (@children) { > + Debug("=== killing child proc: $process->{pid}, > $process->{argv0}\n"); > + > + # Ignore the return. For example, we don't care if the child > + # is already dead > _kill_proc_one($process->{pid}); > } > - return _kill_proc_one($pid); > + > + return $ret; > } > > > #-------------------------------------------------------------------------- > @@ -338,7 +371,6 @@ sub Cmd { > > my $tokens = _quote_escape($cmd); > > - > my $pid; > if (! $no_execute) { > > @@ -568,7 +600,7 @@ sub Cmd { > > $done = 0; > } > - $timeout_message = _kill_proc($pid); > + $timeout_message = _kill_proc_tree($pid, ${$tokens}[0]); > > # We don't care about the exit status if we timed out > # -- fill it with a bogus value. > > > https://github.com/open-mpi/mtt/commit/7fb4c6a4c9d71be127ea53bd463178510577f71f > > commit 7fb4c6a4c9d71be127ea53bd463178510577f71f > Author: Jeff Squyres <jsquy...@cisco.com> > Date: Sat Jun 21 04:42:19 2014 -0700 > > DoCommand: replace obscure perl code with easier-to-grok version > > The descendant_processes() and flatten() codes were written in a > nicely efficient yet very difficult to understand manner. This commit > replaces these two subs with admittedly less efficient but much easier > to understand code. > > The new code also saves the first token of the command for each PID so > that it can be used in Debug/Verbose statements (e.g., identify the > command that is being killed, etc.). > --- > lib/MTT/DoCommand.pm | 63 > ++++++++++++++++++++++++++++++++++++++++------------ > 1 file changed, 49 insertions(+), 14 deletions(-) > > diff --git a/lib/MTT/DoCommand.pm b/lib/MTT/DoCommand.pm > index c3be8a8..02cdb94 100644 > --- a/lib/MTT/DoCommand.pm > +++ b/lib/MTT/DoCommand.pm > @@ -120,8 +120,8 @@ sub _kill_proc_one { > sub _kill_proc { > my ($pid) = @_; > # kill the group, take the names later > - foreach my $p (descendant_processes($pid)) { > - _kill_proc_one($p); > + foreach my $process (descendant_processes($pid)) { > + _kill_proc_one($process->{pid}); > } > return _kill_proc_one($pid); > } > @@ -527,7 +527,8 @@ sub Cmd { > if (defined($timeout_backtrace_program) and > !$got_backtrace) { > $backtrace = ""; > if ( $timeout_before_backtrace_program ) { > - foreach my $p (descendant_processes($pid)) { > + foreach my $process (descendant_processes($pid)) { > + my $p = $process->{pid}; > my $c = $timeout_before_backtrace_program; > $c =~ s/%PID%/$p/g; > Debug("_pre_backtrace cmd: $c\n"); > @@ -538,7 +539,8 @@ sub Cmd { > $backtrace .= > _get_backtrace($timeout_backtrace_program, $pid, $pre_pernode); > > if ( $timeout_after_backtrace_program ) { > - foreach my $p (descendant_processes($pid)) { > + foreach my $process (descendant_processes($pid)) { > + my $p = $process->{pid}; > my $c = $timeout_after_backtrace_program; > $c =~ s/%PID%/$p/g; > Debug("_post_backtrace cmd: $c\n"); > @@ -738,19 +740,50 @@ sub _do_email_timeout_notification { > close(TIMEOUT_SENTINEL_FILE); > } > > -sub flatten { > - map{ (ref($_) eq "ARRAY") ? map{flatten($_)}@$_ : $_ } @_; > +sub find_children { > + my ($pids, $start) = @_; > + > + my @ret; > + # Scan the PID list and see if any other PIDs list $start as their > + # PPID. If so, save that PID and then recursively look for *that* > + # PID's children. > + foreach my $pid (sort(keys(%{$pids}))) { > + if ($pids->{$pid}->{ppid} == $start) { > + push(@ret, { > + pid => $pid, > + argv0 => $pids->{$pid}->{argv0}, > + }); > + > + # Add this PID's descendants to the return array > + push(@ret, find_children($pids, $pid)); > + } > + } > + > + return @ret; > } > > sub descendant_processes { > my ($base) = (@_, $$); > - my %parentage = (`ps -eo pid,ppid` =~ /\d+/g); > - my %reverse = map { ($_, [$_]) } %parentage; > - while (my ($pid,$ppid) = each %parentage){ > - push @{$reverse{$ppid}}, $reverse{$pid}; > + open(PS, "ps -eo pid,ppid,cmd|") || die "Can't open ps"; > + > + # Skip the title line > + <PS>; > + > + # Read all the data lines > + my $pids; > + while (<PS>) { > + # Grab the PID, PPID, and first token of the command > + $_ =~ m/(\d+)\s+(\d+)\s+(\S+)/; > + $pids->{$1} = { > + pid => $1, > + ppid => $2, > + argv0 => $3, > + }; > } > - shift @{$reverse{$base}}; > - flatten($reverse{$base}); > + close(PS); > + > + # Find all the descendants of the $base PID > + return find_children($pids, $base); > } > > sub _get_backtrace { > @@ -774,7 +807,8 @@ sub _get_backtrace { > > # Use ps -Af output to fetch the child pids, > # and grab a stack trace from each one > - foreach my $p (descendant_processes($pid)) { > + foreach my $process (descendant_processes($pid)) { > + my $p = $process->{pid}; > $gdb_cmd = "gdb - $p --command=$gdb_command_filename > --batch"; > $ret .= "\n $gdb_cmd"; > $ret .= "\n" . `$gdb_cmd`; > @@ -814,8 +848,9 @@ sub _get_backtrace { > } > } > Debug("Stacktrace: base name $return_basename\n"); > - #foreach my $p (descendant_processes($pid)) > + #foreach my $process (descendant_processes($pid)) > #{ > + # my $p = $process->{pid}; > # my $gstack_cmd = "gstack $p"; > # $ret .= "\n $gstack_cmd"; > # $ret .= "\n" . `$gstack_cmd`; > > > https://github.com/open-mpi/mtt/commit/381ba177d835a54c3197d846f5a4edfc314efe27 > > commit 381ba177d835a54c3197d846f5a4edfc314efe27 > Author: Jeff Squyres <jsquy...@cisco.com> > Date: Sat Jun 21 03:48:55 2014 -0700 > > DoCommand: ensure to only kill the process once > > When the DoCommand timeout fires, MTT will kill the entire process > tree of the launched command. However, the stdout/stderr sockets may > still be draining, and therefore the main DoCommand loop may still > execute more iterations. In cases like these, ensure that we don't > try to backtrace/kill the process tree again. > > Also add a few more Debug statements for detailed output. > --- > lib/MTT/DoCommand.pm | 23 ++++++++++++++++------- > 1 file changed, 16 insertions(+), 7 deletions(-) > > diff --git a/lib/MTT/DoCommand.pm b/lib/MTT/DoCommand.pm > index 0e253f0..c3be8a8 100644 > --- a/lib/MTT/DoCommand.pm > +++ b/lib/MTT/DoCommand.pm > @@ -471,6 +471,7 @@ sub Cmd { > my $len = sysread(OUTread, $data, 99999); > if (0 == $len) { > vec($rin, fileno(OUTread), 1) = 0; > + Debug("*** Child process stdout closed\n"); > --$done; > } else { > _append($data, $print_timestamp ? localtime() : "", > @@ -485,6 +486,7 @@ sub Cmd { > my $len = sysread(ERRread, $data, 99999); > if (0 == $len) { > vec($rin, fileno(ERRread), 1) = 0; > + Debug("*** Child process stderr closed\n"); > --$done; > } else { > _append($data, $print_timestamp ? localtime() : "", > @@ -493,11 +495,19 @@ sub Cmd { > } > } > > - # If we're running with a timeout, bail if we're past the end > - # time > - if (defined($end_time) && time() > $end_time) { > + # If we're running with a timeout, check to see if a) the > + # process is still running (i.e., stdout/stderr is still > + # open), and b) we're past the end time. > + if ($done > 0 && defined($end_time) && time() > $end_time) { > my $over = time() - $end_time; > - if ($over > $last_over) { > + > + # Note that we only want to backtrace/kill the process > + # *once*. Consider: it is possible that we kill the > + # process tree, but then stdout/stderr are still draining, > + # and therefore we loop around here again *even though the > + # process tree is already dead*. So put a little > + # do-this-only-once protection in here. > + if (!defined($timeout_message) && $over > $last_over) { > Verbose("*** Past timeout of $timeout seconds by $over > seconds\n"); > > # Handle timeout file > @@ -555,10 +565,8 @@ sub Cmd { > ); > > $done = 0; > - $timeout_message = _kill_proc($pid); > - } else { > - $timeout_message = _kill_proc($pid); > } > + $timeout_message = _kill_proc($pid); > > # We don't care about the exit status if we timed out > # -- fill it with a bogus value. > @@ -576,6 +584,7 @@ sub Cmd { > } > } > } > + Debug("*** Child process now dead\n"); > close OUTerr; > close OUTread > if (!$merge_output); > > > https://github.com/open-mpi/mtt/commit/cfdd29de2012eeb7592706f00dd07a52dd48cf6b > > commit cfdd29de2012eeb7592706f00dd07a52dd48cf6b > Author: Jeff Squyres <jsquy...@cisco.com> > Date: Sat Jun 21 03:36:08 2014 -0700 > > DoCommand: remove unused variables > --- > lib/MTT/DoCommand.pm | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/lib/MTT/DoCommand.pm b/lib/MTT/DoCommand.pm > index 0740b28..0e253f0 100644 > --- a/lib/MTT/DoCommand.pm > +++ b/lib/MTT/DoCommand.pm > @@ -445,7 +445,6 @@ sub Cmd { > $end_time = time() + $timeout; > Debug("Timeout: $timeout - $end_time (vs. now: " . time() . > ")\n"); > } > - my $killed_status = undef; > my $last_over = 0; > my $partial_out; > my $partial_err; > @@ -900,7 +899,6 @@ sub Win_Cmd { > $end_time = time() + $timeout; > Debug("Timeout: $timeout - $end_time (vs. now: " . time() . > ")\n"); > } > - my $killed_status = undef; > my $last_over = 0; > > while(<OUTread>) { > > > https://github.com/open-mpi/mtt/commit/940030ca20eb1eaf256e898b83866c1cb83aca5c > > commit 940030ca20eb1eaf256e898b83866c1cb83aca5c > Author: Jeff Squyres <jsquy...@cisco.com> > Date: Fri Jun 20 17:13:08 2014 -0700 > > Diskfree.pm: fix df handle leak > > This left a zombied "df" process in the process table. The Perl > runtime eventually cleaned it up, but it was weird to see a zombied df > in ps listings. > > There's no point in reporting this bug upstream; according to the > Changes file, v0.06 was published in 1998. There's been no activity > on http://search.cpan.org/dist/Filesys-DiskFree/DiskFree.pm since > v0.06. > --- > lib/Filesys/DiskFree.pm | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/lib/Filesys/DiskFree.pm b/lib/Filesys/DiskFree.pm > index 1fa655d..41a27d7 100644 > --- a/lib/Filesys/DiskFree.pm > +++ b/lib/Filesys/DiskFree.pm > @@ -124,8 +124,9 @@ sub df(){ > $cmd=$self->command() or > croak "No df command known for format ".$self->{'FORMAT'}; > open(HANDLE,"$cmd|") or croak("Cannot fork \"$cmd\": $!"); > - return $self->load(\*HANDLE); > + my $ret = $self->load(\*HANDLE); > close(HANDLE) or croak("Cannot df $!"); > + return $ret; > } > > sub load() { > > ----------------------------------------------------------------------- > > Summary of changes: > lib/Filesys/DiskFree.pm | 3 +- > lib/MTT/DoCommand.pm | 162 > +++++++++++++++++++++++++++++++++++------------- > 2 files changed, 120 insertions(+), 45 deletions(-) > > > hooks/post-receive > -- > MTT > _______________________________________________ > mtt-svn mailing list > mtt-...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/mtt-svn >