On Tue, Nov 21, 2017 at 11:45:52 -0500, Jean-Louis Martineau wrote: > There is no patch attached on your email.
Ah, sorry, attached here. > > I do not see how Amanda::Util::setup_application call code in holding.c To be precise what I actually saw was activity involving the "pid" files in the holding directory showing up in an strace log. I am not sure how it got called, but assume it was coming from holding.c because it had the arguments for the kill syscall in the correct order (and thus I believe the activity couldn't have been from the then-in-place Holding.pm code). I will investigate further and let you know what I find. (In any case, I am sure that the pid read from the holding-directory pid file during the get_all_datestamps() call was the parent pid of the current process [and of course the file had not contained that value before amflush was invoked].) Nathan ---------------------------------------------------------------------------- Nathan Stratton Treadway - natha...@ontko.com - Mid-Atlantic region Ray Ontko & Co. - Software consulting services - http://www.ontko.com/ GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt ID: 1023D/ECFB6239 Key fingerprint = 6AD8 485E 20B9 5C71 231C 0C32 15F3 ADCD ECFB 6239
--- Holding.pm.orig 2017-11-20 16:51:32.000000000 -0500 +++ Holding.pm 2017-11-21 02:10:47.992749021 -0500 @@ -25,6 +25,7 @@ use File::stat; use IO::Dir; use POSIX qw( :fcntl_h ); +use Errno; use Math::BigInt; use strict; use warnings; @@ -198,6 +199,36 @@ return 1; } +# check to see if the given pid identifies a process that could +# validily be holding the Amanda directory lock (thus preventing +# us from taking it). (See also holding.c:can_take_holding() for +# related logic..) +sub _valid_locking_pid { + my $pid_to_check = shift; + + if ( $pid_to_check == $$ || $pid_to_check == getppid() ) { + # this process or the parent already hold the lock, so we + # can go ahead and use the directory. + return 0; + } + + if (kill(0, $pid_to_check) == 0 ) { + if ($! == Errno::EPERM) { + # the process exists (but we don't have permission to kill it) + return 1; + } + else { + # (if we reach here, presumably the errno was ESRCH) + # the process does not exist; we can use the directory + return 0; + } + } + else { + # kill succeeded, so process does exist + return 1; + } +} + sub _walk { my ($file_fn, $verbose, $take_pid) = @_; @@ -233,8 +264,10 @@ if (open(my $pidh, $pidfn)) { my $pid = <$pidh>; close($pidh); - if (kill($pid, 0) == 0) { - # pid is alive, skip this directory + + if ( _valid_locking_pid($pid) ) { + # $pid seems valid, so we skip over this directory + print $verbose "skipping directory '$dirfn' due to lock by pid $pid. \n" if $verbose; next; } }