On Tue, Nov 21, 2017 at 11:45:52 -0500, Jean-Louis Martineau wrote:
> There is no patch attached on your email.
Ah, sorry, attached here.
>
> I do not see how Amanda::Util::setup_application call code in holding.c
To be precise what I actually saw was activity involving the "pid" files
in the holding directory showing up in an strace log. I am not sure how
it got called, but assume it was coming from holding.c because it had
the arguments for the kill syscall in the correct order (and thus I
believe the activity couldn't have been from the then-in-place
Holding.pm code). I will investigate further and let you know what I
find.
(In any case, I am sure that the pid read from the holding-directory pid
file during the get_all_datestamps() call was the parent pid of the
current process [and of course the file had not contained that value
before amflush was invoked].)
Nathan
----------------------------------------------------------------------------
Nathan Stratton Treadway - [email protected] - Mid-Atlantic region
Ray Ontko & Co. - Software consulting services - http://www.ontko.com/
GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt ID: 1023D/ECFB6239
Key fingerprint = 6AD8 485E 20B9 5C71 231C 0C32 15F3 ADCD ECFB 6239
--- Holding.pm.orig 2017-11-20 16:51:32.000000000 -0500
+++ Holding.pm 2017-11-21 02:10:47.992749021 -0500
@@ -25,6 +25,7 @@
use File::stat;
use IO::Dir;
use POSIX qw( :fcntl_h );
+use Errno;
use Math::BigInt;
use strict;
use warnings;
@@ -198,6 +199,36 @@
return 1;
}
+# check to see if the given pid identifies a process that could
+# validily be holding the Amanda directory lock (thus preventing
+# us from taking it). (See also holding.c:can_take_holding() for
+# related logic..)
+sub _valid_locking_pid {
+ my $pid_to_check = shift;
+
+ if ( $pid_to_check == $$ || $pid_to_check == getppid() ) {
+ # this process or the parent already hold the lock, so we
+ # can go ahead and use the directory.
+ return 0;
+ }
+
+ if (kill(0, $pid_to_check) == 0 ) {
+ if ($! == Errno::EPERM) {
+ # the process exists (but we don't have permission to kill it)
+ return 1;
+ }
+ else {
+ # (if we reach here, presumably the errno was ESRCH)
+ # the process does not exist; we can use the directory
+ return 0;
+ }
+ }
+ else {
+ # kill succeeded, so process does exist
+ return 1;
+ }
+}
+
sub _walk {
my ($file_fn, $verbose, $take_pid) = @_;
@@ -233,8 +264,10 @@
if (open(my $pidh, $pidfn)) {
my $pid = <$pidh>;
close($pidh);
- if (kill($pid, 0) == 0) {
- # pid is alive, skip this directory
+
+ if ( _valid_locking_pid($pid) ) {
+ # $pid seems valid, so we skip over this directory
+ print $verbose "skipping directory '$dirfn' due to lock by pid $pid. \n" if $verbose;
next;
}
}