On Thu, Sep 12, 2019 at 17:30:44 +0000, Robert Reilly wrote:
> KJ, unfortunately amandad on the client goes defunct and
> the procs do not go away until I kill them. The procs hang (I think
> until timeout) and get an EOF in the log.
>
> The fact that amandad is defunct must be what is hanging the backup; I
> have not been able to figure out why amandad is going defunct, though.
> rob
>
> sshd---amandad-+-2*[amandad]
>                `-sendsize---2*[sendsize---calcsize]
>
> root 20206 1322 0 17:18 ? 00:00:00 sshd: backup [priv]
> backup 20217 1 0 17:18 ? 00:00:00 /lib/systemd/systemd --user
> backup 20218 20217 0 17:18 ? 00:00:00 (sd-pam)
> backup 20255 20206 0 17:18 ? 00:00:00 sshd: backup@notty
> backup 20256 20255 0 17:18 ? 00:00:00 /usr/lib/amanda/amandad -auth=ssh
> backup 20261 20256 0 17:18 ? 00:00:00 [amandad] <defunct>
> backup 20288 20256 0 17:18 ? 00:00:00 /usr/lib/amanda/sendsize amandad ssh
> backup 20289 20256 0 17:18 ? 00:00:00 [amandad] <defunct>
> backup 20290 20288 0 17:18 ? 00:00:00 /usr/lib/amanda/sendsize amandad ssh
> backup 20291 20288 0 17:18 ? 00:00:00 /usr/lib/amanda/sendsize amandad ssh
> backup 20292 20290 0 17:18 ? 00:00:00 calcsize DailySet1 GNUTAR / / 0 0
> backup 20293 20291 0 17:18 ? 00:00:00 calcsize DailySet1 GNUTAR /opt /opt 0 0

I'm still trying to look through all the emails and linked log files in
this thread, but may not be able to do that today, so I thought I would
send a quick note for now:

I don't use ssh auth myself and it's not currently clear to me why there
would be two defunct amandad processes like this, but amandad definitely
spawns subprocesses for each "service" it's running, and I'm fairly
certain that it doesn't reap child processes until it is shutting
down....
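
As a rough illustration of where "<defunct>" entries come from (this is
a generic Python sketch of zombie behaviour on Linux, not Amanda's
actual code; the function names are made up for the example): a child
that has exited stays in the process table in state "Z" until its
parent calls wait() on it.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child that exits at once; report its state before reaping
    and whether its /proc entry still exists after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without cleanup handlers

    # Parent: the child has not been waited for yet, so once it exits
    # the kernel keeps it around as a zombie ("Z").
    deadline = time.time() + 5
    state = proc_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    os.waitpid(pid, 0)  # reap the child; this removes the zombie
    still_there = os.path.exists(f"/proc/{pid}")
    return state, still_there


if __name__ == "__main__":
    print(demo_zombie())
```

While the child is in that state, ps lists it as "[name] <defunct>",
which is what the two entries under PID 20256 in the listing above look
like. The zombie itself holds no resources beyond its process-table
slot.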

So in this case I think the fact that there are defunct amandad
processes is probably just a harmless side effect of the true problem,
which seems to be that the calcsize processes aren't finishing.

Are you still getting this same behavior? Can you lsof/strace the
calcsize process and/or look closely in the
/var/log/amanda/client/DailySet1/sendsize.*.debug log to find out what
is causing it to hang?
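
If lsof or strace aren't handy on the client, a small Python sketch
along these lines can pull similar basics straight out of /proc. This
assumes Linux and sufficient permission to read the target process's
/proc entries; describe_process and find_pids are hypothetical helper
names, and this is a stand-in for a first look, not a replacement for
strace (it shows no system calls).

```python
import os


def describe_process(pid):
    """Collect state, kernel wait channel, and open file descriptors
    for a PID by reading /proc (Linux only)."""
    info = {"pid": pid}
    with open(f"/proc/{pid}/stat") as f:
        # State is the first field after the parenthesised command name.
        info["state"] = f.read().rsplit(")", 1)[1].split()[0]
    try:
        with open(f"/proc/{pid}/wchan") as f:
            info["wchan"] = f.read().strip() or None
    except OSError:
        info["wchan"] = None  # not available or not permitted
    fds = {}
    fd_dir = f"/proc/{pid}/fd"
    try:
        for name in os.listdir(fd_dir):
            try:
                fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
            except OSError:
                pass  # descriptor went away between listdir and readlink
    except OSError:
        pass  # not permitted to list this process's descriptors
    info["fds"] = fds
    return info


def find_pids(name):
    """Return the PIDs whose /proc/<pid>/comm equals name."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)


if __name__ == "__main__":
    for pid in find_pids("calcsize"):
        print(describe_process(pid))
```

Run as root or as the backup user, this prints one dictionary per
calcsize process. The "fds" entry shows which files, pipes, or sockets
the process has open, and "state" shows whether it is sleeping ("S"),
in uninterruptible I/O wait ("D"), or running ("R").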

(Once you've finished investigating the calcsize processes, you could
test my first theory by killing those processes and seeing if amandad at
that point cleans up the defunct processes and exits cleanly...)

Nathan
----------------------------------------------------------------------------
Nathan Stratton Treadway - [email protected] - Mid-Atlantic region
Ray Ontko & Co. - Software consulting services - http://www.ontko.com/
GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt ID: 1023D/ECFB6239
Key fingerprint = 6AD8 485E 20B9 5C71 231C 0C32 15F3 ADCD ECFB 6239