On Thu, Sep 12, 2019 at 17:30:44 +0000, Robert Reilly wrote:
> KJ, unfortunately the something amandad on the client gets defunct and
> there procs do not go away until I kill them. The procs hang ( I think
> until timeout ) and get an EOF in the log.
> 
> The fact that amandad is defunct must be what is hanging the backup, I
> have not been able to figure out why amandad is going defunct though.
> rob
> 
> sshd───amandad─┬─2*[amandad]
>                └─sendsize───2*[sendsize───calcsize]
> 
> root     20206  1322  0 17:18 ?        00:00:00 sshd: backup [priv]
> backup   20217     1  0 17:18 ?        00:00:00 /lib/systemd/systemd --user
> backup   20218 20217  0 17:18 ?        00:00:00 (sd-pam)
> backup   20255 20206  0 17:18 ?        00:00:00 sshd: backup@notty
> backup   20256 20255  0 17:18 ?        00:00:00 /usr/lib/amanda/amandad -auth=ssh
> backup   20261 20256  0 17:18 ?        00:00:00 [amandad] <defunct>
> backup   20288 20256  0 17:18 ?        00:00:00 /usr/lib/amanda/sendsize amandad ssh
> backup   20289 20256  0 17:18 ?        00:00:00 [amandad] <defunct>
> backup   20290 20288  0 17:18 ?        00:00:00 /usr/lib/amanda/sendsize amandad ssh
> backup   20291 20288  0 17:18 ?        00:00:00 /usr/lib/amanda/sendsize amandad ssh
> backup   20292 20290  0 17:18 ?        00:00:00 calcsize DailySet1 GNUTAR / / 0 0
> backup   20293 20291  0 17:18 ?        00:00:00 calcsize DailySet1 GNUTAR /opt /opt 0 0

I'm still trying to look through all the emails and linked log files in
this thread, but may not be able to do that today, so I thought I would
send a quick note for now:

I don't use ssh auth myself and it's not currently clear to me why there
would be two defunct amandad processes like this, but amandad definitely
spawns subprocesses for each "service" it's running, and I'm fairly
certain that it doesn't reap child processes until it is shutting
down....
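
To illustrate what "defunct" means here (this is only a sketch of the
general Unix behavior, not Amanda code, and it assumes a Linux /proc
filesystem; the function names are made up for the example): a child
that has exited stays in state "Z" until its parent calls wait() on it,
and disappears from the process table as soon as it is reaped.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except OSError:
        return None
    # The state letter is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_defunct_child():
    """Fork a child that exits at once; report its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running any cleanup

    # Parent: deliberately do NOT wait yet; poll until the child shows up as a zombie.
    deadline = time.time() + 5
    while proc_state(pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)
    before = proc_state(pid)

    os.waitpid(pid, 0)  # reap the child; the kernel drops its process-table entry
    after = proc_state(pid)
    return before, after


if __name__ == "__main__":
    print(demo_defunct_child())
```

A parent that postpones its wait() calls until shutdown would leave
entries like the "[amandad] <defunct>" lines above in the meantime.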

So in this case I think the fact that there are defunct amandad
processes is probably just a harmless side-effect of the true problem,
which seems to be that the calcsize processes aren't finishing.

Are you still getting this same behavior?  Can you lsof/strace the
calcsize process and/or look closely in the
/var/log/amanda/client/DailySet1/sendsize.*.debug log to find out what
is causing it to hang?
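
If lsof or strace isn't installed on the client, some of the same
information can be read straight from /proc. A rough Python sketch,
assuming Linux (describe_process is a hypothetical helper name, not an
Amanda tool, and it needs to run as the process owner or root to see
the file descriptors):

```python
import os


def describe_process(pid):
    """Collect the state, kernel wait channel and open files of a process from /proc."""
    base = f"/proc/{pid}"
    info = {"state": None, "wchan": None, "fds": {}}

    # "State:" in /proc/<pid>/status looks like "S (sleeping)" or "D (disk sleep)".
    try:
        with open(f"{base}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    info["state"] = line.split(":", 1)[1].strip()
                    break
    except OSError:
        return info  # no such process, or /proc is not available

    # wchan names the kernel function the process is sleeping in, if any.
    try:
        with open(f"{base}/wchan") as f:
            info["wchan"] = f.read().strip() or None
    except OSError:
        pass

    # Each entry in /proc/<pid>/fd is a symlink to the open file, pipe or socket.
    try:
        for name in os.listdir(f"{base}/fd"):
            try:
                info["fds"][int(name)] = os.readlink(f"{base}/fd/{name}")
            except OSError:
                pass  # descriptor was closed while we were looking
    except OSError:
        pass

    return info


if __name__ == "__main__":
    import sys
    print(describe_process(int(sys.argv[1])))
```

Running it against one of the calcsize PIDs (20292 in the listing above)
would show whether the process is sleeping and which pipes or files it
has open; it is not a replacement for an actual strace.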

(Once you've finished investigating the calcsize processes, you could
test my first theory by killing those processes and seeing if amandad at
that point cleans up the defunct processes and exits cleanly...)

                                                Nathan



----------------------------------------------------------------------------
Nathan Stratton Treadway  -  [email protected]  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
