On Wed, Oct 07, 2020 at 14:54:30 -0400, Steve Ryan wrote:
>     I'm trying to debug an issue we've been having in our amanda
> 3.5.1 setup. Currently backups are failing every night due to (I
> believe) the driver faulting. Relevant logs:
> 
> amdump mail report:
> FAILURE DUMP SUMMARY:
>   chunker: FATAL Broken pipe at
> /usr/lib64/perl5/vendor_perl/Amanda/IPC/LineProtocol.pm line 429.
>   chunker: FATAL Connection reset by peer at
> /usr/lib64/perl5/vendor_perl/Amanda/IPC/LineProtocol.pm line 579.
> 
> dmesg:
> 2020-10-07T01:06:08.770127-04:00 vacuum.cs.umd.edu kernel: traps:
> driver[25995] general protection ip:7f2a9ffe50ec sp:7ffc61f8b040
> error:0 in libamanda-3.5.1.so[7f2a9ffaa000+81000]
> 
> 
> The environment is about ~80ish nodes total, running mostly RHEL7
> with some RHEL8 and ~3-5 Ubuntu/Debian machines. Everything is
> running 3.5.1. straight from the official sources. I don't think
> it's being caused by a client machine anyway, and some machines get
> backed up each night.

I don't remember seeing this particular problem reported here before and
don't have any silver bullet...

Which distribution is the Amanda server running on?

Did this setup (Amanda server plus ~80ish clients) ever work properly
at some point before the crashes started?


> Has anyone seen this issue before/know what debug info I should be
> looking for in the logs?

If the driver process is indeed core dumping, you should see evidence
of that in /var/log/amanda/server/<CONFIG>/driver.<DATESTAMP>.debug for
that run.  At the very least the log should end abruptly; if you are
lucky, you might find a stack trace or something else there giving a
clue as to what was happening just before the crash.
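
As a rough way to eyeball how each night's driver log ends, something
along these lines might help.  It is only a sketch: the function name and
the default of 15 lines are made up for illustration, and it assumes the
debug files match "driver.*.debug" in the directory you point it at.

```python
from collections import deque
from pathlib import Path


def tail_debug_logs(log_dir, pattern="driver.*.debug", lines=15):
    """Return {filename: last `lines` lines} for each matching debug file.

    Hypothetical helper; `pattern` assumes the driver.<DATESTAMP>.debug
    naming mentioned above.
    """
    tails = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        with path.open(errors="replace") as fh:
            # A deque with maxlen keeps only the final lines of the file.
            last = deque(fh, maxlen=lines)
        tails[path.name] = [ln.rstrip("\n") for ln in last]
    return tails
```

Calling tail_debug_logs() on the server's debug directory for your config
and printing each entry shows the final lines of every run side by side,
which makes an abruptly truncated log fairly easy to spot.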

If you can go back through the runs from various nights and correlate
the crashes to e.g. a particular client kicking off just beforehand, or
something, that might be a useful clue.
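
To do that correlation across many nights without reading every file by
hand, a sketch like the following could be a starting point.  It assumes
(and this is only an assumption) that client hostnames show up verbatim
in the driver debug log; the function name, the list of hostnames you
pass in, and the 50-line window are all hypothetical.

```python
from collections import Counter, deque
from pathlib import Path


def clients_seen_before_end(log_dir, hostnames,
                            pattern="driver.*.debug", lines=50):
    """Count, per hostname, how many logs mention it in their last lines.

    Hypothetical helper: it does a plain substring match, so "host1"
    would also match "host10"; pass fully qualified names to avoid that.
    """
    counts = Counter()
    for path in sorted(Path(log_dir).glob(pattern)):
        with path.open(errors="replace") as fh:
            # Keep only the tail of the file, i.e. what happened last.
            tail = "".join(deque(fh, maxlen=lines))
        for host in hostnames:
            if host in tail:
                counts[host] += 1  # counted at most once per log file
    return counts
```

A client that turns up near the end of most of the crashed runs, but not
near the end of the runs that completed, would be worth a closer look.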


You can also look at the chunker.<DATESTAMP>.debug files in that same
directory to see if they give any additional hints, but offhand I'd
guess that they are just going to report that the chunker processes
aborted because the far side of the socket/pipe disappeared, which
presumably is caused by the driver process crashing.

                                                Nathan

----------------------------------------------------------------------------
Nathan Stratton Treadway  -  [email protected]  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239
