If your server or client is using the UFW firewall, verify that a recent package update didn't remove some of your accept rules. I had this happen on one of my Ubuntu 14 clients just recently, but it broke SSH rather than amanda. I'd double-check (lsmod) that the *_conntrack_amanda module is still present on both client and server. You don't say if your configuration is using the SSH transport. If it is, go straight back there and check UFW.
Finally, if your client and server are separated by a firewall of any type, verify with its manager that its NAT/PAT rules aren't setup to discard/timeout connections too quickly. Ours had to have a special rule added so that it wouldn't kill the client connection back to the Amanda server while the client was busy doing its estimates, or compressing the data to send back. (But, the symptom here is that amcheck works, but the actual amdump fails.) I've seen the 'one client works but multiple clients fail' issue happen when the same firewall's NAT/PAT tables were filled; some clients would randomly fail because their traffic was getting dropped. And as I had a 'top' window open on that client, my own connection to it was keeping the firewall's NAT rule from timing out. Damn that Schrodinger's Cat. > -----Original Message----- > From: [email protected] [mailto:[email protected]] > On Behalf Of Bill Carlson > Sent: Wednesday, March 2, 2016 02:15 PM > To: [email protected] > Subject: Single client fail with error sending REQ > > Hello, > > I've run into a situation with my multi-year amanda setup similar to > problem described here: > http://archives.zmanda.com/amanda- > archives/viewtopic.php?t=7223&postdays=0&postorder=asc&highlight=error+sen > ding+req+write+error&start=0 > > A working setup for years, suddenly one specific client starts to fail > amcheck. However, amcheck on single client works fine. Same for amdump; > all clients fails, run against single problem client works fine. > > Later, noted that if one, working host is down, then problem host works > fine, amcheck and amdump. > > My best guess on trigger for this system is updates applied. This is an > Ubuntu 14.04 system, the amanda packages weren't updated but many > libraries were. > > Any pointers on what to chase here? > > Versions: > > Server: 3.3.1-4, Debian 7.9, 32-bit (possible that matters?) > > Problem clients: 3.3.3-2, Ubuntu 14.04 > > > debug from amcheck -c daily on problem host: > > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: pid 21287 ruid 34 euid > 34 version 3.3.3: start at Mon Feb 22 08:45:04 2016 Mon Feb 22 08:45:04 > 2016: thd-0xed6600: amandad: > security_getdriver(name=bsdtcp) returns 0x7f5a725c99c0 Mon Feb 22 08:45:04 > 2016: thd-0xed6600: amandad: version 3.3.3 > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: build: > VERSION="Amanda-3.3.3" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: BUILT_DATE="Tue Jan 7 > 21:16:20 UTC 2014" BUILT_MACH="" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: BUILT_REV="5099" > BUILT_BRANCH="community_3_3_3" CC="gcc" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: paths: > bindir="/usr/sbin" sbindir="/usr/sbin" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > libexecdir="/usr/lib/amanda" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > amlibexecdir="/usr/lib/amanda" mandir="/usr/share/man" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > AMANDA_TMPDIR="/tmp/amanda" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > AMANDA_DBGDIR="/var/log/amanda" CONFIG_DIR="/etc/amanda" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: DEV_PREFIX="/dev/" > RDEV_PREFIX="/dev/r" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: DUMP="/sbin/dump" > RESTORE="/sbin/restore" VDUMP=UNDEF > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: VRESTORE=UNDEF > XFSDUMP="/sbin/xfsdump" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > XFSRESTORE="/sbin/xfsrestore" VXDUMP=UNDEF VXRESTORE=UNDEF Mon Feb 22 > 08:45:04 2016: thd-0xed6600: amandad: > SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/tar" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: LPRCMD=UNDEF > MAILER=UNDEF Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > listed_incr_dir="/var/lib/amanda/gnutar-lists" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: defs: > DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > DEFAULT_TAPE_SERVER="localhost" DEFAULT_TAPE_DEVICE="" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: NEED_STRSTR AMFLOCK_POSIX > AMFLOCK_FLOCK AMFLOCK_LOCKF Mon Feb 22 08:45:04 2016: thd-0xed6600: > amandad: AMFLOCK_LNLOCK SETPGRP_VOID AMANDA_DEBUG_DAYS=4 BSD_SECURITY Mon > Feb 22 08:45:04 2016: thd-0xed6600: amandad: USE_AMANDAHOSTS > CLIENT_LOGIN="backup" CHECK_USERID HAVE_GZIP Mon Feb 22 08:45:04 2016: > thd-0xed6600: amandad: COMPRESS_SUFFIX=".gz" > COMPRESS_FAST_OPT="--fast" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc" > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: > security_handleinit(handle=0xee18f0, driver=0x7f5a725c99c0 (BSDTCP)) Mon > Feb 22 08:45:04 2016: thd-0xed6600: amandad: > security_streaminit(stream=0xee1ad0, driver=0x7f5a725c99c0 (BSDTCP)) Mon > Feb 22 08:45:04 2016: thd-0xed6600: amandad: authenticated peer name is > 'croaker.c.c' > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: accept recv REQ pkt: > <<<<< > SERVICE noop > OPTIONS features=ffffffff9efefbffffffffff1f; > >>>>> > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: creating new service: > noop OPTIONS features=ffffffff9efefbffffffffff1f; > > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: sending ACK pkt: > <<<<< > >>>>> > Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: tcpm_send_token: data is > still flowing Mon Feb 22 08:45:05 2016: thd-0xed6600: amandad: sending REP > pkt: > <<<<< > OPTIONS features=ffffffff9efefbffffffffff1f; > >>>>> > Mon Feb 22 08:45:15 2016: thd-0xed6600: amandad: timeout Mon Feb 22 > 08:45:15 2016: thd-0xed6600: amandad: sending REP pkt: > <<<<< > OPTIONS features=ffffffff9efefbffffffffff1f; > >>>>> > Mon Feb 22 08:45:15 2016: thd-0xed6600: amandad: tcpm_send_token: data is > still flowing Mon Feb 22 08:45:25 2016: thd-0xed6600: amandad: timeout Mon > Feb 22 08:45:25 2016: thd-0xed6600: amandad: sending REP pkt: > <<<<< > OPTIONS features=ffffffff9efefbffffffffff1f; > >>>>> > Mon Feb 22 08:45:35 2016: thd-0xed6600: amandad: timeout Mon Feb 22 > 08:45:35 2016: thd-0xed6600: amandad: sending REP pkt: > <<<<< > OPTIONS features=ffffffff9efefbffffffffff1f; > >>>>> > Mon Feb 22 08:45:35 2016: thd-0xed6600: amandad: tcpm_send_token: data is > still flowing Mon Feb 22 08:45:45 2016: thd-0xed6600: amandad: timeout Mon > Feb 22 08:45:45 2016: thd-0xed6600: amandad: sending REP pkt: > <<<<< > OPTIONS features=ffffffff9efefbffffffffff1f; > >>>>> > Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: timeout Mon Feb 22 > 08:45:55 2016: thd-0xed6600: amandad: timeout waiting for ACK for our REP > Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: > security_close(handle=0xee18f0, driver=0x7f5a725c99c0 (BSDTCP)) Mon Feb 22 > 08:45:55 2016: thd-0xed6600: amandad: > security_stream_close(0xee1ad0) > Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: tcpm_send_token: data is > still flowing Mon Feb 22 08:46:04 2016: thd-0xed6600: amandad: timeout > exit Mon Feb 22 08:46:04 2016: thd-0xed6600: amandad: pid 21287 finish > time Mon Feb 22 08:46:04 2016 > > > Thanks, > > -- > > Bill Carlson > > Anything is possible, given Time and Money.
