I haven't found a solution yet.

For anyone with a similar problem, here is the workaround I'm using.

Run amcheck/amdump twice against the same config: once for every host except the problem host, and once for the problem host alone:

amcheck -m daily host1 host2 host4
amdump daily host1 host2 host4
...later...
amcheck -m daily host3
amdump daily host3


It might be easier to back up the single host first, which would mean less delay before the second run.
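
The two runs above could be wrapped in one script. This is only a sketch under assumptions: the config name and host names are the ones from this post, while run_pass and the DRY_RUN switch are hypothetical helpers added for illustration. It defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Two-pass workaround sketch. CONFIG and the host names come from the post;
# run_pass and DRY_RUN are hypothetical, not part of Amanda.
CONFIG="daily"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default), 0 = run them

run_pass() {
    # Run amcheck and then amdump for the hosts given as arguments.
    if [ "$DRY_RUN" = "1" ]; then
        echo "amcheck -m $CONFIG $*"
        echo "amdump $CONFIG $*"
        return 0
    fi
    amcheck -m "$CONFIG" "$@"
    amdump "$CONFIG" "$@"
}

# Problem host alone first, then all the remaining hosts.
run_pass host3
run_pass host1 host2 host4
```

Setting DRY_RUN=0 in the environment makes the script run the real commands one after the other, so the second pass starts as soon as the first one finishes.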

--

Bill Carlson

Anything is possible, given Time and Money.

On 03/04/2016 02:19 PM, Bill Carlson wrote:
Understood that this can be a problem, but no firewall or iptables rules are involved. All clients and the server are on the same subnet/VLAN.

--

Bill Carlson

Anything is possible, given Time and Money.

On 03/04/2016 11:20 AM, Joi L. Ellis wrote:
If your server or client is using the UFW firewall, verify that a recent package update didn't remove some of your accept rules. I had this happen on one of my Ubuntu 14 clients just recently, but it broke SSH rather than amanda. I'd double-check (lsmod) that the *_conntrack_amanda module is still present on both client and server. You don't say if your configuration is using the SSH transport. If it is, go straight back there and check UFW.
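
The lsmod check mentioned above could be scripted roughly as follows. This is a sketch, not a definitive test: has_conntrack_amanda is a hypothetical helper name, and it only looks for a loaded module whose name ends in _conntrack_amanda, which is the pattern given above.

```shell
#!/bin/sh
# has_conntrack_amanda is a hypothetical helper: it reads lsmod-style output
# on stdin and succeeds if a module name (first column) ends in
# _conntrack_amanda.
has_conntrack_amanda() {
    awk '$1 ~ /_conntrack_amanda$/ { found = 1 } END { exit !found }'
}

# Only attempt the check where lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | has_conntrack_amanda; then
        echo "conntrack amanda helper: loaded"
    else
        echo "conntrack amanda helper: NOT loaded"
    fi
fi
```

Run it on both the client and the server, since the module needs to be present on each.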

Finally, if your client and server are separated by a firewall of any type, verify with its manager that its NAT/PAT rules aren't set up to discard or time out connections too quickly. Ours had to have a special rule added so that it wouldn't kill the client connection back to the Amanda server while the client was busy doing its estimates, or compressing the data to send back. (But the symptom there is that amcheck works while the actual amdump fails.)

I've seen the 'one client works but multiple clients fail' issue happen when the same firewall's NAT/PAT tables were filled; some clients would randomly fail because their traffic was getting dropped. And as I had a 'top' window open on that client, my own connection to it was keeping the firewall's NAT rule from timing out.

Damn that Schrodinger's Cat.



-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of Bill Carlson
Sent: Wednesday, March 2, 2016 02:15 PM
To: [email protected]
Subject: Single client fail with error sending REQ

Hello,

I've run into a situation with my multi-year amanda setup similar to the
problem described here:
http://archives.zmanda.com/amanda-archives/viewtopic.php?t=7223&postdays=0&postorder=asc&highlight=error+sending+req+write+error&start=0

After working for years, the setup suddenly has one specific client
failing amcheck. However, amcheck run against that client alone works
fine. The same goes for amdump: a run against all clients fails, but a
run against only the problem client works fine.

Later, I noticed that if one of the working hosts is down, the problem
host works fine for both amcheck and amdump.

My best guess at the trigger is the updates applied to this system. This
is an Ubuntu 14.04 system; the amanda packages weren't updated, but many
libraries were.

Any pointers on what to chase here?

Versions:

Server: 3.3.1-4, Debian 7.9, 32-bit (possible that matters?)

Problem clients: 3.3.3-2, Ubuntu 14.04


Debug output from amcheck -c daily on the problem host:

Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: pid 21287 ruid 34 euid 34 version 3.3.3: start at Mon Feb 22 08:45:04 2016
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: security_getdriver(name=bsdtcp) returns 0x7f5a725c99c0
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: version 3.3.3
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad:     build: VERSION="Amanda-3.3.3"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: BUILT_DATE="Tue Jan 7 21:16:20 UTC 2014" BUILT_MACH=""
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: BUILT_REV="5099" BUILT_BRANCH="community_3_3_3" CC="gcc"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad:     paths: bindir="/usr/sbin" sbindir="/usr/sbin"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: libexecdir="/usr/lib/amanda"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: amlibexecdir="/usr/lib/amanda" mandir="/usr/share/man"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: AMANDA_TMPDIR="/tmp/amanda"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: AMANDA_DBGDIR="/var/log/amanda" CONFIG_DIR="/etc/amanda"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: DEV_PREFIX="/dev/" RDEV_PREFIX="/dev/r"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: DUMP="/sbin/dump" RESTORE="/sbin/restore" VDUMP=UNDEF
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: VRESTORE=UNDEF XFSDUMP="/sbin/xfsdump"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: XFSRESTORE="/sbin/xfsrestore" VXDUMP=UNDEF VXRESTORE=UNDEF
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/tar"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: LPRCMD=UNDEF MAILER=UNDEF
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: listed_incr_dir="/var/lib/amanda/gnutar-lists"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad:     defs: DEFAULT_SERVER="localhost" DEFAULT_CONFIG="DailySet1"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: DEFAULT_TAPE_SERVER="localhost" DEFAULT_TAPE_DEVICE=""
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: NEED_STRSTR AMFLOCK_POSIX AMFLOCK_FLOCK AMFLOCK_LOCKF
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: AMFLOCK_LNLOCK SETPGRP_VOID AMANDA_DEBUG_DAYS=4 BSD_SECURITY
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: USE_AMANDAHOSTS CLIENT_LOGIN="backup" CHECK_USERID HAVE_GZIP
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: security_handleinit(handle=0xee18f0, driver=0x7f5a725c99c0 (BSDTCP))
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: security_streaminit(stream=0xee1ad0, driver=0x7f5a725c99c0 (BSDTCP))
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: authenticated peer name is 'croaker.c.c'
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: accept recv REQ pkt:
<<<<<
SERVICE noop
OPTIONS features=ffffffff9efefbffffffffff1f;
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: creating new service: noop
OPTIONS features=ffffffff9efefbffffffffff1f;

Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: sending ACK pkt:
<<<<<
Mon Feb 22 08:45:04 2016: thd-0xed6600: amandad: tcpm_send_token: data is still flowing
Mon Feb 22 08:45:05 2016: thd-0xed6600: amandad: sending REP pkt:
<<<<<
OPTIONS features=ffffffff9efefbffffffffff1f;
Mon Feb 22 08:45:15 2016: thd-0xed6600: amandad: timeout
Mon Feb 22 08:45:15 2016: thd-0xed6600: amandad: sending REP pkt:
<<<<<
OPTIONS features=ffffffff9efefbffffffffff1f;
Mon Feb 22 08:45:15 2016: thd-0xed6600: amandad: tcpm_send_token: data is still flowing
Mon Feb 22 08:45:25 2016: thd-0xed6600: amandad: timeout
Mon Feb 22 08:45:25 2016: thd-0xed6600: amandad: sending REP pkt:
<<<<<
OPTIONS features=ffffffff9efefbffffffffff1f;
Mon Feb 22 08:45:35 2016: thd-0xed6600: amandad: timeout
Mon Feb 22 08:45:35 2016: thd-0xed6600: amandad: sending REP pkt:
<<<<<
OPTIONS features=ffffffff9efefbffffffffff1f;
Mon Feb 22 08:45:35 2016: thd-0xed6600: amandad: tcpm_send_token: data is still flowing
Mon Feb 22 08:45:45 2016: thd-0xed6600: amandad: timeout
Mon Feb 22 08:45:45 2016: thd-0xed6600: amandad: sending REP pkt:
<<<<<
OPTIONS features=ffffffff9efefbffffffffff1f;
Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: timeout
Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: timeout waiting for ACK for our REP
Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: security_close(handle=0xee18f0, driver=0x7f5a725c99c0 (BSDTCP))
Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: security_stream_close(0xee1ad0)
Mon Feb 22 08:45:55 2016: thd-0xed6600: amandad: tcpm_send_token: data is still flowing
Mon Feb 22 08:46:04 2016: thd-0xed6600: amandad: timeout exit
Mon Feb 22 08:46:04 2016: thd-0xed6600: amandad: pid 21287 finish time Mon Feb 22 08:46:04 2016
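
For comparing a failing run with a working one, the retransmit pattern in a debug log like the one above can be tallied with a few lines of shell. This is a rough sketch: count_in_log and summarize_amandad_log are hypothetical helper names, and the two search strings are simply the messages that appear in this log.

```shell
#!/bin/sh
# count_in_log is a hypothetical helper: count occurrences of a fixed string
# in a file, even when several log entries share one physical line.
count_in_log() {
    grep -o -F -- "$1" "$2" | wc -l | tr -d ' '
}

# Print how often the client sent its REP and how often it gave up waiting
# for the server's ACK, for the debug file given as the first argument.
summarize_amandad_log() {
    log="$1"
    echo "REP sends: $(count_in_log 'sending REP' "$log")"
    echo "ACK timeouts: $(count_in_log 'timeout waiting for ACK' "$log")"
}
```

Call summarize_amandad_log with the path of an amandad debug file under AMANDA_DBGDIR (/var/log/amanda in the build info above). For the log in this message it should report five REP sends and one ACK timeout.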


Thanks,

--

Bill Carlson

Anything is possible, given Time and Money.
