Hi all,
I get intermittent client timeouts when running amdump. Has anyone
solved an intermittent timeout like this? Does anyone know how amanda
chooses the communication ports it uses? The clients are OpenBSD,
FreeBSD, and RH Linux 7.2. The server is running amanda 2.4.2p2 on RH
Linux 7.2. I have upgraded GNU tar everywhere to at least 1.13.19. Each
run is a level 0.
For testing purposes I have removed all the clients except for a FreeBSD
4.5 client with 4 partitions (none larger than 1 GB). This client runs
an ipfw firewall, but it is open to the amanda server.
When running amdump I usually get one partition timing out. Which
partition fails varies with what is being dumped. When I dump all of the
partitions, it is one particular partition. When I dump all of the
partitions except that one, a different partition fails with the same
error. I have not yet tried omitting more than one partition. The dump
has gone through without a partition timing out, but that is very rare.
amcheck does not show any errors.
Looking through the logs, the server tries to contact the client. The
first two stream_server ports opened by sendbackup are contacted fine,
but the third, the index port, times out, e.g. (on the server side, x's
inserted):
dumper: stream_client: connected to 128.x.x.x.10081
dumper: stream_client: our side is 0.0.0.0.10080
dumper: stream_client: connected to 128.x.x.x.10082
dumper: stream_client: our side is 0.0.0.0.10081
dumper: stream_client: connect(3496) failed: Connection timed out
driver: result time 826.558 from dumper0: TRY-AGAIN 01-00006 [could not
connect to index port: Connection timed out]
This happens in nearly every backup run. I have compiled amanda with a
portrange of 10080 to 10085, so I'm not sure why amanda chose port 3496.
Usually the timeout involves a port somewhere between 3000 and 6000. The
odd thing is that the client debug files indicate that the client is
waiting on the same port, e.g. (on the client side, x's inserted):
sendbackup: stream_server: waiting for connection: 0.0.0.0.10081
sendbackup: stream_server: waiting for connection: 0.0.0.0.10082
sendbackup: stream_server: waiting for connection: 0.0.0.0.3496
(that is, waiting for a connection on 10081, then 10082, then 3496)
sendbackup: stream_accept: connection from 128.x.x.x.10080
sendbackup: stream_accept: connection from 128.x.x.x.10081
sendbackup: stream_accept: timeout after 30 seconds
sendbackup: timeout on index port 3496
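For anyone who wants to check their own client debug files for the same symptom, here is a minimal sketch in Python. It only assumes the "stream_server: waiting for connection" line format shown above and the 10080-10085 range I compiled in; the function name and the sample list are mine, not anything that ships with amanda.

```python
import re

# Port range passed to configure (--with-portrange=10080,10085 in this post).
PORT_RANGE = (10080, 10085)

# Matches e.g. "sendbackup: stream_server: waiting for connection: 0.0.0.0.3496"
# and captures the digits after the last dot as the port number.
LISTEN_RE = re.compile(r"stream_server: waiting for connection: \S+\.(\d+)\s*$")


def ports_outside_range(lines, port_range=PORT_RANGE):
    """Return the listening ports in the debug lines that fall outside port_range."""
    low, high = port_range
    outside = []
    for line in lines:
        match = LISTEN_RE.search(line)
        if match:
            port = int(match.group(1))
            if not low <= port <= high:
                outside.append(port)
    return outside


# The three client-side lines quoted above.
sample = [
    "sendbackup: stream_server: waiting for connection: 0.0.0.0.10081",
    "sendbackup: stream_server: waiting for connection: 0.0.0.0.10082",
    "sendbackup: stream_server: waiting for connection: 0.0.0.0.3496",
]
print(ports_outside_range(sample))  # [3496]
```

Feeding it the lines of a sendbackup debug file instead of the sample list should show how often a port outside the compiled range turns up.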
Normally, I would think that there's some sort of firewall keeping the
two from communicating on any port other than 10080-10085, but I am
certain that the firewall would allow this. There is no router or other
such device between the two. The amanda.conf includes these timeouts:
etimeout 1000
dtimeout 27600
ctimeout 2700
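On the firewall question, a plain TCP connect test from the server to the client is one way to double-check a given port independently of amanda. The sketch below is only that, a sketch: the helper name and the host name in the usage comment are hypothetical, and something has to be listening on the client port for a successful connect to be possible.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Hypothetical usage, run on the amanda server while a test listener is
# bound to the same port on the client:
#   can_connect("client.example.org", 3496)
```

A False result that takes the full timeout would suggest packets being dropped somewhere, while an immediate False would more likely mean nothing was listening on that port.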
I have plenty of bandwidth, dumpers set to 10 (for 4 partitions), plenty
of room on my holding disk, and I am using gnutar with indexing.
So, does anyone know why amanda would choose a port such as 3496 after
being compiled --with-portrange=10080,10085, or how to get rid of this
error?
--
Lalo Castro
Programmer/Analyst
McHenry Library
(831) 459-5208