>Below are the two pertinent parts (I think) of an error that is occurring on
>an everyday basis on this server.  ...

What version of Amanda?  Same version on client and server?

>... There are 11 other servers that are being backed up
>just fine but this one just does not want to work.  ...

So what's different about it?  (just kidding :-).

>... Of course it was upgraded because we could not
>find a network card that worked decently ...

Are we talking 100 Mbit Ethernet?  Any chance you have a duplex problem
between the client and switch?  Getting that wrong can cause truly
amazingly bad performance.

>  web45.inte ad0s1f lev 0 FAILED [data timeout]
>  web45.inte ad0s1a lev 0 FAILED [could not connect to web45.internal]
>  web45.inte ad0s1e lev 0 FAILED [could not connect to web45.internal]

The first one says Amanda waited 30 minutes, either after starting the
dump or after the last block of data it received, and then gave up.
I'm guessing the other two failures happened because amandad was still
running on the client, so the new connections were not allowed.

If you're running 2.4.2 or beyond, you could increase the dtimeout
value in amanda.conf, but half an hour to wait on data is a long time.
I suspect it means something else is wrong.
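
For reference, a minimal sketch of what that change might look like in
amanda.conf on the server.  The value is in seconds.  3600 is only an
illustrative number, not a recommendation; 1800 corresponds to the half
hour mentioned above.

```
# amanda.conf (server side)
# dtimeout: seconds the server waits for dump data before giving up.
# 3600 is an illustrative value only; 1800 is the half hour noted above.
dtimeout 3600
```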

It would also be useful to go to the client and look at sendbackup*debug
in /tmp/amanda, in particular the start and stop times (first and last
lines).

The "index tee cannot write [Broken pipe]" stuff is just a symptom of
the server giving up on the client.  The server shut down the connections
and the client was still trying to write, so it got a "broken pipe" error.
The real problem is why the data stream quit moving.

Some other possibilities:

  * An **extremely** busy disk that tar has a hard time getting access
    to.

  * A busy client with compression turned on, so the dump just cannot
    get enough CPU.

  * A very large file system that compresses extremely well so tar and
    gzip are crunching along but not generating any (enough) output.

  * A broken disk that is very slow to respond (e.g. lots of retries).

  * Some kind of networking problem, software or hardware.  You might
    try some big ftp puts from the client to /dev/null on the server
    and see what happens.
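
If ftp is not handy, the same kind of bulk-transfer test can be roughed
out with a small script.  This is only a sketch and a stand-in for the
ftp test, not anything Amanda provides: the function names (run_sink,
send_bytes, loopback_demo) are made up for illustration.  The idea is to
run a sink on the server that throws away whatever it receives, push a
large block of zeros at it from the client, and time the transfer.

```python
import socket
import threading
import time


def run_sink(server_sock, result):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # peer closed the connection
                break
            total += len(chunk)
    result["received"] = total


def send_bytes(host, port, nbytes, block=65536):
    """Send nbytes of zeros to host:port; return (bytes sent, seconds)."""
    payload = bytes(block)
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        while sent < nbytes:
            n = min(block, nbytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return sent, time.monotonic() - start


def loopback_demo(nbytes=1_000_000):
    """Exercise both halves over 127.0.0.1 to check the script itself."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    sink = threading.Thread(target=run_sink, args=(server, result))
    sink.start()
    sent, elapsed = send_bytes("127.0.0.1", port, nbytes)
    sink.join()
    server.close()
    return sent, result["received"], elapsed
```

For a real test, run_sink would listen on a port of your choosing on the
server and send_bytes would be called from the client with a much larger
byte count.  A transfer that stalls or crawls points at the network
(duplex mismatch included) rather than at tar or gzip.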

>Ryan Williams

John R. Jackson, Technical Software Specialist, [EMAIL PROTECTED]
