> We are running TSM Version 4.1 on AIX 4.3.3, both Server and client.
>
> We have one node that for some reason has started taking a lot longer to
> complete its backup than any of the other similar nodes.
>
> When examining various log files we have found that the process seems to stop
> for 7 - 8 hours and then all of a sudden kicks off again.
>
> Here is an excerpt from various logs ...

I have seen the kind of behavior shown in the logs on client systems that back up directly to tape. The trouble occurs when the client decides that a TCP/IP connection has failed and the server does not. The client starts a new session, and the server decides to put the data on the same tape the old session was using. From the server's point of view the old session is still using the tape, so the new session is forced to wait for the tape to become available. Eventually the server kills off the old session when it reaches the idle time limit. At that point the new session is allowed to start writing to the tape.

When we have seen this, there have almost always been other signs that the network infrastructure was malfunctioning. We have never had a client system show this failure mode day after day.
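
To make the timing of the stall concrete, here is a rough Python sketch of the sequence described above. It is only a model: the function name and all of the numbers are hypothetical and are not taken from the posted logs, and it assumes the server frees the tape at the moment the stale session reaches the idle time limit.

```python
def tape_wait_minutes(old_last_activity_min, new_session_start_min, idle_timeout_min):
    """Estimate how long a new session waits for a tape held by a stale session.

    All arguments are minutes measured from the same arbitrary starting point.
    Assumption: the server releases the tape exactly when the old session
    has been idle for idle_timeout_min minutes.
    """
    # The stale session is cancelled once it has been idle for the full limit.
    tape_released_at = old_last_activity_min + idle_timeout_min
    # The new session cannot write until then; it never waits a negative time.
    return max(0, tape_released_at - new_session_start_min)


# Hypothetical numbers: the old session last moved data at minute 60, the
# client gave up and reconnected at minute 65, and the idle limit is 480 minutes.
wait = tape_wait_minutes(60, 65, 480)
print(f"new session waits about {wait / 60:.1f} hours for the tape")
```

With a model like this, a stall of 7 to 8 hours would correspond to an idle time limit of roughly that length, so comparing the observed gap with the server's configured idle limit is a quick way to check whether this explanation fits.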
