I've got a client who's failing to restore an Exchange store, and he's not really getting any suggestive errors. This kind of surprises me; I'm used to the problem being pretty nicely pointed to, one way or another.
Windows Server 2003 R2 SP1
Exchange agent Version 5, Release 4, Level 0.2
My TSM server is Version 5, Release 3, Level 4.0

The full transfers neatly, and then when he starts working on applying incrementals, the following happens:

He sees (e.g.)

  06/14/2007 16:01:22 ANS1017E Session rejected: TCP/IP connection failure

while I see (these are the enclosing lines which mention the target server)

  06/14/07 15:47:22 ANE4991I (Session: 28805, Node: EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange: Starting full restore of storage group Housing Storage Group 1 to server EXCHANGETEST.(SESSION: 28805)
  [...]
  06/14/07 15:59:42 ANR0481W Session 28805 for node EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) terminated - client did not respond within 60 seconds. (SESSION: 28805)
  [...]
  06/14/07 16:01:23 ANR0406I Session 28839 started for node EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) (Tcp/Ip x47-215.housing.ufl.edu(8801)). (SESSION: 28839)
  06/14/07 16:01:23 ANE4991I (Session: 28839, Node: EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange: Starting incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST.(SESSION: 28839)
  06/14/07 16:01:23 ANE4993E (Session: 28839, Node: EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3508 Data Protection for Exchange: incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST failed, rc = 425.(SESSION: 28839)

And then there are another 50-100 iterations of the "Starting incr restore / failed with rc = 425"

The rc=425 doesn't appear to correlate with either an API return code or an agent code. The 3508 doesn't enlighten much, sounds like it may be a RC from the Exchange server? In any case, I don't have any rejected connections to the box, so it looks to him like I rejected, and looks to me like he went away.
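For anyone wanting to line these up per session rather than by eye, here is a rough Python sketch that groups the activity-log excerpts above by session number, collects any "rc = N" values, and works out when the idle window reported by ANR0481W would have started. The function name summarize and the regular expressions are my own invention for illustration, not anything shipped with TSM, and the line format is assumed from the excerpts only (one message per line, unwrapped).

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed layout, inferred from the excerpts only:
#   MM/DD/YY HH:MM:SS <message id> <text>(SESSION: n)
LINE = re.compile(
    r"^(?P<ts>\d\d/\d\d/\d\d \d\d:\d\d:\d\d) "
    r"(?P<msg>AN[ER]\d{4}[IWE]) "
    r"(?P<text>.*)\(SESSION: (?P<session>\d+)\)$"
)
RC = re.compile(r"rc = (\d+)")
IDLE = re.compile(r"did not respond within (\d+) seconds")

NODE = "EXCHANGE-IS.HOUSING.UFL.EDU"
SG = "storage group Housing Storage Group 1 to server EXCHANGETEST"

# The five server-side lines quoted above, rebuilt as single lines.
SAMPLE = "\n".join([
    f"06/14/07 15:47:22 ANE4991I (Session: 28805, Node: {NODE}) TDP MSExchg "
    f"ACN3506 Data Protection for Exchange: Starting full restore of {SG}."
    "(SESSION: 28805)",
    f"06/14/07 15:59:42 ANR0481W Session 28805 for node {NODE} (TDP MSExchg) "
    "terminated - client did not respond within 60 seconds. (SESSION: 28805)",
    f"06/14/07 16:01:23 ANR0406I Session 28839 started for node {NODE} "
    "(TDP MSExchg) (Tcp/Ip x47-215.housing.ufl.edu(8801)). (SESSION: 28839)",
    f"06/14/07 16:01:23 ANE4991I (Session: 28839, Node: {NODE}) TDP MSExchg "
    f"ACN3506 Data Protection for Exchange: Starting incr restore of {SG}."
    "(SESSION: 28839)",
    f"06/14/07 16:01:23 ANE4993E (Session: 28839, Node: {NODE}) TDP MSExchg "
    f"ACN3508 Data Protection for Exchange: incr restore of {SG} failed, "
    "rc = 425.(SESSION: 28839)",
])


def summarize(log_text):
    """Group log lines by session; collect message ids, rc values, idle start."""
    sessions = defaultdict(lambda: {"msgs": [], "rcs": [], "idle_since": None})
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip "[...]" markers and anything not in the assumed format
        ts = datetime.strptime(m["ts"], "%m/%d/%y %H:%M:%S")
        entry = sessions[int(m["session"])]
        entry["msgs"].append(m["msg"])
        entry["rcs"] += [int(rc) for rc in RC.findall(m["text"])]
        idle = IDLE.search(m["text"])
        if idle:
            # The server gave up N seconds after it last heard from the client.
            entry["idle_since"] = ts - timedelta(seconds=int(idle.group(1)))
    return dict(sessions)


if __name__ == "__main__":
    for session, info in sorted(summarize(SAMPLE).items()):
        print(session, info["msgs"], info["rcs"], info["idle_since"])
```

Fed a full activity-log extract, the same function would show at a glance how many sessions ended in rc = 425 and whether each one followed a timeout on the previous session.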
I'm thinking that the connect failure is in fact a delayed message: the Exchange client got busy at some point (say at 15:58:42, 60 seconds before the server's termination message), and a few minutes later the agent gets around to sending packets down that pipe and finds a failure. I understand the agents to be notoriously bad at picking up the connections, and googling around found similar errors to be well addressed by increasing COMMTIMEOUT on the server.

I increased my COMMTIMEOUT to a few hours, to be sure I wasn't blowing him off unnecessarily, and he started failing with a

  06/15/07 11:51:45 ANR0480W Session 31503 for node EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) terminated - connection with client severed. (SESSION: 31503)

at "about" the same point in the process. He reconnects, and then

  06/15/07 11:51:45 ANR0406I Session 31845 started for node EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) (Tcp/Ip x47-215.housing.ufl.edu(2432)). (SESSION: 31845)
  06/15/07 11:51:45 ANE4991I (Session: 31845, Node: EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange: Starting incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST.(SESSION: 31845)
  06/15/07 11:51:45 ANE4993E (Session: 31845, Node: EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3508 Data Protection for Exchange: incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST failed, rc = 425.(SESSION: 31845)

I don't think there's an option analogous to COMMTIMEOUT for the client side; am I missing something?

I'm gearing up to open a ticket, but I figured I'd send up a flare Just In Case.

- Allen S. Rout
