>> Actually, we're not using journaling on the server in question. The journal >> service hasn't been installed.
The only other possibility is that you are using Named Pipes for communicating with the TSM server (verify by checking CommMethod option in dsm.opt). The error (rc 233) indicates that the server broke the connection for some reason which should have caused the client to try to reconnect to the server or terminate the backup if the connection couldn't be re-established. Pete Tanenhaus Tivoli Storage Solutions Software Development email: [EMAIL PROTECTED] tieline: 320.8778, external: 607.754.4213 "Those who refuse to challenge authority are condemned to conform to it" ---------------------- Forwarded by Pete Tanenhaus/San Jose/IBM on 06/10/2004 11:50 AM --------------------------- Please respond to "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]> Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] cc: Subject: Re: Long pause in the middle of an NT backup. Actually, we're not using journaling on the server in question. The journal service hasn't been installed. --Matt -----Original Message----- From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Pete Tanenhaus Sent: Thursday, June 10, 2004 10:12 AM To: [EMAIL PROTECTED] Subject: Re: Long pause in the middle of an NT backup. > dsmerror.log: > > 06/08/2004 01:30:48 win32NpWrite(): Error 233 writing to named pipe This indicates that the backup client has encountered an error sending a journal notification to the journal service. Most likely what happened is that the error occurred while the client was performing the initial non-journal based backup of a drive with an active but not yet valid journal. The journal service probably encountered an error reading a notification from the backup client and severed the connection with the client (check jbberror.log for a return code of 998), and any subsequent notifications sent to the journal service by the client would result in broken connection errors (this is what rc 233 is). Since the client can't send any more notifications the final "Incremental Complete" notification is never sent to the journal service and the journal for the drive never set to the valid state, which means that the next backup of the drive won't be journal based. This problem is addressed by apar IC40627 and will be fixed in the next 5.23 fixtest (check with IBM service for availability). I'm not sure if this answers all your questions as to why the backup is so slow but it should at least answer why journaling isn't working. Pete Tanenhaus Tivoli Storage Solutions Software Development email: [EMAIL PROTECTED] tieline: 320.8778, external: 607.754.4213 "Those who refuse to challenge authority are condemned to conform to it" ---------------------- Forwarded by Pete Tanenhaus/San Jose/IBM on 06/10/2004 10:58 AM --------------------------- Please respond to "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]> Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] cc: Subject: Re: Long pause in the middle of an NT backup. I don't remember at what client level the syntax changed, but at client level 5.2 you can add a "testflag instrument:detail" to your dsm.opt and generate a report to dsminstr.report file that details how the session is spending it's time. David >>> [EMAIL PROTECTED] 6/9/2004 3:55:32 PM >>> Thanks for the responses. I'm going to implement a cron job tonight to query the sessions every few minutes to see exactly what the session in question is doing during the "inactive" time. Journaling (as suggested in another message) is currently being tested for some on one of our servers. We had problems with it in the past but are hoping it's improved to the point where it's usable to us. Regarding fragmentation, I hadn't realized that could be related to this behavior, but it does make sense. In fact, we know we're horribly fragmented but are unable to take the necessary downtime to unload/reload the database in the foreseeable future. Thanks! --Matt -----Original Message----- From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Andy Carlson Sent: Wednesday, June 09, 2004 2:47 PM To: [EMAIL PROTECTED] Subject: Re: Long pause in the middle of an NT backup. One thing that appens during that time is that the TSM server is downloading directory entries to the client, to sort through and decide what to back up. With millions of entries (which it appears you have), it can take awhile. We had a situation with a terribly fragmented DB, where this pause was lasting 10-15 hours on some boxes. We did an unload/load, and they have all but one gone back to "normal" (the one that didn't has 5.5Million entries, and does not have enough memory on the client box). On Tue, 8 Jun 2004, Anderson, Matt wrote: > I'm experiencing behavior on some NT nodes where, in the middle of a > scheduled backup, there will be a long pause where no activity seems > to occur. It would appear to occur after the backup of one drive > finishes, but before the next one starts. For example, see this log > excerpt from this > morning: > > 06/08/2004 01:30:46 ANS1898I ***** Processed 1,911,000 files ***** > 06/08/2004 01:30:47 ANS1898I ***** Processed 1,916,000 files ***** > 06/08/2004 01:30:49 ANS1898I ***** Processed 1,916,500 files ***** > 06/08/2004 01:31:50 Normal File--> 141,971,832 > \\my-server\f$\PerforceETC\Logs\Log.txt Changed > 06/08/2004 01:32:34 Retry # 1 Normal File--> 141,971,832 > \\my-server\f$\PerforceETC\Logs\Log.txt [Sent] > 06/08/2004 01:32:34 Successful incremental backup of '\\my-server\f$' > > 06/08/2004 07:56:10 ANS1898I ***** Processed 1,917,000 files ***** > 06/08/2004 07:56:13 ANS1898I ***** Processed 1,917,500 files ***** > 06/08/2004 07:56:15 ANS1898I ***** Processed 1,918,000 files ***** > 06/08/2004 07:56:17 ANS1898I ***** Processed 1,918,500 files ***** > > There's almost 6.5 hours where no activity was logged in the > dsmsched.log. There is a (possibly) corresponding message in the > dsmerror.log: > > 06/08/2004 01:30:48 win32NpWrite(): Error 233 writing to named pipe > > The client is Windows 2000 running TSM client version 5.2.2.5 The TSM > server is AIX 5.2 running TSM server version 5.2.2.2 > > Any insight that can be provided would be appreciated! > > Thanks, > > Matt Anderson > Backup & Recovery Engineer > > ********************************************************************** > ****** > This email may contain confidential material. > If you were not an intended recipient, > Please notify the sender and delete all copies. > We may monitor email to and from our network. > **************************************************************************** > > **************************************************************************** This email may contain confidential material. If you were not an intended recipient, Please notify the sender and delete all copies. We may monitor email to and from our network. **************************************************************************** **************************************************************************** This email may contain confidential material. If you were not an intended recipient, Please notify the sender and delete all copies. We may monitor email to and from our network. ****************************************************************************
