Problem/cause solved - it is not the issue with systemstate backups. It is a new app with a HUGE (142GB) log file that is constantly changing (folks always forget CHANGINGRETRIES 0 which IMHO should be the default). It is taking 3-hours for each compress/send pass of this one file.
Anybody know what the heck WEBINATOR (thank you Ahhhhhhnold for all products that are named *NATOR) is? Something tells me that a 142GB log file isn't normal but then I am not the product owner/administrator. 03/09/2010 02:47:08 --- SCHEDULEREC OBJECT BEGIN WEBSERVERS 03/09/2010 01:00:00 03/09/2010 02:47:09 Incremental backup of volume '\\adata\c$' 03/09/2010 02:47:09 Incremental backup of volume '\\adata\d$' 03/09/2010 02:47:09 Incremental backup of volume 'SYSTEMSTATE' 03/09/2010 02:47:16 Performing a full, TSM backup of object 'COM+ REGDB Writer' component 'COM+ REGDB' using shadow copy. 03/09/2010 02:47:16 Performing a full, TSM backup of object 'Registry Writer' component 'Registry' using shadow copy. 03/09/2010 02:47:16 Performing a full, TSM backup of object 'System Writer' component 'System Files' using shadow copy. 03/09/2010 02:47:16 Performing a full, TSM backup of object 'Event Log Writer' component 'Event Logs' using shadow copy. 03/09/2010 02:47:16 Performing a full, TSM backup of object 'IIS Metabase Writer' component 'IISMETABASE' using shadow copy. 03/09/2010 02:47:16 Performing a full, TSM backup of object 'WMI Writer' component 'Windows Managment Instrumentation' using shadow copy. 03/09/2010 02:47:25 Comparing local system files with active version on the server. This may take a long time depending on the number of files... 03/09/2010 02:47:25 Directory--> 0 \\adata\c$\WINDOWS\repair\Backup\BootableSystemState\ComRegistrationDatabase [Sent] [snippage] 03/09/2010 02:48:02 Backup of object 'SystemState' component 'System State' finished successfully. 03/09/2010 02:48:04 Successful incremental backup of 'ADATA\SystemState\NULL\System State\SystemState' 03/09/2010 02:48:04 Normal File--> 3,626 \\adata\c$\adsm.sys\ASR\asr.sif [Sent] 03/09/2010 02:48:04 Normal File--> 16,044 \\adata\c$\adsm.sys\ASR\asrpnp.sif [Sent] 03/09/2010 02:48:04 Normal File--> 2,486 \\adata\c$\adsm.sys\ASR\tsmasr.cmd [Sent] 03/09/2010 02:48:04 Normal File--> 1,470 \\adata\c$\adsm.sys\ASR\tsmasr.opt [Sent] 03/09/2010 02:48:04 Normal File--> 15,384 \\adata\c$\adsm.sys\ASR\waitforevent.exe [Sent] 03/09/2010 02:48:04 Successful incremental backup of 'Automated System Recovery' 03/09/2010 02:48:04 Directory--> 0 \\adata\d$\Webinator [Sent] 03/09/2010 02:48:04 Normal File--> 1,816 \\adata\d$\Webinator\license.key [Sent] 03/09/2010 02:48:04 Directory--> 0 \\adata\d$\Webinator\Texis\ada [Sent] 03/09/2010 02:48:04 Directory--> 0 \\adata\d$\Webinator\Texis\dlkmstest Changed 03/09/2010 02:48:04 Retry # 1 Directory--> 0 \\adata\d$\Webinator [Sent] 03/09/2010 02:48:04 Retry # 1 Normal File--> 1,816 \\adata\d$\Webinator\license.key [Sent] 03/09/2010 02:48:04 Retry # 1 Directory--> 0 \\adata\d$\Webinator\Texis\ada [Sent] 03/09/2010 02:48:05 Retry # 1 Directory--> 0 \\adata\d$\Webinator\Texis\dlkmstest [Sent] 03/09/2010 02:48:05 Directory--> 0 \\adata\d$\Webinator\Texis\vcu Changed 03/09/2010 02:48:05 Retry # 2 Directory--> 0 \\adata\d$\Webinator\Texis\dlkmstest [Sent] 03/09/2010 02:48:05 Retry # 1 Directory--> 0 \\adata\d$\Webinator\Texis\vcu [Sent] 03/09/2010 02:48:07 Normal File--> 13,245,910 \\adata\d$\Webinator\Texis\monitor.log [Sent] 03/09/2010 06:05:14 Normal File--> 153,358,399,985 \\adata\d$\Webinator\Texis\vortex.log Changed 03/09/2010 09:57:06 Retry # 1 Normal File--> 153,358,411,322 \\adata\d$\Webinator\Texis\vortex.log Changed 03/09/2010 13:40:09 Retry # 2 Normal File--> 153,358,459,153 \\adata\d$\Webinator\Texis\vortex.log Changed [snippage] From: David McClelland <[email protected]> To: [email protected] Date: 03/08/2010 10:32 AM Subject: Re: [ADSM-L] Long running backup pinning log Sent by: "ADSM: Dist Stor Manager" <[email protected]> Zoltan, It's the large gap between Systemstate backup finished and d$ finished that interests me. Particularly as '72,395 total objects inspected' doesn't seem like a very large number (for a Windows server, that's probably not a whole lot more than the standard system files), and almost certainly not a candidate to gain maximum benefit from journaling. The ANE4691 message you paraphrase is "Total number of bytes transferred" which has always suggested to me 'after compression'. I'd definitely have a look at what's really going on in verbose mode, checking retries and also how slow/fast your network data path is to the TSM Server and whether the 93.82GB is simply taking an awfully long time to transfer. 87% compression is remarkably good which I also find suspicious: is the client attempting to back up large sparse open database files on the D$ by any chance through a slow network connection? /DMc London, UK -----Original Message----- From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Zoltan Forray/AC/VCU Sent: 08 March 2010 14:53 To: [email protected] Subject: Re: [ADSM-L] Long running backup pinning log Thanks for the heads-up. Unfortunately, from reading the APAR, this box does not seem to have the symptoms. There is no significant gap in backing up each component of systemstate, unless this is hidden because of QUIET. We have changed this to VERBOSE to get more details on where it is slogging. >From what I see in the dsmsched log, this is what the last backup that finished (todays is still running), says: 01:43 - Backup started 01:47 - Systemstate backup finished 12:21 - d$ drive backup finished 12:25 - c$ drive backup finished Total objects examined - 72,395 Total objects backed up - 1,725 Total data backed-up (to paraphase) - 93.82Gb I did note the "Object compressed by" is 87%. Which brings me to the question - Is the "total amount backed up" after compression? My guess it lots of time spent in compression (IBM - how about some compression algorithm controls (low - medium - high). Haven't ruled out "retries" - will see when verbose is on. I don't see journaling making that much of a difference. The server has 2GB RAM of which 900M was free when I checked it this morning, so it doesn't seem to be constrained. The backup is currently running and the dsm service was using 10-20% CPU. Zoltan Forray TSM Software & Hardware Administrator Virginia Commonwealth University UCC/Office of Technology Services [email protected] - 804-828-4807 Don't be a phishing victim - VCU and other reputable organizations will never use email to request that you reply with your password, social security number or confidential personal information. For more details visit http://infosecurity.vcu.edu/phishing.html From: Andrew Raibeck <[email protected]> To: [email protected] Date: 03/05/2010 04:32 PM Subject: Re: [ADSM-L] Long running backup pinning log Sent by: "ADSM: Dist Stor Manager" <[email protected]> Hi Zoltan, Is the long-running backup working on system state? I haven't found a direct cause --> effect relationship to client APAR IC63094, but take a look at that APAR and see if, after applying the fix, the problem you are seeing goes away. I should mention to all: IC63094 describes a problem with long-running system state backups. This APAR may be of interest to you. Best regards, Andy Andy Raibeck IBM Software Group Tivoli Storage Manager Client Product Development Level 3 Team Lead Internal Notes e-mail: Andrew Raibeck/Hartford/i...@ibmus Internet e-mail: [email protected] IBM Tivoli Storage Manager support web page: http://www.ibm.com/support/entry/portal/Overview/Software/Tivoli/Tivoli_Stor age_Manager The only dumb question is the one that goes unasked. The command line is your friend. "Good enough" is the enemy of excellence. "ADSM: Dist Stor Manager" <[email protected]> wrote on 2010-03-05 16:26:01: > [image removed] > > Long running backup pinning log > > Zoltan Forray/AC/VCU > > to: > > ADSM-L > > 2010-03-05 16:26 > > Sent by: > > "ADSM: Dist Stor Manager" <[email protected]> > > Please respond to "ADSM: Dist Stor Manager" > > I recently started having issues with my log going over 80% (5.5.3). Doing > a "show logpinned" usually points to a long (18-hours) running backup > session for one particular node. > > This is starting to happen more and more frequently. > > Once I kill the session, the log utilization drops to almost nothing since > there has been a DB backup run earlier in the day. > > Any suggestions on how to handle this besides killing the session? The > node is a standard 2K3 box using the 6.1.2.0 client. It is just a very > slow/busy box. The long backups (times vary from 11-19 hours) often dump > 120GB of data. > Zoltan Forray > TSM Software & Hardware Administrator > Virginia Commonwealth University > UCC/Office of Technology Services > [email protected] - 804-828-4807 > Don't be a phishing victim - VCU and other reputable organizations will > never use email to request that you reply with your password, social > security number or confidential personal information. For more details > visit http://infosecurity.vcu.edu/phishing.html
