Allen, I had the same issue on my 5.5 servers backing up 6.2.x.x client system states, and was able to remedy it by moving my database and log files to faster drives.
Historically, my TSM servers have not had sufficient local storage to accommodate these volumes (as well as cached disk storage), so they lived on an EMC Clariion, which had both SATA and Fibre drives. At some point during a SAN upgrade, it appears that the TSM storage was moved to the slower SATA drives. I had been chasing pinned logs and SLOW system state backups for months, and last week, as a test, we decided to move the TSM database, logs, and disk pools to an EMC Vmax.
Since then, log usage has not been over 20% and system state backups are completing with almost no lag time. By comparison, for the previous 3-4 months the log files had been at 85-98 percent utilization every night, which, as you probably know, usually ends with tasks being canceled to prevent a total system failure.
While I realize that not everyone has the option of faster storage, I thought it important to re-emphasize how the storage layout and disk speed for the TSM database/log storage can affect server operation. Before the test/change, system state backups would appear to hang for several hours while calculating what needed to be backed up, and any large file would pin the log.
~Rick
Jacksonville, FL.
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Allen S. Rout
Sent: Monday, April 16, 2012 2:22 PM
To: [email protected]
Subject: [ADSM-L] Objects Assigned vs. Your Database.
Howdy, TSM folks.
So I think I've gotten to the bottom of a performance issue I've been seeing recently (and a crash!) and I wanted to compare notes.
Cutting to the chase, I've been seeing obnoxious log consumption on one of my TSM servers every night recently, and once, a few months ago, it got bad enough that it blew out the log and crashed with the 'gotta extend the log' situation. Routine recovery, but irritating.
At the moment, I'm attributing the problem to a growing population of v6 clients which, when doing system state backups, are "reassigning" objects from the previous backup to the current one.
Now, I understand why they're doing this: it gets us back into the incremental zone, away from the miserable 'system state :== full' situation. But the law of unintended consequences comes in stage left. Since much (most?) of the system state is in fact static, each machine is going to reassign most of its system state on every backup. How fast can it do that? How fast is your database? That's how fast.
Last night I watched what felt like a normally busy evening, and the log-full percentage was growing before my eyes; as in, wait a minute and see it advance three percent, and that's on an 11692MB log. So I've got my trigger set at 60%, but it's blowing through the remaining 40 like nothing. I get to the 'server log is [foo] full, delaying transactions' state on a regular basis.
As a band-aid, I'm going to talk to the customers and see if some of this population of machines can rationally be excluded from SYSTEM STATE backups: offhand, I think their DR plans don't include BMR from my TSM. But that's short-term thinking.
What are you-all doing about this? Increasing the number of DB incrs between fulls? Something else?
- Allen S. Rout
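For a rough sense of the numbers Allen quotes (an 11692MB log, roughly three percent growth per minute, a trigger at 60%), here is a back-of-the-envelope sketch in Python. It assumes the fill rate stays constant, which a busy server will not guarantee, and it treats the three-percent figure as the eyeballed estimate it is. The function names are made up for illustration; they are not TSM commands or options.

```python
# Back-of-the-envelope estimate of recovery log headroom.
# The three constants are the figures quoted in the thread; the constant
# fill rate is an assumption, not something the server promises.

LOG_SIZE_MB = 11692          # recovery log size quoted in the thread
GROWTH_PCT_PER_MIN = 3.0     # observed growth, percent of log per minute
TRIGGER_PCT = 60.0           # log-full percentage at which the trigger fires


def fill_rate_mb_per_min(log_size_mb: float, growth_pct_per_min: float) -> float:
    """Convert a percent-per-minute growth rate into MB per minute."""
    return log_size_mb * growth_pct_per_min / 100.0


def minutes_until_full(current_pct: float, growth_pct_per_min: float) -> float:
    """Minutes left before the log reaches 100%, at a constant growth rate."""
    return (100.0 - current_pct) / growth_pct_per_min


rate = fill_rate_mb_per_min(LOG_SIZE_MB, GROWTH_PCT_PER_MIN)
headroom = minutes_until_full(TRIGGER_PCT, GROWTH_PCT_PER_MIN)

print(f"Log is filling at about {rate:.0f} MB/min")
print(f"From the {TRIGGER_PCT:.0f}% trigger, about {headroom:.1f} minutes until full")
```

On those figures, whatever the trigger kicks off has only a handful of minutes to free log space before transactions start getting delayed, which fits what both posts describe: the speed of the database and log storage decides whether the server keeps up.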
