Howdy, TSM folks. I think I've gotten to the bottom of a performance issue I've been seeing recently (and a crash!), and I wanted to compare notes.
Cutting to the chase: I've been seeing obnoxious log consumption on one of my TSM servers every night recently, and a few months ago it got bad enough that it blew out the log and crashed the server into the 'gotta extend the log' situation. Routine recovery, but irritating.

At the moment, I'm attributing the problem to a growing population of v6 clients which, when doing system state backups, are "reassigning" objects from the previous backup to the current one. Now, I understand why they're doing this: it gets us back into the incremental zone, out of the miserable 'system state :== full' situation. But the law of unintended consequences enters stage left. Since much (most?) of the system state is in fact static, each machine is going to reassign most of its system state objects every night. How fast can it do that? How fast is your database? That's how fast.

Last night I watched what felt like a normally busy evening, and the log-full percentage was growing before my eyes: wait a minute and watch three percent go by, and that's on an 11692MB log. I've got my trigger set at 60%, but it blows through the remaining 40 like nothing, and I land in the 'server log is [foo] full, delaying transactions' state on a regular basis.

As a band-aid, I'm going to talk to the customers and see whether some of this population of machines can rationally be excluded from SYSTEM STATE backups; offhand, I think their DR plans don't include BMR from my TSM. But that's short-term thinking. What are you-all doing about this? Increasing the number of DB incrementals between fulls? Something else?

- Allen S. Rout
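P.S. For anyone weighing the same band-aid: on the Windows v6 BA clients, dropping system state from the backup domain is a one-line dsm.opt change. This sketch assumes the client is using the default ALL-LOCAL domain; check the option syntax against your own client level's documentation before rolling it out.

```
* dsm.opt fragment (Windows BA client)
* ALL-LOCAL implies SYSTEMSTATE on supported Windows levels;
* the leading dash removes it from the incremental domain.
DOMAIN ALL-LOCAL -SYSTEMSTATE
```

Obviously only appropriate for machines whose DR plan doesn't depend on BMR from TSM.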
