. We run a cron process every 20 minutes that checks how full the recovery log is; if it is 70% or more, the job does SHOW LOGPIN and cancels the session that is pinning the log. I've found that when a client has the log pinned, it may take quite a while to cancel the session, sometimes as long as an hour. That's why we cancel when the log is 70% full instead of waiting until 99%. These incidents get logged, and we investigate them.
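For anyone who wants to build something similar, here is a minimal sketch of the check described above. It is not our actual script. The function names are made up for illustration, the `select pct_utilized from log` query is an assumption about the server's SQL tables, and `run_admin_command` is a hypothetical callable that you would have to write yourself (presumably wrapping the dsmadmc administrative client). The cancel command is the one Richard quotes below.

```python
from typing import Callable

# Threshold from the description above: act at 70% full, not 99%.
LOG_FULL_THRESHOLD = 70.0


def parse_pct_utilized(output: str) -> float:
    """Return the first line of command output that parses as a number."""
    for line in output.splitlines():
        text = line.strip().rstrip("%")
        try:
            return float(text)
        except ValueError:
            continue  # skip headers, blank lines, and other non-numeric text
    raise ValueError("no numeric log utilization found in output")


def check_log(run_admin_command: Callable[[str], str],
              threshold: float = LOG_FULL_THRESHOLD) -> bool:
    """Cancel the log-pinning session if the log is at or over threshold.

    run_admin_command is a hypothetical helper that sends one command to
    the TSM server and returns its text output. Returns True if a cancel
    was issued, so the caller can log the incident for later follow-up.
    """
    # ASSUMPTION: this query returns the recovery log utilization percent.
    output = run_admin_command("select pct_utilized from log")
    if parse_pct_utilized(output) >= threshold:
        run_admin_command("show logpinned cancel")
        return True
    return False
```

Run from cron every 20 minutes, the caller would write a log entry whenever `check_log` returns True. Passing the command runner in as an argument keeps the threshold logic testable without a live server.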
When that happens I usually find that there is a networking problem, and typically it's a half-duplex link somewhere along the line between the TSM client and server. This problem won't be obvious with normal stuff like web browsing and SSH/telnet sessions - these may still appear to work OK. It is typically exposed only by a TSM backup, due to the much greater volumes of data moved. Traceroute cannot detect this.

You need to put up an NDT (Network Diagnostic Tool) network bandwidth tester and test this client machine on it. Your networking people might already have one of these, or you can use a public access NDT server. NDT can spot bad links such as ones set to half duplex, very quickly. It can also pinpoint bad cables. More information about NDT, including a list of public access NDT servers, is at http://e2epi.internet2.edu/ndt/.

Make sure you are not allowing any clients to back up if they are not on your local net. Client nodes out there on the Internet would pin the log frequently, until we disallowed them. We did this at the router level. ADSL is the worst, due to its smaller upload bandwidth - backup is uploading after all. We tell people with laptops that they can only use TSM when they bring the computer onto campus. Wi-fi links do not appear to cause log pinning, as long as the wireless router is connected directly to our campus network.

The other problem that can pin the log is a client backing up a very large file, slowly. We find Macs have more problems in this area than other types of clients, mostly due to the kinds of data people typically process on a Mac. Video files can be enormous. Consider limiting file size.

Roger Deschner      University of Illinois at Chicago     [EMAIL PROTECTED]
====You will take a long journey. Remember to export your variables.====

On Mon, 29 Sep 2008, Richard Sims wrote:

>If you're unsure that a posting got distributed, inspect the List
>archives to see if your posting made it into circulation.
>
>First and foremost, if your TSM server is being jeopardized by a
>client's behavior, you need to protect the server. You can do 'SHow
>LOGPINned Cancel' to terminate sessions or processes which are pinning
>the Recovery Log, as described in the TSM Problem Determination Guide
>- or simply cancel the session outright.
>
>Beyond that, someone needs to take a good look at what that client is
>doing, relative to healthy transaction processing. Someone could have
>set up something unreasonable, perhaps in ignorance of best practices;
>or there could be an odd condition causing the client to get stuck
>somewhere in the file system. Check back in your Activity Log for ANE
>session conclusion messages for that client, to see if it's performing
>B/A client work (rather than TDP) and check past session statistics
>for a sense of sizes, rates, and duration. This can reveal if what
>the client has been doing has been getting more outrageous over time,
>or whether the current session is anomalous. If the networking seems
>ploddingly slow from the stats, that can get fixed. It needs
>analysis. Talk to the client admin and see if they made recent
>changes, or are aware of unusual data activity.
>
> Richard Sims http://people.bu.edu/rbs/
>
