Hello, Al.

We have much to share. We are running OS/390 2.10 on a 7060-H50 (~120 MIPS) with approximately 100 Mbit network connectivity, and we have had many long-standing TSM performance problems. We are currently running TSM 4.2.3.2.

Working with TSM support, we discovered (without a technical explanation as to "why?") that reducing the TSM server's region to 512M and reducing bufpoolsize to 131072 (i.e., 128MB) works for us. We had previously tried several region settings from 1.75G down to 960M with the same problematic results, until "happening upon" the severely reduced, storage-constrained "settings" with which we are now running (or should I say, limping). This was determined with the help of the Tivoli "performance team" in response to a long string of performance-related PMRs.
Here are some things we have discovered, and which work best for us:

1. A region over 512M causes serious and pervasive performance problems.
2. A BufPoolSize much over 131072 MAY also cause or contribute to similar problems (and definitely doesn't help).
3. CPU utilization is VERY high for any database-intensive process.
4. Database corruption may be the root cause of our severe symptoms. (This is purely conjecture on my part at this point, but it is supported, to some degree, by TSM support statements recommending we fix known DB corruption, which, of course, with dump/reload/audit performance being what it is, is an impossible "hit" to take.)

FYI: We plan to "move out" of the TSM server with database corruption "into" a new, virgin server(s) as soon as time and other factors permit.

Prior to adjusting our "settings" as indicated above, we were experiencing severe, pervasive, and nearly continual performance problems (and CPU over-utilization), server unresponsiveness, what I would call "stress-related" failures of all sorts, and a whole plethora of other, unmentioned "problems".

After making "the adjustments" we have found that, although the TSM server still frequently gets "tangled up in its shorts", the problems are neither as severe nor as frequent or pervasive, and performance is better than when we ran it in the "larger memory footprint". Although it is closer to acceptable, it is still well below the kind of performance I expect from an application running on the platform (i.e., S/390).

We cannot even imagine a reason why these adjustments have helped, but they have. It is totally counter-intuitive to me that reducing the memory footprint would yield these results, but it has.

I would call IBM/Tivoli support, if I were you, and start a diagnostic regimen with them on your particular issues. We were told by them that many OS/390 shops are getting far superior performance, throughput, and (I presume) a much better CPU utilization picture than we experience.
Further, their stated position is that some environmental factor, unique to "us", is the root cause of our performance issues. Aside from our limited bandwidth and database corruption "issues", I cannot think of any other factor that makes us unique among all the other users of the TSM server on OS/390.

You are the first shop I have heard reporting an experience similar to ours. Please feel free to explore this further with me off-line if you wish.

Regards,
Mark Darby
(301) 903-5229

-----Original Message-----
From: Alan Davenport [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 12, 2003 10:46 AM
To: [EMAIL PROTECTED]
Subject: OS390 TSM Performance questions.

Hello,

We're running TSM v5.1.5.4 on an IBM 20660A2 processor running OS390 v10. There is a 100Mbit, single-port OSA card on the processor. We are backing up 197 clients per night. MAXSCHEDSESSIONS is set to allow 116 simultaneous backup sessions. Our backup window begins at 20:00 and ends at 07:30 the next morning.

We are seeing poor performance on our backups during the window. For example, one server that will back up in 6-7 minutes outside the window takes hours to complete during the window. The TSM server has a region size of 1280M and MPTHREADING is set to YES. Self-tuning of buffer size and TXN size is enabled. We are backing up to a 100GB disk buffer to an EMC model 8830 drive array. On average we back up 30-40GB per night, with a peak of 75-80GB.

I know there are much larger shops out there, also running OS390, backing up many more servers. What I would like to know is, at large shops, what is your OSA configuration? Are you running multi-port OSAs and/or gigabit cards? For comparison, I would also like to know how many clients you are backing up per night. Where do you think the bottleneck is? Have you seen similar problems, and what did you do to help alleviate them?

I am fairly confident that TSM is not CPU constrained during the window.
We recently moved TSM to a higher service class with little effect on the problem. Do you feel we are saturating the OSA card? Any thoughts and suggestions would be greatly appreciated.

Take care,
Al

Alan Davenport
Senior Storage Administrator
Selective Insurance Co. of America
[EMAIL PROTECTED]
(973) 948-1306
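
As a rough aid to the "are we saturating the OSA card?" question, the figures quoted in the original message can be turned into a back-of-envelope estimate. This is only a sketch: it assumes the 100 Mbit port could be driven at its full nominal rate (real TCP/IP throughput will be lower), treats GB as decimal gigabytes, and uses the upper end of the reported peak. The function and variable names are illustrative, not anything from TSM.

```python
# Back-of-envelope check of nightly volume against a single 100 Mbit port.
# Input figures come from the original message; running the link at 100% of
# its nominal rate is an ASSUMPTION (an optimistic ceiling, not a measurement).

LINK_MBIT_PER_S = 100      # single-port OSA, nominal line rate
WINDOW_HOURS = 11.5        # backup window, 20:00 to 07:30
PEAK_GB_PER_NIGHT = 80     # upper end of the reported 75-80GB peak
MAX_SESSIONS = 116         # simultaneous backup sessions allowed


def link_capacity_gb(mbit_per_s: float, hours: float) -> float:
    """Theoretical volume (decimal GB) the link can carry in the given time."""
    bytes_per_s = mbit_per_s * 1_000_000 / 8
    return bytes_per_s * hours * 3600 / 1_000_000_000


def per_session_mb_per_s(mbit_per_s: float, sessions: int) -> float:
    """Even share of the link (MB/s) if every session transfers at once."""
    return (mbit_per_s / 8) / sessions


capacity = link_capacity_gb(LINK_MBIT_PER_S, WINDOW_HOURS)
utilization = PEAK_GB_PER_NIGHT / capacity
share = per_session_mb_per_s(LINK_MBIT_PER_S, MAX_SESSIONS)

print(f"Theoretical window capacity: {capacity:.1f} GB")
print(f"Peak night as share of capacity: {utilization:.1%}")
print(f"Per-session share with {MAX_SESSIONS} sessions: {share:.3f} MB/s")
```

Averaged over the whole window, the peak night is a modest fraction of what the port could theoretically carry, so sustained saturation looks unlikely on these numbers alone. The per-session figure tells a different story: if most of the 116 sessions start together, each one gets only a sliver of the link, which would be consistent with a 6-7 minute backup stretching to hours. Neither number rules out other causes.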
