As many others have said, running 100% busy is not terribly unusual. There are 
efficiency issues that may come into play, but that's a deeper discussion 
than we probably need to get into here. (You will likely find that individual 
workloads consume somewhat less CPU when running at, say, 80% busy vs. 100% 
busy due to CPU cache effects.)

Don't IPL just because the CPU utilization is high. Look for what is driving 
the utilization (RMF or another performance monitor can help you here), and 
consider whether that thing is doing useful work or not. If not, maybe you want 
to get rid of or cancel that particular workload. It is of course entirely 
possible that something looping, or otherwise consuming more CPU than it 
should, will increase the CPU contention and slow down lower importance 
workloads. If the looping (or just busy, perhaps doing useful work) workload 
is of higher importance than the batch work, that slowdown is what you would 
expect.

I see there is a fear that this "may create impact during peak season". To 
determine this you really need to take a look at all the workloads on the 
system, make sure that they all have appropriate importance levels set in WLM, 
and make sure that the WLM goals are reasonable overall. Pay special attention 
to the workloads that people are worried may be impacted. If everything is 
designated appropriately, then you can move into the phase of performance 
analysis and tuning and capacity planning.

If there are workloads that are not meeting your goals, determine where their 
bottlenecks are. If, for example, you have a batch job that spends most of its 
time waiting on I/O, a CPU discussion is mostly moot. (The RMF III DELAY panel 
can be useful here.) If the job is spending most of its time waiting on CPU, 
then look at both how the job is using CPU (a job that uses less CPU has less 
opportunity to wait, although even tiny CPU consumers can wait 100%) and at 
all the equal and higher importance workloads that are consuming CPU. If you 
can tune any of those (or possibly reschedule them) such that they use less CPU 
at the problem time, then that will help the CPU contention issues. (Beware RMF 
III's "primary reason", which may be technically accurate, but may not be 
useful relative to which work is unexpectedly using more than it should.)
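
To make that triage concrete, here is a rough sketch in Python. It is not an 
RMF interface: the Workload fields, the job names, and every percentage are 
made-up illustrations of numbers you might copy by hand from a monitor. The 
only z/OS fact it relies on is that WLM importance 1 is the highest and 5 the 
lowest.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    importance: int       # WLM importance: 1 is highest, 5 is lowest
    cpu_using_pct: float  # hypothetical: % of samples found using CPU
    cpu_delay_pct: float  # hypothetical: % of samples delayed for CPU
    io_delay_pct: float   # hypothetical: % of samples delayed for I/O


def dominant_delay(w):
    """Return 'io' or 'cpu', whichever delay accounts for more samples."""
    return "io" if w.io_delay_pct >= w.cpu_delay_pct else "cpu"


def cpu_competitors(target, workloads):
    """Equal or higher importance work that is using CPU, biggest user first."""
    rivals = [
        w for w in workloads
        if w is not target
        and w.importance <= target.importance  # lower number = more important
        and w.cpu_using_pct > 0
    ]
    return sorted(rivals, key=lambda w: w.cpu_using_pct, reverse=True)


# Invented sample data, purely for illustration.
workloads = [
    Workload("ONLINE", 1, 60.0, 5.0, 2.0),
    Workload("LOOPER", 2, 35.0, 10.0, 0.0),
    Workload("BATCHA", 4, 3.0, 80.0, 5.0),
    Workload("BATCHB", 4, 1.0, 4.0, 90.0),
    Workload("LOWJOB", 5, 1.0, 90.0, 0.0),
]

for job in workloads:
    if job.name.startswith("BATCH"):
        kind = dominant_delay(job)
        if kind == "io":
            print(f"{job.name}: mostly waiting on I/O, CPU is not the issue")
        else:
            names = [w.name for w in cpu_competitors(job, workloads)]
            print(f"{job.name}: mostly waiting on CPU, look at {names}")
```

The point of sorting by CPU use rather than by who is "delaying" the job is 
the same as the caveat above: the list shows who is actually consuming the 
CPU at equal or higher importance, which is where tuning or rescheduling 
would pay off.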

If you find you're short of resources (such as CPU), and you can't reduce the 
CPU consumption of anything, and your important business goals are not being 
met, then you may find that you truly need to add capacity. But it should be a 
long way from "I have a few batch jobs that seem to run long" to "I need to 
add capacity".

That's how I would go about things. 

Scott Chapman
