RE: I/O EVENTS

Cary Millsap Tue, 21 May 2002 20:03:52 -0700

I have an example for you (Anjo, I hope you won't mind). A prospect we
visited once upon a time had been fighting a performance problem with an
Oracle Payroll program. They "knew" what their problem was: very
clearly, v$system_event was telling them that their overwhelmingly
dominant system performance problem was waits for "latch free." (Well,
no it wasn't. It was waits for the Oracle Payroll program. ;) )


Actually, the v$system_event view indicated clearly that the top wait
event was "SQL*Net message from client" (and some other "SQL*Net %"
stuff), but of course every good DBA knows that you have to discard all
the "SQL*Net" events because they are "idle" events.

Three months of trying and failing to fix the problem led finally to a
last-resort, desperation hardware upgrade. Twelve 700MHz CPUs became
twelve new 1,024MHz CPUs. (Don't ask why "waits for latch free" was
translated as "need to upgrade CPUs.") And unfortunately, the Payroll
program got *slower*. Not just "seemed slower"--*was* slower.

After the upgrade failed to help, these guys finally let us come see
them. We traced the Payroll process (10046, level 8), very carefully
initiating the trace immediately after the initiation of the Payroll
run, and very carefully terminating the trace immediately after the
completion of the Payroll run. The key here is that every second of data
in the trace file was a second of somebody's response time.

The result was a really fun story. The thing ran in about 33 minutes.
Here's a summary of what was in the session's trace file:

Oracle Kernel Event                      Duration    Calls        Avg

------------------------------ ------------------ -------- ----------
SQL*Net message from client       984.01s   49.6%   95,161  0.010340s
SQL*Net more data from client     418.82s   21.1%    3,345  0.125208s
db file sequential read           279.54s   14.1%   45,085  0.006200s
CPU service                       248.69s   12.5%  222,760  0.001116s
unaccounted-for                    27.73s    1.4%    
latch free                         23.69s    1.2%   34,695  0.000683s
log file sync                       1.09s    0.1%      506  0.002154s
SQL*Net more data to client         0.83s    0.0%   15,982  0.000052s
log file switch completion          0.28s    0.0%        3  0.093333s
enqueue                             0.25s    0.0%      106  0.002358s
SQL*Net message to client           0.24s    0.0%   95,161  0.000003s
buffer busy waits                   0.22s    0.0%       67  0.003284s
db file scattered read              0.01s    0.0%        2  0.005000s
SQL*Net break/reset to client       0.00s    0.0%        2  0.000000s
------------------------------ ------------------ -------- ----------
Total                           1,985.40s  100.0%  

Well, waits for "latch free" accounted for only 1.2% of the total
response time. If the DBAs had been completely successful in eliminating
"latch free" waits from their system, it would have made only a 1.2%
difference in the runtime of this program. Point 1: <emphasis>This
program's bottleneck is not the same as its system's system-wide average
bottleneck.</emphasis> This happens all the time. You can NOT see this
type of problem when you look only at system-wide average statistics.

You probably didn't pay much heed to the first thing you should have
noticed: slightly more than 70% of this program's runtime was consumed
by so-called "server idle" time. ??! You can't ignore this time, even if
people do call it "idle." Idle or not, this time was part of someone's
response time, so we need to deal with it. Now if we hadn't collected
our statistics so carefully (with proper time scope and proper program
scope), then we would have seen probably much more "SQL*Net message from
client" time in our data. If you make that particular collection error,
then you *have* to disregard the so-called "idle" wait time.

What was the Payroll program's problem? A couple of things. First, we
knew that the Payroll program was running on the same host as Oracle.
But the latency per call on the "SQL*Net message from client" event
looked suspiciously LAN-like (order of 10ms), not IPC-like (order of 1ms
or less). So we checked the tnsnames.ora file. Turns out that to
conserve administration effort, the DBAs had decided to use a single
tnsnames.ora file system-wide. So the Payroll program was using the same
TCP/IP protocol that the system's PC clients were using. Just by
switching the protocol to BEQ instead of TCP in the db server's
tnsnames.ora file (a change that took about 15 minutes to test, 20
seconds to implement, and about a week to navigate through change
control), we eliminated about 50% of the program's runtime. 

Next, what's with the "SQL*Net more data from client"? The developers of
this particular Payroll program did a pretty bad thing in their code:
they passed several really long (10KB) SQL texts from the client to the
server in their parse calls (oh yeh, and with no bind variables either).
Anyway, each time the Oracle kernel received a packet containing only
part of a SQL statement, it tallied to "SQL*Net more data from client"
while it waited for the next piece. Of course, communicating via the
network interface card instead of IPC didn't help, but passing huge SQL
statements from the app to the db is bad form to begin with (see Ixora
for more detail on just how bad this is).

Finally, realize that Oracle tallies to "SQL*Net message from client"
pretty much every time it's "between" working on db call fulfillment.
Therefore, there were on the order of 95,161 db calls during this
33-minute period. Looking at the SQL, we discovered that of course,
since the app developers didn't use bind variables, there was a parse
call for almost every execute. This accounted for thousands of
unnecessary parse calls. Next, each execute call manipulated only zero
or one row, which added up to dozens of execute calls for each single
SQL statement with a multi-row result source. The program could have
performed the business function with a fraction of the db calls.

Oh yes... Why did the Payroll program get *slower* after the upgrade?
Little Payroll program time was spent using CPU, so the upgrade had very
little direct positive effect upon the program. Most of the program's
time was spent waiting for network latency. Other programs ran at the
same time as this Payroll job. The CPU upgrade made *them* faster, which
intensified *their* constant number of network calls into a smaller time
window. The result was increased competition for the network during the
Payroll run. Therefore, every network I/O call the Payroll call made was
a little slower than before the CPU upgrade. The increase in network
response time overwhelmed the tiny direct improvement of the CPU time
reduction, resulting in a net degradation of Payroll performance. ...Not
a good thing, because this Payroll program had a higher business
priority than everything else on the system at the same time.

The End.

* * *

So, some wrap-up:

1. Looking at v$session_event is better than looking at v$system_event
for these things. When you collect system-wide data to solve a single
program's performance problem, your irrelevant data can bury your
relevant data. We see this all the time. Any time you query
v$system_event and ask about the top-line events you see, your
vulnerable to this problem.

2. Furthermore, looking even at v$session_event tells you NOTHING about
the user's experience. When a v$ query tells you that there's "SQL*Net
message from client" time, you can't know without more data whether this
stuff showed up in someone's response time, or if it corresponded to
time when the USER was actually idle.

3. Another important reason that looking at v$session_event data can
mislead you is that there's no record of CPU time in there. To find
that, you have to go to something like v$sesstat. The "CPU used by this
session" statistic is supposed to give you the CPU time you need, but
various bugs prevent you from getting what you need in many cases. 

4. Response time = CPU from sesstat + waits from session_event + an
unaccounted-for gap in time. Lots of people believe that the "gap" is a
deficiency of 10046 data: it's not. The irony is that the same
deficiency is even worse in v$ table methods; you just can't tell that
there even *is* a gap: you can't compute gap=R-(cpu+waits), because with
v$-based methods you don't know what R is. This "gap" is one of those
things that we've discovered how to interpret reliably after some work.
For more information, see some of our Hotsos Profiler documentation at
www.hotsos.com/dnloads/1/kevents and click "unaccounted-for" (the
document is free, but you have to register). (Bear with us, most of the
event documentation on our kevents page is pre-beta, but the
"unaccounted-for" document is pretty much complete.) 

5. How Precise gets around all those problems is by polling Oracle
shared memory (the things underneath the v$ tables) with extraordinary
frequency in a "lightweight way" (directly from memory). How Hotsos gets
around them is by using 10046 level-8 trace data. But to use 10046, you
must be able to predict that a performance problem will occur in the
near future (for example, because it's repeatable). We teach both data
collection approaches in our Clinic, but we have relied upon 10046-based
analysis exclusively (using our own Hotsos Profiler tool) since 2000.

6. No matter how you collect the data, as long as you do it correctly,
the response time-based optimization method works with extraordinary
efficiency. Since about 1995, Anjo (whom I regard to be the Father of
the response-time based Oracle optimization method) has been able within
a few minutes to find and fix performance problems that have evaded
detection for months or even years. It's rare anymore for any of us to
spend more than about an hour diagnosing and repairing a performance
problem once we have the right data.

...But the hit ratio stuff lives on, I think for two reasons: (1) Not
enough people understand the response-time based diagnostic methods yet
(but we're working on that!); and (2) Ratio-based techniques, when
combined with time & materials-based billing contracts, are extremely
effective at maximizing consulting revenue.

 
Cary Millsap
Hotsos Enterprises, Ltd.
[EMAIL PROTECTED]
http://www.hotsos.com

Next Hotsos Clinic events
June 11-13 - Los Angeles
July  9-11 - New York


-----Original Message-----
Sent: Tuesday, May 21, 2002 2:13 AM
To: Multiple recipients of list ORACLE-L

> So just looking at v$system_event is dangerous.
> Looking v$system_event and v$sysstat
> is much better but still not perfect. The third
> way is ....... (mail me ;-))

Anjo,

Jared has an even hand on the tiller.  Go ahead an post an informative
example of how the v$ views don't allow you to diagnose a specific
problem,
and how Precise products do.  We wanna know, and I don't imagine he'll
mind.

These little hints, and half sentences that end with an ellipsis, are
getting on my nerves!  ;-)

- Greg

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.com
-- 
Author: Greg Moore
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Cary Millsap
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- (858) 538-5051  FAX: (858) 538-5051
San Diego, California        -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

RE: I/O EVENTS

Reply via email to