In need of cathartic release, I find myself sharing my
pain with the folks that will understand it best. :)
The past two weeks have really been 'fun'.
Last week we attempted to install an upgraded version of
a rather vital manufacturing application that is used to control
our shop floor.
We backed out the upgrade and re-installed the previous
version ( and requisite database restores ) because the reponse
time on some views was abysmal. This 4 day adventure included
one 19 hour day. Some fun, eh?
We replicate several tables from the app to another database for
real time reporting needs. These snapshots are refreshed every
minute, which has been working quite well for us for 10 months.
( a brief aside: The reason for replicating the data is so that we
can query near real time data without bringing down database
performance in production, as this is the app from hell )
The upgraded version replaces several of those tables that we
replicate with views. So now instead of replicating table XYZ, I
have view XYZ and must replicate 8 tables. The structure of these
views made them unsuitable for fast refresh snapshots.
Here's the WHERE clause from one of the views:
SELECT *
FROM
T1 A, T2 X, T3 ZZ, T4 D,
T5 E, T6 J, T7 K, T8 PP
WHERE A.ProductID = X.ProductID (+)
AND A.SerialNo = X.SerialNo (+)
AND A.PRODUCTID = ZZ.ID
AND A.ProductID = D.ProductID (+)
AND A.SerialNo = D.SerialNo (+)
AND D.OrderContentID = e.ID (+)
AND A.ProductID = J.ProductID (+)
AND A.SerialNo = J.SerialNo (+)
AND J.WipContentRefID = K.ID (+)
The table names have been changed to keep me out of trouble.
Lovely eh?
This was supposedly tested, but I think the testing was done
with 'test' data. You know, 20 rows in each table.
So we make plans to do it again this weekend ( Feb 22-23 )
I am fortunate to have on retainer a very good local DBA. We thought
it might be a good idea to get as many eyeballs as possible looking at
these problems. This guy used to write optimizer code for Oracle, so
he got to tune the views. :)
After his initial shock at seeing the 'code' in these views, he got to work.
After removing outer joins, eliminating unncessary tables and creating a
couple indexes, the views worked pretty well.
There was one still problem though. Trying to create an 'explain plan' on
these views might sometimes take minutes. It seems the optimizer couldn't
make up it's mind as to how best to execute the query. These views didn't
just have 8 way outer joins, there are also inline views in the SELECT
portion of several of them.
Here for your amusement, is the timing portion of a small set of
transactions. < 2 seconds to execute, 5.5 minutes to parse the views.
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS
call count cpu elapsed disk query current
rows
------- ------ -------- ---------- ---------- ---------- ----------
----------
Parse 430 384.49 383.42 0 0 1
0
Execute 268 0.41 0.49 16 1782 151
54
Fetch 216 1.04 1.05 1 14919 420
108
------- ------ -------- ---------- ---------- ---------- ----------
----------
total 914 385.94 384.96 17 16701 572
162
Misses in library cache during parse: 177
Oracle fortunately has a optimizer parameter that helps. We
set optimizer_max_permutations=1000, and the pace picked
up considerably. ( take *that*, SQL Server )
If that were the only problem, I would have happily moved on to the next
crisis, but alas, that was not to be.
Earlier in the week I discovered that there were 1352 unindexed Foreign Key
constraints in the application schema. No, that is not a typo. This is in
addition to the 500 or so FK constraints that *are* indexed.
I mentioned that this might be a problem. Well, it did turn out to be so.
This morning numerous ORA-60 deadlocks were appearing. The deadlock
graph looks like this:
Deadlock graph:
---------Blocker(s)--------
---------Waiter(s)---------
Resource Name process session holds waits process session holds
waits
TM-00000fb4-00000000 14 33 SX SSX 22 240 SX
SSX
TM-00000fb4-00000000 22 240 SX SSX 14 33 SX
SSX
session 33: DID 0001-000E-00000002 session 240: DID 0001-0016-00000002
session 240: DID 0001-0016-00000002 session 33: DID 0001-000E-00000002
Rows waited on:
Session 240: no row
Session 33: no row
===================================================
SSX locks on a TM (table) indicate that there are likely unindexed FK
constraints in the database.
Big surprise. I indexed the few that were causing the immediate problem, and
am now planning to index the remaining 1349.
This isn't a plea for help, just commiseration. :)
Now, on to the next crisis.
Jared
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Jared Still
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).