This is an interesting (and relatively complex) query with what I think are
several opportunities to tune it. I'd probably spend some time looking at
the following to see if they might help you out:
1) Look at the sub-select with the connect by clause... Try executing
that query on it's own and get an idea of it's execution time and the
number of rows returned for different bind variables. Depending on the
number of distinct values of arctype there may be some scope to optimise
this component. Possibly create a table containing on the arctype 2999999
and 3000000 records and then remove this clause from the query - this could
avoid accessing the table at all. I have no idea if creating such a table
is practical for your scenario though.
2) Consider a concatenated index (perhaps termid, parenttermid or
parenttermid,termid - too early for my brain to remember without trying)
3) Are the distinct and order by clausing really needed. Often a distinct
is included to hide a fault in the query (like a missing join or criteria)
- distinct can be very expensive at times but since your query runs fairly
fast you probably aren't removing many rows. How many rows does the query
return with versus without the distinct clause?
4) In the order by clause is "mt.blast.pvaltonumber(pval)" This looks
like a function call - if you have a way to avoid this function call you
may see a performance increase. You could test this by creating a table
which stores the calculated result already and modify the query (remember
to index and analyze the same as the original table). Does this help? Is
it practical to store the result? Again though, the benefit will be
determined by the number of rows being ordered and the amount of query time
spent doing this - for large data sets a function call is murder though.
5) Finally, I just realised at the last minute that DUALBLASTRESULTS
appears to be a view. Try bypassing the view and going straight to the
base tables with the most restrictive criteria you have. Sometimes Oracle
doesn't handle views really well within queries. I've seen improvements
where the entire logic of the view was moved within the query - it
shouldn't have changed anything from a theoretical point of view but it
did.
Hopefully this gives you some options to look at.
Regards,
Mark.
"gmei"
<[EMAIL PROTECTED]> To: Multiple recipients of list
ORACLE-L <[EMAIL PROTECTED]>
Sent by: cc:
[EMAIL PROTECTED] Subject: sql query optimization
.com
11/06/2003 07:59
Please respond to
ORACLE-L
Hi:
I have been trying for two days to see if I could optimize this query
without much success. One of the programs here calls this query many many
times and I hope I could make it run faster. It typically take about 1 sec
to get the result. I have tried using "exists" to replace "in" and the
result is not good. All the columns involved in the "where" clause have
been
indexed. b1 and b2 are bind variables that are passed in.
----
select distinct observationlist.geneid, pval, score,
Decode(evidenceCode, 3000900, 'E', 3000902, 'P', 3000906),
proteomeRefID, Decode(ReferenceType, 'I', 'Y', 'N'), reftarget
from mt.dualblastresults, mt.seqtable querySeq,
isi.observationlist,
isi.termobs
where subjID = :b1
and queryID = QuerySeq.AASeqID
and querySeq.use='Y'
and querySeq.geneID=observationlist.geneid
and curationStatus='E'
and evidenceCode in (3000900,3000902,3000906)
and observationlist.id=obsID
and target='GeneID'
and termobs.termid in (select termid from isi.arc
where arctype in (2999999,3000000)
start with termid = :b2
connect by prior termid=parenttermid)
order by mt.blast.pvaltonumber(pval) asc, score desc, geneid,
decode(proteomerefid, null, 0, 1) desc;
--
This query typically returns 10 or less rows. mt.dualblastresults is a
view,
all others are tables. BTW, I need "distinct" and "order by" in the query.
Here is the explain plan and row counts in tables and their definition.
Anyone has any suggestions to make it run faster?
TIA.
Guang
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=715 Card=1 Bytes=124
)
1 0 SORT (ORDER BY) (Cost=715 Card=1 Bytes=124)
2 1 SORT (UNIQUE) (Cost=662 Card=1 Bytes=124)
3 2 NESTED LOOPS (Cost=609 Card=1 Bytes=124)
4 3 NESTED LOOPS (Cost=553 Card=1 Bytes=118)
5 4 NESTED LOOPS (Cost=550 Card=1 Bytes=106)
6 5 NESTED LOOPS (Cost=280 Card=30 Bytes=1830)
7 6 VIEW OF 'DUALBLASTRESULTS' (Cost=112 Card=168
Bytes=8232)
8 7 UNION-ALL
9 8 TABLE ACCESS (BY INDEX ROWID) OF 'BLASTRES
ULTS' (Cost=102 Card=118 Bytes=2360)
10 9 INDEX (RANGE SCAN) OF 'BLASTRESULTS_SUBJ
ID_INDEX' (NON-UNIQUE) (Cost=3 Card=118)
11 8 TABLE ACCESS (BY INDEX ROWID) OF 'BLASTRES
ULTS' (Cost=10 Card=50 Bytes=1000)
12 11 INDEX (RANGE SCAN) OF 'BLASTRESULTS_QUER
YID_INDEX' (NON-UNIQUE) (Cost=3 Card=50)
13 6 TABLE ACCESS (BY INDEX ROWID) OF 'SEQTABLE' (C
ost=1 Card=57344 Bytes=688128)
14 13 INDEX (UNIQUE SCAN) OF 'ST_ASI_UN' (UNIQUE)
15 5 TABLE ACCESS (BY INDEX ROWID) OF 'OBSERVATIONLIS
T' (Cost=9 Card=499 Bytes=22455)
16 15 INDEX (RANGE SCAN) OF 'OBSERVATIONLISTGENEID'
(NON-UNIQUE) (Cost=2 Card=499)
17 4 TABLE ACCESS (BY INDEX ROWID) OF 'TERMOBS' (Cost=3
Card=2388115 Bytes=28657380)
18 17 INDEX (RANGE SCAN) OF 'TERMOBSIDINDEX' (NON-UNIQ
UE) (Cost=2 Card=2388115)
19 3 VIEW OF 'VW_NSO_1' (Cost=56 Card=7 Bytes=42)
20 19 SORT (UNIQUE) (Cost=56 Card=7 Bytes=126)
21 20 FILTER
22 21 CONNECT BY
23 22 INDEX (RANGE SCAN) OF 'ARC_TERMID' (NON-UNIQ
UE) (Cost=1 Card=2 Bytes=12)
24 22 TABLE ACCESS (BY USER ROWID) OF 'ARC'
25 22 INDEX (RANGE SCAN) OF 'ARC_TYPETERMPARENT' (
UNIQUE) (Cost=3 Card=8 Bytes=144)
SQL> select count(*) from mt.dualblastresults;
COUNT(*)
----------
22332188
SQL> select count(*) from mt.seqtable ;
COUNT(*)
----------
373505
SQL> select count(*) from isi.observationlist;
COUNT(*)
----------
2290858
SQL> select count(*) from isi.termobs;
COUNT(*)
----------
2388115
SQL> select count(*) from isi.arc;
COUNT(*)
----------
207375
SQL> desc mt.dualblastresults
Name Null? Type
----------------------------------------- -------- -------------------
ID NUMBER
QUERYID NUMBER
SUBJID NUMBER
MATCHLEN NUMBER
IDENTITY NUMBER
POSITIVE NUMBER
GAP NUMBER
PVAL VARCHAR2(16)
SCORE NUMBER
QUERYSTART NUMBER
QUERYEND NUMBER
SUBJSTART NUMBER
SUBJEND NUMBER
CCOMMENT VARCHAR2(300)
BLASTDATE DATE
QFRAME NUMBER
SFRAME NUMBER
QUERYSPID NUMBER
SUBJSPID NUMBER
SQL> desc mt.seqtable ;
Name Null? Type
----------------------------------------- -------- --------------------
ID NOT NULL NUMBER
AASEQID NUMBER
DNASEQID NUMBER
GENEID NOT NULL NUMBER
USE CHAR(1)
ALTSPLICE VARCHAR2(128)
MUTANT VARCHAR2(128)
STRAIN VARCHAR2(128)
CDSSTRING VARCHAR2(2000)
VALID CHAR(1)
GENOMEPROJ CHAR(1)
CDNA CHAR(1)
PARTIALNTERM CHAR(1)
PARTIALCTERM CHAR(1)
TRANSID NUMBER
CCOMMENT VARCHAR2(300)
EST CHAR(1)
CLASS VARCHAR2(128)
SEQDATE DATE
CURID NUMBER
SQL> desc isi.observationlist;
Name Null? Type
----------------------------------------- -------- --------------------
ID NOT NULL NUMBER
GENEID NUMBER
CURATIONTYPE NUMBER
PROTEOMEREFID NUMBER
SOURCEID NUMBER
SOURCETABLE VARCHAR2(25)
DESTID NUMBER
DESTTABLE VARCHAR2(25)
DESTDATE DATE
REFERENCETYPE VARCHAR2(1)
EVIDENCECODE NUMBER
CURATORID NUMBER
EDITORID NUMBER
UPDATESTAMP DATE
CURATIONSTATUS VARCHAR2(1)
ORIGINALSTAMP DATE
NEXTOBS NUMBER
TARGET VARCHAR2(15)
REFTARGET VARCHAR2(15)
TOOL VARCHAR2(25)
OLDGENEID NUMBER
SQL> desc isi.termobs;
Name Null? Type
----------------------------------------- -------- -----------
OBSID NUMBER
TERMID NUMBER
CONTEXT NUMBER
SQL> desc isi.arc
Name Null? Type
----------------------------------------- -------- ---------------
TERMID NOT NULL NUMBER
PARENTTERMID NOT NULL NUMBER
OBSID NUMBER
ARCTYPE NOT NULL NUMBER
ARCID NUMBER
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: gmei
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Privileged/Confidential information may be contained in this message.
If you are not the addressee indicated in this message
(or responsible for delivery of the message to such person),
you may not copy or deliver this message to anyone.
In such case, you should destroy this message and kindly notify the sender
by reply e-mail or by telephone on (61 3) 9612-6999.
Please advise immediately if you or your employer does not consent to
Internet e-mail for messages of this kind.
Opinions, conclusions and other information in this message
that do not relate to the official business of
Transurban City Link Ltd
shall be understood as neither given nor endorsed by it.
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
--
Please see the official ORACLE-L FAQ: http://www.orafaq.net
--
Author: Mark Richard
INET: [EMAIL PROTECTED]
Fat City Network Services -- 858-538-5051 http://www.fatcity.com
San Diego, California -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).