Hi:

Thanks for your help and suggestions, Mark.

I have done some of the things you suggested already. Please see my text
below.

On Tue, 10 Jun 2003, Mark Richard wrote:

>
> This is an interesting (and relatively complex) query with what I think are
> several opportunities to tune it.  I'd probably spend some time looking at
> the following to see if they might help you out:
>
> 1)  Look at the sub-select with the connect by clause...  Try executing
> that query on it's own and get an idea of it's execution time and the
> number of rows returned for different bind variables.  Depending on the
> number of distinct values of arctype there may be some scope to optimise
> this component.  Possibly create a table containing on the arctype 2999999
> and 3000000 records and then remove this clause from the query - this could
> avoid accessing the table at all.  I have no idea if creating such a table
> is practical for your scenario though.

I just looked:

[EMAIL PROTECTED]> select count(*) from arc where arctype in (2999999,3000000);

  COUNT(*)
----------
     56932

This is about 27% of the total rows, so I will test to move them into a
new table tomorrow and this should help. I did test each part separatley
and timed them and I found that the sub-query is probably the bottle-neck because
"start ... connect by ..." requires walk the whole index to get all possible nodes
(expensive). I can create this new table.


> 2)  Consider a concatenated index (perhaps termid, parenttermid or
> parenttermid,termid - too early for my brain to remember without trying)
>

I don't know why concatenated index would help here, for which part in
where clause it would?

> 3)  Are the distinct and order by clausing really needed.  Often a distinct
> is included to hide a fault in the query (like a missing join or criteria)
> - distinct can be very expensive at times but since your query runs fairly
> fast you probably aren't removing many rows.  How many rows does the query
> return with versus without the distinct clause?
>

distinct is needed because otherwise it would return duplicates. It's more
like 5 rows with distinct and 12 without (typically).

> 4)  In the order by clause is "mt.blast.pvaltonumber(pval)"  This looks
> like a function call - if you have a way to avoid this function call you
> may see a performance increase.  You could test this by creating a table
> which stores the calculated result already and modify the query (remember
> to index and analyze the same as the original table).  Does this help?  Is
> it practical to store the result?  Again though, the benefit will be
> determined by the number of rows being ordered and the amount of query time
> spent doing this - for large data sets a function call is murder though.
>

Yes, it is a function call (from a package), it basically take care of the
situation that "pval" (for example) can be "2e-56" or "e-37", I need a way
of order them. I tried with  "decode" but found it did not improve the
performance.  I even tried to remove "order by" and the performance is
basically the same. This is probably because the number of rows returned
is small.

> 5)  Finally, I just realised at the last minute that DUALBLASTRESULTS
> appears to be a view.  Try bypassing the view and going straight to the
> base tables with the most restrictive criteria you have.  Sometimes Oracle
> doesn't handle views really well within queries.  I've seen improvements
> where the entire logic of the view was moved within the query - it
> shouldn't have changed anything from a theoretical point of view but it
> did.
>

I did try to replace the view with the base tables and the performance are the same. I
have a bounch of other queries using this view and I always get the same performance 
when
I replace it with base tables. So I think there probably isn't much I can do here.

Again, thanks for your help.

Guang

> Hopefully this gives you some options to look at.
>
> Regards,
>       Mark.
>
>
>
>
>                       "gmei"
>                       <[EMAIL PROTECTED]>        To:       Multiple recipients of 
> list ORACLE-L <[EMAIL PROTECTED]>
>                       Sent by:                 cc:
>                       [EMAIL PROTECTED]        Subject:  sql query optimization
>                       .com
>
>
>                       11/06/2003 07:59
>                       Please respond to
>                       ORACLE-L
>
>
>
>
>
>
> Hi:
>
> I have been trying for two days to see if I could optimize this query
> without much success. One of the programs here calls this query many many
> times and I hope I could make it run faster. It typically take about 1 sec
> to get the result. I have tried using "exists" to replace "in" and the
> result is not good. All the columns involved in the "where" clause have
> been
> indexed. b1 and b2 are bind variables that are passed in.
>
> ----
>
> select distinct observationlist.geneid, pval, score,
>              Decode(evidenceCode, 3000900, 'E', 3000902, 'P', 3000906),
>              proteomeRefID, Decode(ReferenceType, 'I', 'Y', 'N'), reftarget
> from         mt.dualblastresults, mt.seqtable querySeq,
> isi.observationlist,
> isi.termobs
> where        subjID = :b1
> and   queryID = QuerySeq.AASeqID
> and          querySeq.use='Y'
> and   querySeq.geneID=observationlist.geneid
> and          curationStatus='E'
> and   evidenceCode in (3000900,3000902,3000906)
> and          observationlist.id=obsID
> and   target='GeneID'
> and          termobs.termid in (select termid from isi.arc
>                          where arctype in (2999999,3000000)
>                          start with termid = :b2
>                          connect by prior termid=parenttermid)
> order by mt.blast.pvaltonumber(pval) asc, score desc, geneid,
>          decode(proteomerefid, null, 0, 1) desc;
>
> --
>
> This query typically returns 10 or less rows. mt.dualblastresults is a
> view,
> all others are tables. BTW, I need "distinct" and "order by" in the query.
>
> Here is the explain plan and row counts in tables and their definition.
> Anyone has any suggestions to make it run faster?
>
> TIA.
>
> Guang
>
>
> Execution Plan
> ----------------------------------------------------------
>    0      SELECT STATEMENT Optimizer=CHOOSE (Cost=715 Card=1 Bytes=124
>           )
>
>    1    0   SORT (ORDER BY) (Cost=715 Card=1 Bytes=124)
>    2    1     SORT (UNIQUE) (Cost=662 Card=1 Bytes=124)
>    3    2       NESTED LOOPS (Cost=609 Card=1 Bytes=124)
>    4    3         NESTED LOOPS (Cost=553 Card=1 Bytes=118)
>    5    4           NESTED LOOPS (Cost=550 Card=1 Bytes=106)
>    6    5             NESTED LOOPS (Cost=280 Card=30 Bytes=1830)
>    7    6               VIEW OF 'DUALBLASTRESULTS' (Cost=112 Card=168
>           Bytes=8232)
>
>    8    7                 UNION-ALL
>    9    8                   TABLE ACCESS (BY INDEX ROWID) OF 'BLASTRES
>           ULTS' (Cost=102 Card=118 Bytes=2360)
>
>   10    9                     INDEX (RANGE SCAN) OF 'BLASTRESULTS_SUBJ
>           ID_INDEX' (NON-UNIQUE) (Cost=3 Card=118)
>
>   11    8                   TABLE ACCESS (BY INDEX ROWID) OF 'BLASTRES
>           ULTS' (Cost=10 Card=50 Bytes=1000)
>
>   12   11                     INDEX (RANGE SCAN) OF 'BLASTRESULTS_QUER
>           YID_INDEX' (NON-UNIQUE) (Cost=3 Card=50)
>
>   13    6               TABLE ACCESS (BY INDEX ROWID) OF 'SEQTABLE' (C
>           ost=1 Card=57344 Bytes=688128)
>
>   14   13                 INDEX (UNIQUE SCAN) OF 'ST_ASI_UN' (UNIQUE)
>   15    5             TABLE ACCESS (BY INDEX ROWID) OF 'OBSERVATIONLIS
>           T' (Cost=9 Card=499 Bytes=22455)
>
>   16   15               INDEX (RANGE SCAN) OF 'OBSERVATIONLISTGENEID'
>           (NON-UNIQUE) (Cost=2 Card=499)
>
>   17    4           TABLE ACCESS (BY INDEX ROWID) OF 'TERMOBS' (Cost=3
>            Card=2388115 Bytes=28657380)
>
>   18   17             INDEX (RANGE SCAN) OF 'TERMOBSIDINDEX' (NON-UNIQ
>           UE) (Cost=2 Card=2388115)
>
>   19    3         VIEW OF 'VW_NSO_1' (Cost=56 Card=7 Bytes=42)
>   20   19           SORT (UNIQUE) (Cost=56 Card=7 Bytes=126)
>   21   20             FILTER
>   22   21               CONNECT BY
>   23   22                 INDEX (RANGE SCAN) OF 'ARC_TERMID' (NON-UNIQ
>           UE) (Cost=1 Card=2 Bytes=12)
>
>   24   22                 TABLE ACCESS (BY USER ROWID) OF 'ARC'
>   25   22                 INDEX (RANGE SCAN) OF 'ARC_TYPETERMPARENT' (
>           UNIQUE) (Cost=3 Card=8 Bytes=144)
>
>
>
> SQL> select count(*) from mt.dualblastresults;
>
>   COUNT(*)
> ----------
>   22332188
>
> SQL> select count(*) from mt.seqtable ;
>
>   COUNT(*)
> ----------
>     373505
>
> SQL> select count(*) from isi.observationlist;
>
>   COUNT(*)
> ----------
>    2290858
>
> SQL> select count(*) from isi.termobs;
>
>   COUNT(*)
> ----------
>    2388115
>
> SQL> select count(*) from isi.arc;
>
>   COUNT(*)
> ----------
>     207375
>
> SQL> desc mt.dualblastresults
>  Name                                      Null?    Type
>  ----------------------------------------- -------- -------------------
>  ID                                                 NUMBER
>  QUERYID                                            NUMBER
>  SUBJID                                             NUMBER
>  MATCHLEN                                           NUMBER
>  IDENTITY                                           NUMBER
>  POSITIVE                                           NUMBER
>  GAP                                                NUMBER
>  PVAL                                               VARCHAR2(16)
>  SCORE                                              NUMBER
>  QUERYSTART                                         NUMBER
>  QUERYEND                                           NUMBER
>  SUBJSTART                                          NUMBER
>  SUBJEND                                            NUMBER
>  CCOMMENT                                           VARCHAR2(300)
>  BLASTDATE                                          DATE
>  QFRAME                                             NUMBER
>  SFRAME                                             NUMBER
>  QUERYSPID                                          NUMBER
>  SUBJSPID                                           NUMBER
>
> SQL> desc mt.seqtable ;
>  Name                                      Null?    Type
>  ----------------------------------------- -------- --------------------
>  ID                                        NOT NULL NUMBER
>  AASEQID                                            NUMBER
>  DNASEQID                                           NUMBER
>  GENEID                                    NOT NULL NUMBER
>  USE                                                CHAR(1)
>  ALTSPLICE                                          VARCHAR2(128)
>  MUTANT                                             VARCHAR2(128)
>  STRAIN                                             VARCHAR2(128)
>  CDSSTRING                                          VARCHAR2(2000)
>  VALID                                              CHAR(1)
>  GENOMEPROJ                                         CHAR(1)
>  CDNA                                               CHAR(1)
>  PARTIALNTERM                                       CHAR(1)
>  PARTIALCTERM                                       CHAR(1)
>  TRANSID                                            NUMBER
>  CCOMMENT                                           VARCHAR2(300)
>  EST                                                CHAR(1)
>  CLASS                                              VARCHAR2(128)
>  SEQDATE                                            DATE
>  CURID                                              NUMBER
>
> SQL> desc isi.observationlist;
>  Name                                      Null?    Type
>  ----------------------------------------- -------- --------------------
>  ID                                        NOT NULL NUMBER
>  GENEID                                             NUMBER
>  CURATIONTYPE                                       NUMBER
>  PROTEOMEREFID                                      NUMBER
>  SOURCEID                                           NUMBER
>  SOURCETABLE                                        VARCHAR2(25)
>  DESTID                                             NUMBER
>  DESTTABLE                                          VARCHAR2(25)
>  DESTDATE                                           DATE
>  REFERENCETYPE                                      VARCHAR2(1)
>  EVIDENCECODE                                       NUMBER
>  CURATORID                                          NUMBER
>  EDITORID                                           NUMBER
>  UPDATESTAMP                                        DATE
>  CURATIONSTATUS                                     VARCHAR2(1)
>  ORIGINALSTAMP                                      DATE
>  NEXTOBS                                            NUMBER
>  TARGET                                             VARCHAR2(15)
>  REFTARGET                                          VARCHAR2(15)
>  TOOL                                               VARCHAR2(25)
>  OLDGENEID                                          NUMBER
>
> SQL> desc isi.termobs;
>  Name                                      Null?    Type
>  ----------------------------------------- -------- -----------
>  OBSID                                              NUMBER
>  TERMID                                             NUMBER
>  CONTEXT                                            NUMBER
>
> SQL> desc isi.arc
>  Name                                      Null?    Type
>  ----------------------------------------- -------- ---------------
>  TERMID                                    NOT NULL NUMBER
>  PARENTTERMID                              NOT NULL NUMBER
>  OBSID                                              NUMBER
>  ARCTYPE                                   NOT NULL NUMBER
>  ARCID                                              NUMBER
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: gmei
>   INET: [EMAIL PROTECTED]
>
> Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
> San Diego, California        -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from).  You may
> also send the HELP command for other information (like subscribing).
>
>
>
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>    Privileged/Confidential information may be contained in this message.
>           If you are not the addressee indicated in this message
>        (or responsible for delivery of the message to such person),
>             you may not copy or deliver this message to anyone.
> In such case, you should destroy this message and kindly notify the sender
>            by reply e-mail or by telephone on (61 3) 9612-6999.
>    Please advise immediately if you or your employer does not consent to
>                 Internet e-mail for messages of this kind.
>         Opinions, conclusions and other information in this message
>               that do not relate to the official business of
>                          Transurban City Link Ltd
>          shall be understood as neither given nor endorsed by it.
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<---->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>
>
> --
> Please see the official ORACLE-L FAQ: http://www.orafaq.net
> --
> Author: Mark Richard
>   INET: [EMAIL PROTECTED]
>
> Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
> San Diego, California        -- Mailing list and web hosting services
> ---------------------------------------------------------------------
> To REMOVE yourself from this mailing list, send an E-Mail message
> to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
> the message BODY, include a line containing: UNSUB ORACLE-L
> (or the name of mailing list you want to be removed from).  You may
> also send the HELP command for other information (like subscribing).
>

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Guang Mei
  INET: [EMAIL PROTECTED]

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: [EMAIL PROTECTED] (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).

Reply via email to