AW: AW: Query optimization problem

Becker, Holger Thu, 03 Feb 2005 09:25:51 -0800

Euke Castellano wrote:

> Sent: Thursday, 3 February 2005 16:00
> To: [email protected]
> Cc: Becker, Holger
> Subject: Re: AW: Query optimization problem
> 
> 
> Becker, Holger wrote:
> 
> >Euke Castellano wrote:
> >  
> >
> >>>I have the following table definition:
> >>>
> >>>CREATE TABLE "CONSUM"(
> >>>   "ID" Integer NOT NULL,
> >>>   "HOTEL" Integer NOT NULL,
> >>>   "SERVICE" Integer NOT NULL,
> >>>   "SEGMENT" Integer NOT NULL,
> >>>   "GUEST" Integer,
> >>>   "COMPANY" Fixed (38,0),
> >>>   "AGENCY" Fixed (38,0),
> >>>   "REPRESENTATIVE" Fixed (38,0),
> >>>   "INVOICEDATE" Date,
> >>>   "CHARGEDATE" Date,
> >>>   "QUANTITY" Fixed (12,2) DEFAULT 0.00,
> >>>   "AMOUNT" Fixed (12,2) DEFAULT 0.00,
> >>>   "ROOMNIGHTS" Fixed (12,2) DEFAULT 0.00,
> >>>   "RATE" Integer,
> >>>   "SEASON" Char (1) ASCII,
> >>>   "PROCESSDATE" Date,
> >>>   PRIMARY KEY ("ID")
> >>>)
> >>>
> >>>I also have an UNIQUE INDEX created on column INVOICEDATE.
> >>>The table has ~25.000.000 rows.
> >>>
> >>>When I try a query like:
> >>>   SELECT * FROM consum WHERE invoiceDate between '2004-01-01' AND '2004-12-31'
> >>>it takes only a few seconds to show the information using SQLStudio, but
> >>>if I try this other query:
> >>>   SELECT * FROM consum WHERE invoiceDate between '2004-01-01' AND '2004-12-31' AND rate=9
> >>>it takes several hours!! to process the information.
> >>>
> >>>What can I do to optimize this kind of query? Should I create an
> >>>index for each column that I need to query? Or is it better to create
> >>>an index that includes all the columns of the query?
> >>>
> >>>
> >>>Thank you very much for your help and sorry for my English.
> >>>
> >>>
> >>>
> >>>      
> >>>
> >>The index is not UNIQUE.
> >>Sorry and thanks.
> >>    
> >>
> >
> >Hi,
> >
> >did you see the whole result in SQL Studio for the first select,
> >or is it possible that you only looked at the first n rows?
> >
> >SQL Studio only fetches those rows which are requested by the user
> >when he scrolls through the result set.
> >
> >So I suppose that you only get the first rows very fast because
> >many or all rows are in the range you asked for, and it would take
> >much longer to scroll through the whole result.
> >
> >In your second query you look for rows that have rate=9,
> >and if only a few rows fulfil this condition it takes much longer
> >to find the first n rows.
> >
> >You could speed up your query with a multi-column index over invoiceDate
> >and rate: "create index i2 on consum (invoiceDate,rate)"
> >
> >Kind regards
> >Holger
> >
> >
> >
> >
> >  
> >
> Thanks Holger for your answer.
> 
> Your assessment is right. I perfectly understand your explanation,
> but I consider the performance of this statement not good enough for
> my application. I wrote a simple program via JDBC in order to test this
> statement:
> 
> ********* BEGIN JAVA
> {.......}
> public class TestConsum {
> 
> {.......}
>     public static void main(String[] args) {.......}{
>         Connection cn = DriverManager.getConnection(url, user, passwd);
>         String sel =
>             "SELECT id,hotel,service,segment,rate,invoiceDate FROM consum " +
>             " WHERE invoiceDate BETWEEN '2004-01-01' AND '2004-01-31' AND rate=23";
>         PreparedStatement stmt = cn.prepareStatement(sel,
>                 ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
>         stmt.setFetchDirection(ResultSet.FETCH_FORWARD);
>         System.out.println( "BEGIN ..: " + new Date());
>         ResultSet rs = stmt.executeQuery();
>         for (int rowCounter=0;rs.next();rowCounter++){
>             if (rowCounter % 5000 == 0) {
>                 System.out.println( rowCounter + " --> " + new Date());
>             }
>         }
>         System.out.println( "END ..: " + new Date());
>         rs.close();
>         stmt.close();
>         cn.close();
>     }
> }
> ********* END JAVA
> 
> As you see, no extra operations are made. The output:
> 
> ********** BEGIN TRACE
> BEGIN  ..: Wed Feb 02 18:18:24 CET 2005
> 0      --> Wed Feb 02 18:19:01 CET 2005
> 5000   --> Wed Feb 02 18:21:43 CET 2005
> 10000  --> Wed Feb 02 18:24:02 CET 2005
> 15000  --> Wed Feb 02 18:26:06 CET 2005
> 20000  --> Wed Feb 02 18:27:54 CET 2005
> 25000  --> Wed Feb 02 18:28:51 CET 2005
> 30000  --> Wed Feb 02 18:30:36 CET 2005
> 35000  --> Wed Feb 02 18:32:57 CET 2005
> 40000  --> Wed Feb 02 18:34:38 CET 2005
> 45000  --> Wed Feb 02 18:36:00 CET 2005
> 50000  --> Wed Feb 02 18:37:37 CET 2005
> 55000  --> Wed Feb 02 18:38:45 CET 2005
> 60000  --> Wed Feb 02 18:39:51 CET 2005
> 65000  --> Wed Feb 02 18:40:51 CET 2005
> 70000  --> Wed Feb 02 18:42:04 CET 2005
> 75000  --> Wed Feb 02 18:44:20 CET 2005
> 80000  --> Wed Feb 02 18:46:43 CET 2005
> 85000  --> Wed Feb 02 18:48:44 CET 2005
> 90000  --> Wed Feb 02 18:50:04 CET 2005
> 95000  --> Wed Feb 02 18:51:23 CET 2005
> 100000 --> Wed Feb 02 18:52:40 CET 2005
> 105000 --> Wed Feb 02 18:53:34 CET 2005
> 110000 --> Wed Feb 02 18:54:33 CET 2005
> 115000 --> Wed Feb 02 18:55:19 CET 2005
> 120000 --> Wed Feb 02 18:56:45 CET 2005
> 125000 --> Wed Feb 02 18:58:59 CET 2005
> 130000 --> Wed Feb 02 19:01:13 CET 2005
> 135000 --> Wed Feb 02 19:02:56 CET 2005
> 140000 --> Wed Feb 02 19:04:13 CET 2005
> 145000 --> Wed Feb 02 19:05:39 CET 2005
> 150000 --> Wed Feb 02 19:06:43 CET 2005
> 155000 --> Wed Feb 02 19:07:55 CET 2005
> 160000 --> Wed Feb 02 19:09:16 CET 2005
> 165000 --> Wed Feb 02 19:10:18 CET 2005
> 170000 --> Wed Feb 02 19:11:37 CET 2005
> END    ..: Wed Feb 02 19:12:26 CET 2005
> ********** END TRACE
> 
> Do you think it is normal that the process takes almost an hour to scan
> ~170,000 rows?
> 
> Sorry for my English, if something is not understood, please ask me.
> Thank you very much.
> 
> Euke.


Hi,

the database does not scan only the ~170000 rows in that hour; it has to scan
the ~1000000 rows which fulfil the condition on the indexed column invoicedate.
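
To put numbers on that, here is a small sketch that works out the throughput from the BEGIN and END timestamps of the trace above. The class and method names are made up for illustration, 170000 is the last counter printed in the trace, and ~1000000 is only the estimate given in this mail, not a measured value.

```java
import java.time.Duration;
import java.time.LocalTime;

public class TraceThroughput {

    /** Seconds elapsed between two same-day timestamps in HH:mm:ss form. */
    public static long elapsedSeconds(String begin, String end) {
        return Duration.between(LocalTime.parse(begin), LocalTime.parse(end)).getSeconds();
    }

    /** Average number of rows handled per second over the elapsed time. */
    public static double rowsPerSecond(long rows, long seconds) {
        return (double) rows / seconds;
    }

    public static void main(String[] args) {
        // BEGIN and END timestamps copied from the trace in this thread.
        long seconds = elapsedSeconds("18:18:24", "19:12:26");
        System.out.printf("elapsed: %d s%n", seconds);
        // Rows delivered to the client (last counter printed in the trace).
        System.out.printf("rows returned per second: %.1f%n", rowsPerSecond(170_000, seconds));
        // Rows the database has to look at (estimate from this mail, an assumption).
        System.out.printf("rows examined per second: %.1f%n", rowsPerSecond(1_000_000, seconds));
    }
}
```

The run takes 3242 seconds, so roughly 52 rows per second reach the client while roughly 308 rows per second are examined, if the ~1000000 estimate holds.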

But now I'm wondering why the select with the rate=9 condition lasts several
hours, as you mentioned in your other mails, and doesn't finish in about the
same time as this command.

Are you the only one who is active on this database? Or is it possible that
your select is waiting on another transaction which changed some rows and has
not committed them?

Would you mind repeating your JDBC test with the condition rate=9 and posting
the output here?

Nevertheless, I would suggest creating an additional index on rate or a
combined index on rate and invoicedate.
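
A minimal sketch of how the combined index and the repeated test could look over JDBC. The class name, the index name i_rate_date, the column order (rate first) and the command-line arguments are assumptions for illustration only; the CREATE INDEX form follows the statement quoted earlier in this thread, and the date range is the one from the original rate=9 query.

```java
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ConsumIndexTest {

    /** Builds a CREATE INDEX statement in the form used earlier in this thread. */
    public static String createIndexSql(String indexName, String table, String... columns) {
        return "CREATE INDEX " + indexName + " ON " + table
                + " (" + String.join(",", columns) + ")";
    }

    /** Fetches all rows for one rate value in the given date range and counts them. */
    public static int countRows(Connection cn, Date from, Date to, int rate) throws SQLException {
        String sel = "SELECT id,hotel,service,segment,rate,invoiceDate FROM consum"
                + " WHERE invoiceDate BETWEEN ? AND ? AND rate = ?";
        try (PreparedStatement stmt = cn.prepareStatement(sel)) {
            stmt.setDate(1, from);
            stmt.setDate(2, to);
            stmt.setInt(3, rate);
            int rows = 0;
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    rows++;
                }
            }
            return rows;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.err.println("usage: ConsumIndexTest <jdbc-url> <user> <password>");
            return;
        }
        try (Connection cn = DriverManager.getConnection(args[0], args[1], args[2])) {
            // One-time step; it fails if an index with this (hypothetical) name already exists.
            try (Statement ddl = cn.createStatement()) {
                ddl.executeUpdate(createIndexSql("i_rate_date", "consum", "rate", "invoiceDate"));
            }
            // Repeat the timed test with rate=9 over the full year 2004.
            long start = System.nanoTime();
            int rows = countRows(cn, Date.valueOf("2004-01-01"), Date.valueOf("2004-12-31"), 9);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(rows + " rows in " + millis + " ms");
        }
    }
}
```

Putting rate first is only one possible choice: the equality condition then narrows the index range before the date range is applied. Whether that is the better order on this table would have to be measured.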

Kind regards
Holger


--
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
