Stephen Gutknecht wrote:

> More learning about inner workings of the database engine and 
> performance
> tuning.  I've seen some behavior I can't understand when 
> studying EXPLAIN
> output.
> 
> Attached is an ASCII text file SQL1.txt has the CREATE TABLE, 
> INSERT, CREATE
> INDEXES required for this sample.  After the two tables are 
> populated, I
> issue the following using SQL Studio:
> 
> EXPLAIN
>  SELECT FA1, FA2, FA3, FAPayload1, FAPayload2, FAPayload3, 
> FB1, FBPayload1
>  FROM TestA, TestB
>  WHERE FA3=10
>  AND FA4=1
>  AND TestA.FA2=TestB.FB1
> 
> My Questions:
> 
> 1.  Explain output shows 154 PAGECOUNT for the Join.  Are 
> these  8KB MaxDB
> storage pages?  Does that mean 1232 kilobytes of Disk I/O?

The pagecount shown in the explain behind each table is the 
actual size of this table stored in the optimizerstatistics.
Because you didn't run update stat on TESTB in your example 
the value in the optimizerstatistics is an initial value given 
at creation time of the table.
After an update stat command for the table the explain shows a
pagecount of 1.

> 2.  Is there a better (performing) way to write such a join?  Union?
> Subselect?

I don't think so. The only way is to store all information 
in one table as you already suggested but I don't think that 
this is necessary because the strategy for this join looks 
ok to me.

> 3.  The SQL1.txt setup creates 6 indexes.  For the EXPLAIN 
> output, it shows
> that TESTAINDEX5 is being used.  Would it not make more sense 
> for the MaxDB
> engine to use TESTAINDEX6 - isn't that query an exact match 
> with the WHERE
> clause?  Isn't this causing more I/O than needed using a 
> INDEX that has a
> unused field (FAPayload3 is on TESTAINDEX5 - but is not used 
> in the WHERE).

No. In your test case the usage of index 5 or 6 result in the 
same costs because they have both a size of 1 page and for 
the access strategy only the fields FA3 and FA4 are used all 
additional fields are taken from the table data because an
only index strategy is not possible because of the output list.

For your production environment it may be better to create an 
additional index with only those two columns because the index 
is smaller.

> In our production application.  TableA has about 4 million records and
> TableB has about 3.5 million records.  The indexes and keys 
> illustrated are
> the same type for our most frequent query.  We know we can 
> denormalize the
> two tables - but we also want to understand why there is so 
> much disk I/O on
> this query - and why we do not understand the selection of INDEX.

Could you send me a vtrace from the explain on your productiv server?
Perhaps I can give any hint.

Kind regards,
Holger
SAP Labs Berlin

-- 
MaxDB Discussion Mailing List
For list archives: http://lists.mysql.com/maxdb
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to