Stephen Gutknecht wrote: > More learning about inner workings of the database engine and > performance > tuning. I've seen some behavior I can't understand when > studying EXPLAIN > output. > > Attached is an ASCII text file SQL1.txt has the CREATE TABLE, > INSERT, CREATE > INDEXES required for this sample. After the two tables are > populated, I > issue the following using SQL Studio: > > EXPLAIN > SELECT FA1, FA2, FA3, FAPayload1, FAPayload2, FAPayload3, > FB1, FBPayload1 > FROM TestA, TestB > WHERE FA3=10 > AND FA4=1 > AND TestA.FA2=TestB.FB1 > > My Questions: > > 1. Explain output shows 154 PAGECOUNT for the Join. Are > these 8KB MaxDB > storage pages? Does that mean 1232 kilobytes of Disk I/O?
The pagecount shown in the explain behind each table is the actual size of this table stored in the optimizerstatistics. Because you didn't run update stat on TESTB in your example the value in the optimizerstatistics is an initial value given at creation time of the table. After an update stat command for the table the explain shows a pagecount of 1. > 2. Is there a better (performing) way to write such a join? Union? > Subselect? I don't think so. The only way is to store all information in one table as you already suggested but I don't think that this is necessary because the strategy for this join looks ok to me. > 3. The SQL1.txt setup creates 6 indexes. For the EXPLAIN > output, it shows > that TESTAINDEX5 is being used. Would it not make more sense > for the MaxDB > engine to use TESTAINDEX6 - isn't that query an exact match > with the WHERE > clause? Isn't this causing more I/O than needed using a > INDEX that has a > unused field (FAPayload3 is on TESTAINDEX5 - but is not used > in the WHERE). No. In your test case the usage of index 5 or 6 result in the same costs because they have both a size of 1 page and for the access strategy only the fields FA3 and FA4 are used all additional fields are taken from the table data because an only index strategy is not possible because of the output list. For your production environment it may be better to create an additional index with only those two columns because the index is smaller. > In our production application. TableA has about 4 million records and > TableB has about 3.5 million records. The indexes and keys > illustrated are > the same type for our most frequent query. We know we can > denormalize the > two tables - but we also want to understand why there is so > much disk I/O on > this query - and why we do not understand the selection of INDEX. Could you send me a vtrace from the explain on your productiv server? Perhaps I can give any hint. Kind regards, Holger SAP Labs Berlin -- MaxDB Discussion Mailing List For list archives: http://lists.mysql.com/maxdb To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
