On Mon, 2006-09-11 at 06:20 -0700, Say42 wrote:
> I intend to play with some optimizer aspects. Just for fun.
Cool. If you think it's fun (it is), you're halfway there.
> I'm a
> novice in DBMS development, so I cannot promise any usable
> results, but if it can be useful even as yet another failed attempt, I
> will try.
This type of work is 90% analysis, 10% coding. You'll need to do a lot
of investigation, lots of discussion and listening.
> That's what I want to do:
> 1. Replace not very useful indexCorrelation with indexClustering.
An opinion such as "not very useful" isn't considered sufficient
explanation or justification for a change around here.
> 2. Consider caching of the inner table in a nested loop join when
> estimating the total cost of the join.
> More details:
> 1. During analyze we have sample rows. For every N-th sample row we can
> scan the indices with a qual like 'value >= index_first_column' and fetch
> the first N row TIDs. Estimating the count of fetched heap pages is not
> hard. To get the index clustering value, just divide the page count by
> the sample row count.
> 2. This is much harder and may be impossible for me at all. The main
> - split page fetch cost and CPU cost into different variables and
> don't sum them before join estimation.
> - final path cost estimation should be done in the join cost estimation
> and take into account the number of inner table accesses (=K). CPU cost
> is directly proportional to K, but page fetches can be estimated by the
> Mackert and Lohman formula using the total tuple count (K *
> inner_table_selectivity * inner_table_total_tuples).
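The second idea above can be sketched roughly as follows. This is only an
illustration in Python, not planner code: the function names, the
cost constants, and the cache-size parameter b are assumptions made for the
example. It uses the usual piecewise form of the Mackert and Lohman
approximation (T = pages in the inner table, b = buffer pages available) and
keeps CPU cost and page-fetch cost separate until the join is costed.

```python
def pages_fetched(tuples_fetched, T, b):
    """Mackert-Lohman estimate of distinct page fetches.

    tuples_fetched: total tuples fetched over all scans (K * sel * N)
    T: pages in the inner table; b: buffer pages assumed available.
    """
    T = max(float(T), 1.0)
    b = max(float(b), 1.0)
    if T <= b:
        # Whole table fits in cache: never fetch more than T pages.
        pf = 2.0 * T * tuples_fetched / (2.0 * T + tuples_fetched)
        return min(pf, T)
    lim = 2.0 * T * b / (2.0 * T - b)
    if tuples_fetched <= lim:
        return 2.0 * T * tuples_fetched / (2.0 * T + tuples_fetched)
    # Cache is saturated: pages start being re-read.
    return b + (tuples_fetched - lim) * (T - b) / T


def nestloop_inner_cost(K, sel, N, T, b,
                        cpu_per_tuple=0.01, page_cost=4.0):
    """Return (cpu_cost, io_cost) for K scans of the inner table.

    cpu_per_tuple and page_cost are placeholder constants (assumptions).
    CPU cost scales linearly with K; page fetches are estimated once
    from the total tuple count, so caching across scans is reflected.
    """
    total_tuples = K * sel * N
    cpu_cost = total_tuples * cpu_per_tuple
    io_cost = pages_fetched(total_tuples, T, b) * page_cost
    return cpu_cost, io_cost
```

With a 100-page inner table that fits in cache, calling
nestloop_inner_cost(100, 0.01, 10000, 100, 1000) gives an I/O cost capped at
100 page fetches, far less than 100 times the single-scan I/O cost, while the
CPU cost is 100 times the single-scan CPU cost.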
I'd work on one thing at a time and go into it deeply.