Hi All,
Tomas Kovarik and I have presented at PGCon 2007 in Ottawa
the ideas about other possible optimizer algorithms to be used
in PostgreSQL.
We are quite new to PostgreSQL project so it took us some
time to go through the sources end explore the possibilities
how things could be implemented.
There is a proposal attached to this mail about the interface
we would like to implement for switching between different
optimizers. Please review it and provide a feedback to us.
Thank You.
Regards
Julius Stroffek
Proposal for Pluggable Optimizer Interface
==========================================
Overview
========
We have presented at PGCon 2007 in Ottawa couple of other
approaches and algorithms that can be used for query optimization,
see
http://www.pgcon.org/2008/papers/Execution_Plan_Optimization_Techniques_Julius_Stroffek.pdf
We have focused on algorithms for searching the space of
all possible orders of joins including bushy trees.
The possible algorithms include
* Dynamic Programming (already implemented
in PostgreSQL)
* Genetic Algorithm (already implemented
in PostgreSQL)
* Dijkstra's Algorithm
* A* Search Algorithm
* Greedy âNearest Neighborâ Algorithm
* Hill-Climbing
* Simulated Annealing
* Iterative Improvement
* Two Phase Optimization
* Toured Simulated Annealing
Choosing the best algorithm from the above list is
difficult. We have to consider the length of the optimizer
computation vs. the quality of the solution which we would
like to achieve. We may want to do some hybrid optimization
â run couple of the algorithms from the above and choose
the best solution found. We might run some very fast
algorithm at the beginning and depending on the solution
cost we may decide whether it is worthwhile to try
to optimize the plan even more (using other algorithm
with longer running time but producing better solutions).
Therefore we would like to propose an interface which
can be used to switch between different optimizer algorithms
and/or allow a user to write and use his own implementation.
It would allow the community to ship more optimizer
algorithms as contrib modules and users may then decide
which of those algorithms should be used for their queries.
Creating an optimizer
=====================
We would propose to create a catalog holding the available
optimizers in the system called pg_optimizer. Users could
than use a SET command to switch between different
optimizers.
postgres=# select * from pg_optimizer;
optname | optproc
---------+-------------------
geqo | geqo_optimizer
dynamic | dynamic_optimizer
greedy | greedy_optimizer
(4 rows)
postgres=# set optimizer=greedy;
SET
Optimizer Invocation Point
==========================
There is a code in function make_rel_from_joinlist which
decides whether to invoke dynamic programming or genetic
algorithm for query optimization. We would propose to place
the invocation of the plugged optimizer to the same place
and with the same parameters as function geqo and
make_one_rel_by_joins are currently invoked.
Creating and dropping an optimizer
==================================
The optimizer function have to be implemented as a C-Language
Function using âVersion 1 Calling Conventionsâ. The return
type of the function is RelOptInfo * and the arguments
passed to the function are
1.PlannerInfo *root
2.int levels_needed
3.List * initial_rels
The proper âCREATE FUNCTIONâ statement have to be used
to create the optimizer function.
> CREATE FUNCTION greedyoptimizer(internal, int, internal)
> RETURNS internal
> AS 'mylib', 'greedy_optimizer'
> LANGUAGE C
> ;
Once, the optimizer function is created user may create
an optimizer using the function with the statement
> CREATE OPTIMIZER greedy (
> function = greedyoptimizer
> comment = 'Greedy Nearest Neighbor Optimizer'
> );
If the user decides not to use the optimizer anymore
he can invoke
> DROP OPTIMIZER greedy;
User have to also drop the optimizer function with
> DROP FUNCTION greedyoptimizer;
Project TODO List
=================
1.Create a pg_optimizer catalog to hold available
optimizers.
2.Create wrappers above the current dynamic
programming and genetic algorithm optimizers
to be used to call those implementations.
3.Modify the parser and add the functions to handle
and execute the CREATE/DROP OPTIMIZER statements.
4.Modify GUC that it would be possible to switch
between optimizers.
5.Change the code at the optimizer invocation
point that the appropriate optimizer function
would be called.
6.Handle object dependencies â make an entry
in pg_depend that optimizer depends on its
function.
7.Implement '\dO' command that will list the
available optimizers.
8.Create a contrib module and ship some other
optimizer algorithms.
9.Any other suggestion, comments and changes
that will come out from the review of this
proposal.
Things to Decide
================
1.Rights. Who can create/drop optimizers?
Who can use the optimizer? Are the rights
for optimizer function enough? Probably,
creating and dropping optimizer could be
allowed just for DB administrators and
rights for optimizer functions are enough
to restrict the usage. No need to introduce
âGRANT ON OPTIMIZERâ.
2.How to behave when dropping an optimizer
which is currently running or is set up
in other sessions.
3.Any other suggestion, comments and changes
that will come out from the review of this
proposal.
Future Possible Improvements
============================
* Implement more algorithms and implement also
more complex combinations of the algorithms.
* Create even more optimizer invocation points.
Different implementations might be invoked in
different places of the current code. If this
would make sense and it will come up as
a requirement for some optimizer algorithm.
* Do some work on âbenchmarkingâ the algorithms
depending on the cost of the plan found,
running time, etc.
* Consider some heuristic for choosing the different
optimizers for different queries.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match