[HACKERS] Proposal: Pluggable Optimizer Interface
Hi All, Tomas Kovarik and I have presented at PGCon 2007 in Ottawa the ideas about other possible optimizer algorithms to be used in PostgreSQL. We are quite new to PostgreSQL project so it took us some time to go through the sources end explore the possibilities how things could be implemented. There is a proposal attached to this mail about the interface we would like to implement for switching between different optimizers. Please review it and provide a feedback to us. Thank You. Regards Julius Stroffek Proposal for Pluggable Optimizer Interface == Overview We have presented at PGCon 2007 in Ottawa couple of other approaches and algorithms that can be used for query optimization, see http://www.pgcon.org/2008/papers/Execution_Plan_Optimization_Techniques_Julius_Stroffek.pdf We have focused on algorithms for searching the space of all possible orders of joins including bushy trees. The possible algorithms include * Dynamic Programming (already implemented in PostgreSQL) * Genetic Algorithm (already implemented in PostgreSQL) * Dijkstra's Algorithm * A* Search Algorithm * Greedy âNearest Neighborâ Algorithm * Hill-Climbing * Simulated Annealing * Iterative Improvement * Two Phase Optimization * Toured Simulated Annealing Choosing the best algorithm from the above list is difficult. We have to consider the length of the optimizer computation vs. the quality of the solution which we would like to achieve. We may want to do some hybrid optimization â run couple of the algorithms from the above and choose the best solution found. We might run some very fast algorithm at the beginning and depending on the solution cost we may decide whether it is worthwhile to try to optimize the plan even more (using other algorithm with longer running time but producing better solutions). Therefore we would like to propose an interface which can be used to switch between different optimizer algorithms and/or allow a user to write and use his own implementation. It would allow the community to ship more optimizer algorithms as contrib modules and users may then decide which of those algorithms should be used for their queries. Creating an optimizer = We would propose to create a catalog holding the available optimizers in the system called pg_optimizer. Users could than use a SET command to switch between different optimizers. postgres=# select * from pg_optimizer; optname | optproc -+--- geqo| geqo_optimizer dynamic | dynamic_optimizer greedy | greedy_optimizer (4 rows) postgres=# set optimizer=greedy; SET Optimizer Invocation Point == There is a code in function make_rel_from_joinlist which decides whether to invoke dynamic programming or genetic algorithm for query optimization. We would propose to place the invocation of the plugged optimizer to the same place and with the same parameters as function geqo and make_one_rel_by_joins are currently invoked. Creating and dropping an optimizer == The optimizer function have to be implemented as a C-Language Function using âVersion 1 Calling Conventionsâ. The return type of the function is RelOptInfo * and the arguments passed to the function are 1.PlannerInfo *root 2.int levels_needed 3.List * initial_rels The proper âCREATE FUNCTIONâ statement have to be used to create the optimizer function. CREATE FUNCTION greedyoptimizer(internal, int, internal) RETURNS internal AS 'mylib', 'greedy_optimizer' LANGUAGE C ; Once, the optimizer function is created user may create an optimizer using the function with the statement CREATE OPTIMIZER greedy ( function = greedyoptimizer comment = 'Greedy Nearest Neighbor Optimizer' ); If the user decides not to use the optimizer anymore he can invoke DROP OPTIMIZER greedy; User have to also drop the optimizer function with DROP FUNCTION greedyoptimizer; Project TODO List = 1.Create a pg_optimizer catalog to hold available optimizers. 2.Create wrappers above the current dynamic programming and genetic algorithm optimizers to be used to call those implementations. 3.Modify the parser and add the functions to handle and execute the CREATE/DROP OPTIMIZER statements. 4.Modify GUC that it would be possible to switch between optimizers. 5.Change the code at the optimizer invocation point that the appropriate optimizer function would be called. 6.Handle object dependencies â make an entry in pg_depend that optimizer depends on its function. 7.Implement '\dO' command that will list the available optimizers. 8.Create a contrib module and ship some other optimizer algorithms. 9.Any other suggestion, comments and changes that will come out from the review of this proposal. Things to Decide 1.Rights. Who can create/drop optimizers? Who can use
Re: [HACKERS] Proposal: Pluggable Optimizer Interface
Julius Stroffek wrote: Hi All, Tomas Kovarik and I have presented at PGCon 2007 in Ottawa the ideas about other possible optimizer algorithms to be used in PostgreSQL. We are quite new to PostgreSQL project so it took us some time to go through the sources end explore the possibilities how things could be implemented. There is a proposal attached to this mail about the interface we would like to implement for switching between different optimizers. Please review it and provide a feedback to us. Thank You. hmm - how does is that proposal different from what got implemented with: http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php Stefan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Proposal: Pluggable Optimizer Interface
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Julius Stroffek wrote: There is a proposal attached to this mail about the interface we would like to implement for switching between different optimizers. Please review it and provide a feedback to us. hmm - how does is that proposal different from what got implemented with: http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php Well, it's a very different level of abstraction. The planner_hook would allow you to replace the *entire* planner, but if you only want to replace GEQO (that is, only substitute some other heuristics for partial search of a large join-order space), doing it from planner_hook will probably require duplicating a great deal of code. A hook right at the place where we currently choose geqo or regular would be a lot easier to experiment with. Replacing GEQO sounds like a fine area for investigation to me; I've always been dubious about whether it's doing a good job. But I'd prefer a simple hook function pointer designed in the same style as planner_hook (ie, intended to be overridden by a loadable module). The proposed addition of a system catalog and SQL-level management commands sounds like a great way to waste a lot of effort on mere decoration, before ever getting to the point of being able to demonstrate that there's any value in it. Also, while we might accept a small hook-function patch for 8.3, there's zero chance of any of that other stuff making it into this release cycle. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Proposal: Pluggable Optimizer Interface
Stefan, thanks for pointing this out. I missed this change. We would like to place the hooks to a different place in the planner and we would like to just replace the non-deterministic algorithm searching for the best order of joins and keep the rest of the planner untouched. I am not quite sure about the usage from the user point of view of what got implemented. I read just the code of the patch. Are there more explanations somewhere else? I understood that if the user creates his own implementation of the planner which can be stored in some external library, he have to provide some C language function as a hook activator which will assign the desired value to the planner_hook variable. Both, the activator function and the new planner implementation have to be located in the same dynamic library which will be loaded when CREATE FUNCTION statement would be used on hook activator function. Am I correct? Have I missed something? If the above is the case than it is exactly what we wanted except we would like to have the hook also in the different place. There are more things in the proposal as a new pg_optimizer catalog and different way of configuring the hooks. However, this thinks are not mandatory for the functionality but are more user friendly. Thanks Julo Stefan Kaltenbrunner wrote: Julius Stroffek wrote: Hi All, Tomas Kovarik and I have presented at PGCon 2007 in Ottawa the ideas about other possible optimizer algorithms to be used in PostgreSQL. We are quite new to PostgreSQL project so it took us some time to go through the sources end explore the possibilities how things could be implemented. There is a proposal attached to this mail about the interface we would like to implement for switching between different optimizers. Please review it and provide a feedback to us. Thank You. hmm - how does is that proposal different from what got implemented with: http://archives.postgresql.org/pgsql-committers/2007-05/msg00315.php Stefan
Re: [HACKERS] Proposal: Pluggable Optimizer Interface
Julius Stroffek [EMAIL PROTECTED] writes: I understood that if the user creates his own implementation of the planner which can be stored in some external library, he have to provide some C language function as a hook activator which will assign the desired value to the planner_hook variable. Both, the activator function and the new planner implementation have to be located in the same dynamic library which will be loaded when CREATE FUNCTION statement would be used on hook activator function. You could do it that way if you wanted, but a minimalistic solution is just to install the hook from the _PG_init function of a loadable library, and then LOAD is sufficient for a user to execute the thing. There's a small example at http://archives.postgresql.org/pgsql-patches/2007-05/msg00421.php Also, having the loadable module add a custom GUC variable would likely be a preferable solution for control purposes than making specialized functions. I attach another small hack I made recently, which simply scales all the planner's relation size estimates by a scale_factor GUC; this is handy for investigating how a plan will change with relation size, without having to actually create gigabytes of test data. There are more things in the proposal as a new pg_optimizer catalog and different way of configuring the hooks. However, this thinks are not mandatory for the functionality but are more user friendly. Granted, but at this point we are talking about infrastructure for planner-hackers to play with, not something that's intended to be a long-term API for end users. It may or may not happen that we ever need a user API for this at all. I think a planner that just does the right thing is far preferable to one with a lot of knobs that users have to know how to twiddle, so I see this more as scaffolding on which someone can build and test the replacement for GEQO; which ultimately would go in without any user-visible API additions. regards, tom lane #include postgres.h #include fmgr.h #include commands/explain.h #include optimizer/plancat.h #include optimizer/planner.h #include utils/guc.h PG_MODULE_MAGIC; void_PG_init(void); void_PG_fini(void); static double scale_factor = 1.0; static void my_get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, RelOptInfo *rel); /* * Get control during planner's get_relation_info() function, which sets up * a RelOptInfo struct based on the system catalog contents. We can modify * the struct contents to cause the planner to work with a hypothetical * situation rather than what's actually in the catalogs. * * This simplistic example just scales all relation size estimates by a * user-settable factor. */ static void my_get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent, RelOptInfo *rel) { ListCell *ilist; /* Do nothing for an inheritance parent RelOptInfo */ if (inhparent) return; rel-pages = (BlockNumber) ceil(rel-pages * scale_factor); rel-tuples = ceil(rel-tuples * scale_factor); foreach(ilist, rel-indexlist) { IndexOptInfo *ind = (IndexOptInfo *) lfirst(ilist); ind-pages = (BlockNumber) ceil(ind-pages * scale_factor); ind-tuples = ceil(ind-tuples * scale_factor); } } /* * _pg_init() - library load-time initialization * * DO NOT make this static nor change its name! */ void _PG_init(void) { /* Get into the hooks we need to be in all the time */ get_relation_info_hook = my_get_relation_info; /* Make scale_factor accessible through GUC */ DefineCustomRealVariable(scale_factor, , , scale_factor, 0.0001, 1e9, PGC_USERSET, NULL, NULL); } /* * _PG_fini() - library unload-time finalization * * DO NOT make this static nor change its name! */ void _PG_fini(void) { /* Get out of all the hooks (just to be sure) */ get_relation_info_hook = NULL; } ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface
Tom, Also, while we might accept a small hook-function patch for 8.3, there's zero chance of any of that other stuff making it into this release cycle. I don't think anyone was thinking about 8.3. This is pretty much 8.4 stuff; Julius is just raising it now becuase they don't want to go down the wrong path and waste everyone's time. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] [HACKERS] Proposal: Pluggable Optimizer Interface
Josh Berkus [EMAIL PROTECTED] writes: Tom, Also, while we might accept a small hook-function patch for 8.3, there's zero chance of any of that other stuff making it into this release cycle. I don't think anyone was thinking about 8.3. This is pretty much 8.4 stuff; Julius is just raising it now becuase they don't want to go down the wrong path and waste everyone's time. Well, if they get the hook in now, then in six months or so when they have something to play with, people would be able to play with it. If not, there'll be zero uptake till after 8.4 is released... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster