Re: [HACKERS] TwoPO: experimental join order algorithm

2010-08-05 Thread Adriano Lange
On Fri, Jul 30, 2010 at 7:02 AM, Jan Urbański wulc...@wulczer.org wrote:
 On 24/07/10 15:20, Adriano Lange wrote:

 Hi,

 Hi!


 I'd like to release the last version of my experimental join order
 algorithm (TwoPO - Two Phase Optimization [1]):

 http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

 This algorithm is not production-ready, but an experimental set of
 ideas, which need to be refined and evaluated. As the join order
 optimization is a hard problem, the evaluation of a search strategy is
 also a hard task. Therefore, I think the most important TODO item
 related to replacement of GEQO algorithm is to define a set of
 evaluation criteria considered as relevant.

 Good to hear from you --  I don't know if you are aware about a simulated
 annealing join search module that I'm working on. When I first started, I
 planned to base is on the patch that you sent to -hackers some time ago (and
 actually, if you look at the repo for my module, twopo.c is still in there)
 but ended up doing it from scratch. However, reading your code was a big
 help in the beginning.

 I gave a talk at this year's PGCon
 (http://www.pgcon.org/2010/schedule/events/211.en.html) and you will find
 your name in the acknowledgments section of the presentation :)

 I'll make sure to read your new code and compare the approaches, my results
 so far are not perfect, but also not very pessimistic. I think there is
 actually a chance to replace GEQO with SA in the future, or at least to ship
 more join search modules with the standard distribution and gather field
 reports.

 Cheers,
 Jan


Hi Jan,

It's good to know that you are also interested about this issue!

I saw the slides of your presentation at PGCon and I noted excellent
ideas. I think that as more as implementations and ideas will be
raised, more easily will be to converge them in a reliable and
competitive solution.

The current version of TwoPO is suitable for the classic
select-project-join problem. Therefore, there are some issues not
covered yet, as I have observed in a test schema presented by Andres
Freund (http://archives.postgresql.org/pgsql-hackers/2009-07/msg00546.php).
I think we will need to build a robust benchmark to deal with it.

At this moment I'm really busy with a homework, but I hope to
dedicate more time to this issue next month.

--
Adriano Lange

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TwoPO: experimental join order algorithm

2010-07-30 Thread Jan Urbański

On 24/07/10 15:20, Adriano Lange wrote:

Hi,


Hi!



I'd like to release the last version of my experimental join order
algorithm (TwoPO - Two Phase Optimization [1]):

http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

This algorithm is not production-ready, but an experimental set of
ideas, which need to be refined and evaluated. As the join order
optimization is a hard problem, the evaluation of a search strategy is
also a hard task. Therefore, I think the most important TODO item
related to replacement of GEQO algorithm is to define a set of
evaluation criteria considered as relevant.


Good to hear from you --  I don't know if you are aware about a 
simulated annealing join search module that I'm working on. When I first 
started, I planned to base is on the patch that you sent to -hackers 
some time ago (and actually, if you look at the repo for my module, 
twopo.c is still in there) but ended up doing it from scratch. However, 
reading your code was a big help in the beginning.


I gave a talk at this year's PGCon 
(http://www.pgcon.org/2010/schedule/events/211.en.html) and you will 
find your name in the acknowledgments section of the presentation :)


I'll make sure to read your new code and compare the approaches, my 
results so far are not perfect, but also not very pessimistic. I think 
there is actually a chance to replace GEQO with SA in the future, or at 
least to ship more join search modules with the standard distribution 
and gather field reports.


Cheers,
Jan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TwoPO: experimental join order algorithm

2010-07-25 Thread Robert Haas
On Sat, Jul 24, 2010 at 9:20 AM, Adriano Lange alange0...@gmail.com wrote:
 I'd like to release the last version of my experimental join order
 algorithm (TwoPO - Two Phase Optimization [1]):

 http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

 This algorithm is not production-ready, but an experimental set of
 ideas, which need to be refined and evaluated. As the join order
 optimization is a hard problem, the evaluation of a search strategy is
 also a hard task. Therefore, I think the most important TODO item
 related to replacement of GEQO algorithm is to define a set of
 evaluation criteria considered as relevant.

As you may know, we're in the middle of a CommitFest right now; I'd
suggest adding this patch to the next one.

https://commitfest.postgresql.org/action/commitfest_view/open

Someone might have time to look at it sooner, but at least if you add
it here we'll not lose track of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TwoPO: experimental join order algorithm

2010-07-25 Thread Adriano Lange
Em 25-07-2010 17:44, Robert Haas escreveu:
 On Sat, Jul 24, 2010 at 9:20 AM, Adriano Lange alange0...@gmail.com wrote:
 I'd like to release the last version of my experimental join order
 algorithm (TwoPO - Two Phase Optimization [1]):

 http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

 This algorithm is not production-ready, but an experimental set of
 ideas, which need to be refined and evaluated. As the join order
 optimization is a hard problem, the evaluation of a search strategy is
 also a hard task. Therefore, I think the most important TODO item
 related to replacement of GEQO algorithm is to define a set of
 evaluation criteria considered as relevant.
 
 As you may know, we're in the middle of a CommitFest right now; I'd
 suggest adding this patch to the next one.
 
 https://commitfest.postgresql.org/action/commitfest_view/open
 
 Someone might have time to look at it sooner, but at least if you add
 it here we'll not lose track of it.
 

Yes, I know. This is only a notice, not a patch.
As I said, this algorithm is experimental, which do not match with the
CommitFest life cycle.

--
Adriano Lange

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TwoPO: experimental join order algorithm

2010-07-25 Thread Robert Haas
On Sun, Jul 25, 2010 at 6:45 PM, Adriano Lange alange0...@gmail.com wrote:
 Yes, I know. This is only a notice, not a patch.
 As I said, this algorithm is experimental, which do not match with the
 CommitFest life cycle.

It matches just fine - you just want a review and some good feedback,
rather than an actual commit.  It's just... I don't have time to do
that tonight.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TwoPO: experimental join order algorithm

2010-07-25 Thread Adriano Lange
Em 25-07-2010 19:17, Robert Haas escreveu:
 On Sun, Jul 25, 2010 at 6:45 PM, Adriano Lange alange0...@gmail.com wrote:
 Yes, I know. This is only a notice, not a patch.
 As I said, this algorithm is experimental, which do not match with the
 CommitFest life cycle.
 
 It matches just fine - you just want a review and some good feedback,
 rather than an actual commit.  It's just... I don't have time to do
 that tonight.  :-)
 

:-)
Just a notice of an experimental algorithm for now.
I think there are no clear criteria for review this kind of algorithm yet.

--
Adriano Lange

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] TwoPO: experimental join order algorithm

2010-07-24 Thread Adriano Lange
Hi,

I'd like to release the last version of my experimental join order
algorithm (TwoPO - Two Phase Optimization [1]):

http://git.c3sl.ufpr.br/gitweb?p=lbd/ljqo.git;a=summary

This algorithm is not production-ready, but an experimental set of
ideas, which need to be refined and evaluated. As the join order
optimization is a hard problem, the evaluation of a search strategy is
also a hard task. Therefore, I think the most important TODO item
related to replacement of GEQO algorithm is to define a set of
evaluation criteria considered as relevant.

TwoPO is encapsulated in a plug-in called LJQO (Large Join Query
Optimization [2]). This plug-in has two main GUC variables:

ljqo_threshold = N (like geqo_threshold)
ljqo_algorithm = {twopo|geqo}

As its name means, TwoPO has internally two search strategies that
constitute its two phases of optimization:

 * Iterative Improvement (II) – Biased Sampling + Local Search
 * Simulated Annealing (SA)

This algorithm also works in two search spaces:

 * deep-trees (subset of bushy-trees)
- list of baserels
- initial states: biased sampling using
  Prim algorithm over join graph (new, very efficient)
- moves: swap, 3cycle [2]
 * bushy-trees
- binary join tree representation
- initial states: biased sampling using
  Kruskal's algorithm over join graph [3,4].
- moves: associative

You can modify the functionality of TwoPO through the following parameters:

twopo_bushy_space = {true|false}
  - set it to false if you want only deep-trees
default=true
twopo_heuristic_states = {true|false}
  - enables heuristic to initial states
default=true
twopo_ii_stop = Int
  - number of initial states
default=10
twopo_ii_improve_states = {true|false}
  - find local-minimum of each initial state
default=true
twopo_sa_phase = {true|false}
  - enables Simulated Annealing (SA) phase
default=true
twopo_sa_initial_temperature = Float
  - initial temperature for SA phase
default=0.1
twopo_sa_temperature_reduction = Float
  - temperature reduction
default=0.95
twopo_sa_equilibrium = Int
  - number of states generated for each temperature
(Int * State Size)
default=16


References:

[1] Yannis E. Ioannidis e Younkyung Kang. Randomized algorithms for
optimizing large join queries. SIGMOD Rec., 19(2):312-321, 1990.

[2] Arun Swami e Anoop Gupta. Optimization of large join queries. SIGMOD
'88: Proceedings of the 1988 ACM SIGMOD international conference on
Management of data, pages 8-17, New York, NY, USA, 1988. ACM.

[3] P.B. Guttoski, M. S. Sunye, e F. Silva. Kruskal's algorithm for
query tree optimization. IDEAS '07: Proceedings of the 11th
International Database Engineering and Applications Symposium, pages
296-302, Washington, DC, USA, 2007. IEEE Computer Society.

[4] Florian Waas e Arjan Pellenkoft. Join order selection - good enough
is easy. BNCOD, pages 51-67, 2000.


Att,
Adriano Lange


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers