Hi, I wanted to take up this as a GSOC 2012 project.
SQL supports nested queries. When the inner query contains a correlation variable the present optimizer takes an iterative execution plan. If the inner query scans over a relation, the iterative plan chosen can be sub-optimal. The goal of this project is to enable De-correlation for all possible cases. The iterative execution plan can be converted to a set oriented plan by taking a join over the base relations involved. Sufficient work has already been done in this area and has been implemented in SQL Server. The changes required to incorporate the above mentioned strategy is in rewriting phase to the best of my knowledge. The key idea is to introduce the APPLY operator in the raw parse tree. In the above mentioned Papers, the author has mentioned the process of removing the apply. The author has proposed a set of rules which will allow us to achieve the goal. The present postgresql optimizer community has done some work in these lines for simple subqueries involving =, > , < in the predicates [ I observed it by seeing the result of EXPLAIN for relevant queries ]. The optimization is not done for subqueries containing aggregate queries and existential and containment queries. An example query from TPCH benchmark discussed by the author: select c_custkey from customer where 1000000 < (select sum(o_totalprice) from orders where o_custkey = c_custkey) In the above case, c_custkey is a correlation variable (parameterized) coming from the outer query. Hence in the present system, the inner query is executed as many times as the tuples in the customer relation. As the subQuery involves a scan over orders relation, the total I/O cost involved is pretty high. Using the transformation proposed by the author, the query can be re-written as select c_custkey from customer left outer join orders on o_custkey = c custkey group by c_custkey having 1000000 < sum(o_totalprice) This allows the optimizer to chose a better plan for left outer join and avoid iterative execution. The idea here is not to get a rewritten query as output but to generate a plan tree that does the same as the above query. Regards, Arun -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers