Re: Rule to expand tha JdbcTableScan expression

Julian Hyde Tue, 29 Nov 2016 15:48:06 -0800

What you've written isn't really a planner rule. A planner rule is
supposed to create something semantically equivalent. And the planner
then chooses the cheapest of those equivalent expressions.


In contrast, you want to do search-and-replace on the tree. I think
you should use a RelVisitor, or perhaps replace TableScan with a table
macro and let the macro expander do its stuff.

(There are other cases where planner rules don't cut it. One is field
trimming, because when you visit each node, you want to return a node
that has fewer fields.)

On Tue, Nov 29, 2016 at 2:07 PM, Christian Tzolov <[email protected]> wrote:
> We're working on a JDBC wrapper for HAWQ (SQL-on-HADOOP) that hopefully
> would allow us "emulate" UPDATE and DELETE operations while using only
> append-only (e.g. INSERT) in the background.
>
> Using the JDBC adapter i've been trying to convert an input sql query like
> this:
>
> SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
> amount  FROM foodmart.expense_fact
>
> Into one that looks like this:
>
> SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
> amount
> FROM (
>     SELECT store_id, account_id, exp_date, time_id, category_id,
> currency_id, amount
>     FROM (
>         SELECT *, MAX(exp_date) OVER (PARTITION BY account_id) AS
> last_version_number
>         FROM foodmart.expense_fact
>      ) as link_last
>     WHERE exp_date = last_version_number
> ) as current_version;
>
> If you run the second query directly the output relation is:
>
> "PLAN=JdbcToEnumerableConverter
>   JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> category_id=[$4], currency_id=[$5], amount=[$6])
>     JdbcFilter(condition=[=($2, $7)])
>       JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
>                   last_version_number=[MAX($2) OVER (PARTITION BY $1 RANGE
> BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
>         JdbcTableScan(table=[[foodmart, expense_fact]])
>
>
> I've created a rule meant to do this:
> https://gist.github.com/tzolov/a426d10a55fdd712a67830b0154b33db   On match
> it creates an expression that matches the one above:
>
> "Expanded RelNode:
> JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> category_id=[$4], currency_id=[$5], amount=[$6])
>   JdbcFilter(condition=[=($2, $7)])
>     JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
> time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
> last_version_number=[MAX($2) OVER (PARTITION BY $1 ROWS BETWEEN UNBOUNDED
> PRECEDING AND UNBOUNDED FOLLOWING)])
>       JdbcTableScan(table=[[foodmart, expense_fact]])"
>
> But after the Planner completes the optimization the final output is this
> (e.g. the initial query):
>
> "PLAN=JdbcToEnumerableConverter
>   JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
> category_id=[$4], currency_id=[$5])
>     JdbcTableScan(table=[[foodmart, expense_fact]])
>
> 1. What am i missing? Are the field/type references copied correctly?
>
> 2. Also what is the right approach to prevent this Rule get in loop?
>
> I'm a bit stuck so any ideas and suggestions appreciated!
>
> Cheers,
> Christian
>
> --
> Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
> EMEA Practice Team | Pivotal <http://pivotal.io/>
> [email protected]|+31610285517

Re: Rule to expand tha JdbcTableScan expression

Reply via email to