Rule to expand tha JdbcTableScan expression

Christian Tzolov Tue, 29 Nov 2016 14:08:22 -0800

We're working on a JDBC wrapper for HAWQ (SQL-on-HADOOP) that hopefully
would allow us "emulate" UPDATE and DELETE operations while using only
append-only (e.g. INSERT) in the background.


Using the JDBC adapter i've been trying to convert an input sql query like
this:

SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
amount  FROM foodmart.expense_fact

Into one that looks like this:

SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
amount
FROM (
    SELECT store_id, account_id, exp_date, time_id, category_id,
currency_id, amount
    FROM (
        SELECT *, MAX(exp_date) OVER (PARTITION BY account_id) AS
last_version_number
        FROM foodmart.expense_fact
     ) as link_last
    WHERE exp_date = last_version_number
) as current_version;

If you run the second query directly the output relation is:

"PLAN=JdbcToEnumerableConverter
  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5], amount=[$6])
    JdbcFilter(condition=[=($2, $7)])
      JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
                  last_version_number=[MAX($2) OVER (PARTITION BY $1 RANGE
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
        JdbcTableScan(table=[[foodmart, expense_fact]])


I've created a rule meant to do this:
https://gist.github.com/tzolov/a426d10a55fdd712a67830b0154b33db   On match
it creates an expression that matches the one above:

"Expanded RelNode:
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5], amount=[$6])
  JdbcFilter(condition=[=($2, $7)])
    JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
last_version_number=[MAX($2) OVER (PARTITION BY $1 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING)])
      JdbcTableScan(table=[[foodmart, expense_fact]])"

But after the Planner completes the optimization the final output is this
(e.g. the initial query):

"PLAN=JdbcToEnumerableConverter
  JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5])
    JdbcTableScan(table=[[foodmart, expense_fact]])

1. What am i missing? Are the field/type references copied correctly?

2. Also what is the right approach to prevent this Rule get in loop?

I'm a bit stuck so any ideas and suggestions appreciated!

Cheers,
Christian

-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
[email protected]|+31610285517

Rule to expand tha JdbcTableScan expression

Reply via email to