We're working on a JDBC wrapper for HAWQ (SQL-on-HADOOP) that hopefully
would allow us "emulate" UPDATE and DELETE operations while using only
append-only (e.g. INSERT) in the background.
Using the JDBC adapter i've been trying to convert an input sql query like
this:
SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
amount FROM foodmart.expense_fact
Into one that looks like this:
SELECT store_id, account_id, exp_date, time_id, category_id, currency_id,
amount
FROM (
SELECT store_id, account_id, exp_date, time_id, category_id,
currency_id, amount
FROM (
SELECT *, MAX(exp_date) OVER (PARTITION BY account_id) AS
last_version_number
FROM foodmart.expense_fact
) as link_last
WHERE exp_date = last_version_number
) as current_version;
If you run the second query directly the output relation is:
"PLAN=JdbcToEnumerableConverter
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5], amount=[$6])
JdbcFilter(condition=[=($2, $7)])
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
last_version_number=[MAX($2) OVER (PARTITION BY $1 RANGE
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
JdbcTableScan(table=[[foodmart, expense_fact]])
I've created a rule meant to do this:
https://gist.github.com/tzolov/a426d10a55fdd712a67830b0154b33db On match
it creates an expression that matches the one above:
"Expanded RelNode:
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5], amount=[$6])
JdbcFilter(condition=[=($2, $7)])
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2],
time_id=[$3], category_id=[$4], currency_id=[$5], amount=[$6],
last_version_number=[MAX($2) OVER (PARTITION BY $1 ROWS BETWEEN UNBOUNDED
PRECEDING AND UNBOUNDED FOLLOWING)])
JdbcTableScan(table=[[foodmart, expense_fact]])"
But after the Planner completes the optimization the final output is this
(e.g. the initial query):
"PLAN=JdbcToEnumerableConverter
JdbcProject(store_id=[$0], account_id=[$1], exp_date=[$2], time_id=[$3],
category_id=[$4], currency_id=[$5])
JdbcTableScan(table=[[foodmart, expense_fact]])
1. What am i missing? Are the field/type references copied correctly?
2. Also what is the right approach to prevent this Rule get in loop?
I'm a bit stuck so any ideas and suggestions appreciated!
Cheers,
Christian
--
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Solution Architect,
EMEA Practice Team | Pivotal <http://pivotal.io/>
[email protected]|+31610285517