nevi-me commented on issue #1323:
URL: https://github.com/apache/arrow-rs/issues/1323#issuecomment-1042111353
@e-dard there was a discussion a while ago about RDBMS query passthrough in
Datafusion. I tried it out using @alamb's excellent guide with the TopK node.
My approach was to create a custom optimizer that understands the source of
`LogicalPlan::TableScan`'s source (e.g. if we've registered a Postgres schema
in the catalog).
That way, if a plan includes a join of 2 tables from the same DB, they could
be rewritten into a SQL join query. Or when we have something like:
```sql
select sum(a), avg(b), max(c), d from postgres.my_table
where <filter expressions>
group by d
```
For the above query, I've previously been able to convert the logical plan:
```
|_ Aggregate (sum(a), avg(b), min(c)) by d
|_ Filter (<filter expr>)
|_ Project (a, b, c, d)
|_ TableScan (select * from postgres.my_table)
```
Into the below by implementing `TableScan` and converting filter-pushdown
predicates to SQL expressions
```
|_ Aggregate (sum(a), avg(b), min(c)) by d
|_ TableScan (select a, b, c, d from postgres.my_table where <filter
expr>)
```
With aggregate, join, union etc pushdown missing.
Through trying to implement custom nodes & planners for RDBMS, I've also
found that it's quite possible to translate much of the `LogicalPlan` types
into a SQL query, though the code I was generating by hand would get very
nested, and could benefit from some further optimization ala Apache Calcite.
I appreciate that I'm very inactive of late, but I'm just mentioning it out
of interest, because I also thought of Flight SQL being the standard interface
to connect to various DBs when I saw the PR a few weeks ago. The alternative of
supporting N interfaces can get daunting. `connector-x` would help with that to
some extent, but I would prefer if DB implementations end up supporting Flight
SQL natively.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]