[
https://issues.apache.org/jira/browse/CALCITE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458194#comment-16458194
]
Julian Hyde commented on CALCITE-2280:
--------------------------------------
Thanks for the feedback, [~eee].
* I've been interested in OData for a while. A nice integration between
Calcite and Becquerel sounds very desirable.
* I hear about data lineage a lot. I'm never sure exactly what API or data model we
should provide for lineage. Can you write a couple of lines of test-case code
that shows how you would like lineage to be returned?
* Ah, DDL. I had hoped we could avoid DDL initially, because it varies so much
between engines. But if people want it, we can do it.
A list of sample queries in various dialects would be great. Format doesn't
matter - it could even be a text file. We need the SQL text, the dialect(s)
that support it, and optionally a comment about what it is trying to test.
Preferably queries that exercise each dialect's idiosyncrasies. For example,
from line 1193 of [QueryParser's
Test.hs|https://github.com/uber/queryparser/blob/b1923a9cdf71598211eb8f4742f2aefa95e997f0/test/Database/Sql/Util/Scope/Test.hs#L1193]
I would extract the 4-tuple:
* comment: Hive semi join
* sql: "SELECT * FROM foo LEFT SEMI JOIN bar ON (foo.a = bar.a) WHERE bar.b is
not null"
* dialects: hive
* schema: fooBar
We could use as many of these as you can produce with minimal effort on your
part.
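To make the format concrete, here is a minimal sketch in Python of how such a
text file might be read. Everything in it is an assumption for illustration:
the "key: value" layout with blank lines between records, the SampleQuery
class, and the parse_samples function are hypothetical and do not exist in
Calcite or QueryParser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SampleQuery:
    """One sample query: the 4-tuple described above (hypothetical type)."""
    sql: str                        # the SQL text (required)
    dialects: List[str]             # dialect(s) that support the query (required)
    comment: Optional[str] = None   # what the query is trying to test (optional)
    schema: Optional[str] = None    # schema the query refers to


def parse_samples(text: str) -> List[SampleQuery]:
    """Parse an assumed format: 'key: value' lines, one record per
    blank-line-separated block. Each record occupies one line per key."""
    samples = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Split on the first colon only, so colons inside the SQL survive.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        samples.append(SampleQuery(
            sql=fields["sql"],
            dialects=[d.strip() for d in fields["dialects"].split(",")],
            comment=fields.get("comment"),
            schema=fields.get("schema"),
        ))
    return samples


SAMPLE_TEXT = """\
comment: Hive semi join
sql: SELECT * FROM foo LEFT SEMI JOIN bar ON (foo.a = bar.a) WHERE bar.b is not null
dialects: hive
schema: fooBar
"""

samples = parse_samples(SAMPLE_TEXT)
```

A query supported by several dialects would list them comma-separated, for
example "dialects: hive, mysql". Any other layout that carries the same four
fields would work just as well.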
> "Super-liberal" parser that accepts all SQL dialects
> ----------------------------------------------------
>
> Key: CALCITE-2280
> URL: https://issues.apache.org/jira/browse/CALCITE-2280
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
> Priority: Major
>
> Create a parser that accepts all SQL dialects.
> It would accept common dialects such as Oracle, MySQL, PostgreSQL, BigQuery.
> If you have preferred dialects, please let us know in the comments section.
> (If you're willing to work on a particular dialect, even better!)
> We would do this in a new module, inheriting and extending the parser in the
> same way that the DDL parser in the "server" module does.
> This would be a messy and difficult project, because we would have to comply
> with the rules of each parser (and its set of built-in functions) rather than
> writing the rules as we would like them to be. That's why I would keep it out
> of the core parser. But it would also have large benefits.
> This would be new territory for Calcite: as a tool for manipulating/understanding
> SQL, not (necessarily) for relational algebra or execution.
> Some possible uses:
> * analyze query lineage (what tables and columns are used in a query);
> * translate from one SQL dialect to another (using the JDBC adapter to
> generate SQL in the target dialect);
> * a "deep" compatibility mode (much more comprehensive than the current
> compatibility mode) where Calcite could pretend to be, say, Oracle;
> * SQL parser as a service: a REST call gives a SQL query, and returns a JSON
> or XML document with the parse tree.
> If you can think of interesting uses, please discuss in the comments.
> There are similarities with Uber's
> [QueryParser|https://eng.uber.com/queryparser/] tool. Maybe we can
> collaborate, or make use of their test cases.
> We will need a lot of sample queries. If you are able to contribute sample
> queries for particular dialects, please discuss in the comments section. It
> would be good if the sample queries are based on a familiar schema (e.g.
> scott or foodmart) but we can be flexible about this.