[ 
https://issues.apache.org/jira/browse/CALCITE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458194#comment-16458194
 ] 

Julian Hyde commented on CALCITE-2280:
--------------------------------------

Thanks for the feedback, [~eee].
 * I've been interested in OData for a while. A nice integration between 
Calcite and Becquerel sounds very desirable.
 * I hear about data lineage a lot. I'm never sure exactly what API or data model we 
should provide for lineage. Can you write a couple of lines of test-case code 
that shows how you would like lineage to be returned?
 * Ah, DDL. I had hoped we could avoid DDL initially, because it varies so much 
between engines. But if people want it, we can do it.

A list of sample queries in various dialects would be great. Format doesn't 
matter - it could even be a text file. We need the SQL text, the dialect(s) 
that support it, and optionally a comment about what it is trying to test. 
Preferably queries that exercise each dialect's idiosyncrasies. For example, 
from line 1193 of [QueryParser's 
Test.hs|https://github.com/uber/queryparser/blob/b1923a9cdf71598211eb8f4742f2aefa95e997f0/test/Database/Sql/Util/Scope/Test.hs#L1193]
 I would extract the 4-tuple:
 * comment: Hive semi join
 * sql: "SELECT * FROM foo LEFT SEMI JOIN bar ON (foo.a = bar.a) WHERE bar.b is 
not null"
 * dialects: hive
 * schema: fooBar

We could use as many of these as you can produce with minimal effort on your 
part.
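
To make the shape of that 4-tuple concrete, here is a minimal sketch of how one sample query might be held in Java. The class name `SampleQuery`, its fields, and the `supports` helper are hypothetical names chosen for illustration; they are not part of Calcite or QueryParser, and since the format doesn't matter, a plain text file would serve just as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical holder for one sample query (illustration only, not a Calcite API). */
public class SampleQuery {
  final String comment;        // what the query is trying to test (optional)
  final String sql;            // the SQL text
  final List<String> dialects; // dialects that accept the query, lower case
  final String schema;         // schema the query runs against

  SampleQuery(String comment, String sql, List<String> dialects, String schema) {
    this.comment = comment;
    this.sql = sql;
    this.dialects = Collections.unmodifiableList(new ArrayList<>(dialects));
    this.schema = schema;
  }

  /** Returns whether the query is listed as valid in the given dialect. */
  boolean supports(String dialect) {
    return dialects.contains(dialect.toLowerCase(Locale.ROOT));
  }

  /** The Hive semi-join example from the comment above. */
  static SampleQuery hiveSemiJoin() {
    return new SampleQuery(
        "Hive semi join",
        "SELECT * FROM foo LEFT SEMI JOIN bar ON (foo.a = bar.a) "
            + "WHERE bar.b is not null",
        Arrays.asList("hive"),
        "fooBar");
  }

  public static void main(String[] args) {
    SampleQuery q = hiveSemiJoin();
    System.out.println(q.comment + " [" + String.join(",", q.dialects)
        + "] on schema " + q.schema);
    System.out.println(q.sql);
  }
}
```

A parser test could then loop over a list of such entries and check that each query parses in every dialect it names, assuming a per-dialect parser exists.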

> "Super-liberal" parser that accepts all SQL dialects
> ----------------------------------------------------
>
>                 Key: CALCITE-2280
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2280
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>            Priority: Major
>
> Create a parser that accepts all SQL dialects.
> It would accept common dialects such as Oracle, MySQL, PostgreSQL, BigQuery. 
> If you have preferred dialects, please let us know in the comments section. 
> (If you're willing to work on a particular dialect, even better!)
> We would do this in a new module, inheriting and extending the parser in the 
> same way that the DDL parser in the "server" module does.
> This would be a messy and difficult project, because we would have to comply 
> with the rules of each parser (and its set of built-in functions) rather than 
> writing the rules as we would like them to be. That's why I would keep it out 
> of the core parser. But it would also have large benefits.
> This would be new territory for Calcite: as a tool for manipulating/understanding 
> SQL, not (necessarily) for relational algebra or execution.
> Some possible uses:
> * analyze query lineage (what tables and columns are used in a query);
> * translate from one SQL dialect to another (using the JDBC adapter to 
> generate SQL in the target dialect);
> * a "deep" compatibility mode (much more comprehensive than the current 
> compatibility mode) where Calcite could pretend to be, say, Oracle;
> * SQL parser as a service: a REST call gives a SQL query, and returns a JSON 
> or XML document with the parse tree.
> If you can think of interesting uses, please discuss in the comments.
> There are similarities with Uber's 
> [QueryParser|https://eng.uber.com/queryparser/] tool. Maybe we can 
> collaborate, or make use of their test cases.
> We will need a lot of sample queries. If you are able to contribute sample 
> queries for particular dialects, please discuss in the comments section. It 
> would be good if the sample queries are based on a familiar schema (e.g. 
> scott or foodmart) but we can be flexible about this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
