See answers inline.

> On Sep 11, 2017, at 2:09 PM, Muhammad Gelbana <[email protected]> wrote:
> 
> I'm investigating whether Calcite can help my company provide a SQL interface
> to its proprietary data engine.
> 
> While reading through the docs
> <https://calcite.apache.org/docs/tutorial.html>, in the *Current
> limitations* section, it says that Calcite doesn't support pushing down
> filtering, joins, or aggregations, but can push down only table scans. But on
> this page <https://calcite.apache.org/docs/>, it says "Calcite uses
> optimizer rules to push the JOIN and GROUP BY operations to the source
> database", so I guess the limitations paragraph is outdated?

In short, yes, the “current limitations” paragraph is inaccurate.

It is talking about the example CSV adapter that the tutorial is based upon. We 
keep that adapter very simple so that it is easy to learn from. Also, because 
it is based on a file format (CSV) rather than a database engine, it will never 
be possible to push down more complex operations such as aggregation and joins.

By the way, the file adapter is a more fully featured adapter for reading files, 
CSV or otherwise. Use it for real applications.
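For instance, the file adapter is typically wired up through a JSON model passed 
on the connect string ("jdbc:calcite:model=/path/to/model.json"). A minimal 
sketch, where the schema name and directory path are placeholders:

```json
{
  "version": "1.0",
  "defaultSchema": "FILES",
  "schemas": [
    {
      "name": "FILES",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.file.FileSchemaFactory",
      "operand": {
        "directory": "path/to/data"
      }
    }
  ]
}
```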

Other adapters in Calcite, such as the JDBC, Cassandra and Druid adapters, can 
push down many more relational operations.

> 
> The page also says that "Calcite intentionally stays out of the business of
> storing and processing data." But I understand that Calcite implements some
> SQL operators; doesn't this mean that Calcite processes data?

Calcite has simple implementations of the relational operators. They are based 
on iterators (the Enumerable interface) and therefore are single-threaded, 
non-distributed, pull-based, and use the heap for storing objects.
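To make "pull-based" concrete, here is a self-contained sketch of a filter 
operator in that style; the names are illustrative, not Calcite's actual 
Enumerable API. The parent pulls rows one at a time from its child:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Sketch of a pull-based filter operator, in the spirit of Calcite's
 * Enumerable convention (class and method names here are illustrative). */
public class PullFilterDemo {
  /** Wraps a child iterator and yields only rows matching the predicate. */
  static <T> Iterator<T> filter(Iterator<T> child, Predicate<T> predicate) {
    return new Iterator<T>() {
      T row;
      boolean hasRow;

      // Pull rows from the child until one passes the predicate.
      private void advance() {
        while (child.hasNext()) {
          T r = child.next();
          if (predicate.test(r)) {
            row = r;
            hasRow = true;
            return;
          }
        }
        hasRow = false;
      }

      { advance(); } // prime the first row

      @Override public boolean hasNext() { return hasRow; }

      @Override public T next() {
        if (!hasRow) throw new NoSuchElementException();
        T r = row;
        advance();
        return r;
      }
    };
  }

  public static void main(String[] args) {
    List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6);
    Iterator<Integer> it = filter(rows.iterator(), r -> r % 2 == 0);
    StringBuilder sb = new StringBuilder();
    while (it.hasNext()) sb.append(it.next()).append(' ');
    System.out.println(sb.toString().trim()); // prints "2 4 6"
  }
}
```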

Other databases have much, much better implementations of operators 
(multi-threaded, distributed, using operator scheduling, and using efficient 
representations of data such as Apache Arrow).

Calcite’s operators are sufficient to execute just about any SQL query on 
modest amounts of data. Some engines use them as a fallback for operations they 
cannot execute natively.

> 
> For the driver question, I went through the test cases for the CSV adapter
> and I found some tests getting a connection using:
> 
> Connection connection = DriverManager.getConnection("jdbc:csv:", info);
> 
> 
> But I couldn't figure out how the CSV adapter could register a driver for
> that URL (i.e. jdbc:csv:) and where in the test cases this driver is
> registered?

“jdbc:csv:” occurs in two tests but they are both annotated “@Ignore”, so they 
haven’t worked in a long time. 

You would need to override getConnectStringPrefix() in your driver to allow it 
to accept other URLs. (That method comes from Avatica's UnregisteredDriver, 
which Calcite's driver extends, not from java.sql.Driver.)
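For reference, here is a self-contained sketch (plain JDK, no Calcite classes) 
of how a JDBC driver claims a URL prefix such as "jdbc:csv:". DriverManager 
asks each registered driver acceptsURL() and routes the connect request to the 
first one that accepts; the class below is purely illustrative:

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

/** Illustrative driver that accepts connect strings starting "jdbc:csv:". */
public class CsvDriverSketch implements Driver {
  private static final String PREFIX = "jdbc:csv:";

  static {
    try {
      // Runs when the class is loaded, making the driver visible to
      // DriverManager.getConnection().
      DriverManager.registerDriver(new CsvDriverSketch());
    } catch (SQLException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  @Override public boolean acceptsURL(String url) {
    return url != null && url.startsWith(PREFIX);
  }

  @Override public Connection connect(String url, Properties info)
      throws SQLException {
    if (!acceptsURL(url)) {
      return null; // per the JDBC contract: not our URL, let another driver try
    }
    throw new SQLException("connection logic omitted in this sketch");
  }

  @Override public DriverPropertyInfo[] getPropertyInfo(String url,
      Properties info) {
    return new DriverPropertyInfo[0];
  }

  @Override public int getMajorVersion() { return 1; }
  @Override public int getMinorVersion() { return 0; }
  @Override public boolean jdbcCompliant() { return false; }
  @Override public Logger getParentLogger()
      throws SQLFeatureNotSupportedException {
    throw new SQLFeatureNotSupportedException();
  }

  public static void main(String[] args) {
    CsvDriverSketch d = new CsvDriverSketch();
    System.out.println(d.acceptsURL("jdbc:csv:") + " "
        + d.acceptsURL("jdbc:calcite:")); // prints "true false"
  }
}
```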

Julian
 
