Hello all, 
I wanted to continue the discussion around storage plugins.  Firstly, I'm 
really glad that Paul has developed the Base Storage plugin PR as that can 
greatly simplify one of the most complex issues around storage plugins: filter 
pushdowns.

With that said, I've been working on a storage plugin for Cassandra and have 
been looking into the Calcite adapters 
(https://calcite.apache.org/docs/adapter.html).  These schema adapters seem to 
do three things:
1.  Discover and map the schema of a data source
2.  Create the Calcite planner rules (Filter, Projection, Aggregation, 
Limit)
3.  Parse the query results.

Of these, I don't think we can use the results parser, but the other two parts 
could potentially be very useful to Drill.  Anyway, here's what I'm wondering.  
Would it be possible to:
1.  Create some sort of wrapper for the Calcite adapters?  I believe that, for 
the most part, they all share the same set of classes, with implementations 
that differ depending on the data source.
2.  Write a sort of generic storage plugin that accepts a Calcite adapter 
(CalciteAdapterStoragePlugin)?
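
To make the idea a bit more concrete, here is a rough Java sketch of the two 
steps above.  Everything in it is hypothetical: AdapterWrapper, 
DemoCassandraAdapter and the method names are stand-ins I made up for 
illustration, not real Calcite or Drill APIs, and the schema data is invented.  
It only shows the shape of the delegation, i.e. a generic plugin that gets its 
schema and pushdown rules from whichever adapter it is handed.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical wrapper around a Calcite adapter (step 1).
// NOT a real Calcite or Drill interface; it only models the two adapter
// capabilities discussed above: schema discovery and rule creation.
interface AdapterWrapper {
    // Table name -> (column name -> type name).
    Map<String, Map<String, String>> discoverSchema();

    // Names of the pushdown rules the adapter contributes.
    List<String> pushdownRules();
}

// Hypothetical generic storage plugin (step 2). It holds no data-source
// logic of its own and delegates everything to the adapter it was given.
final class CalciteAdapterStoragePlugin {
    private final String name;
    private final AdapterWrapper adapter;

    CalciteAdapterStoragePlugin(String name, AdapterWrapper adapter) {
        this.name = name;
        this.adapter = adapter;
    }

    String name() {
        return name;
    }

    Set<String> tableNames() {
        return adapter.discoverSchema().keySet();
    }

    Map<String, String> columnsOf(String table) {
        return adapter.discoverSchema().get(table);
    }

    List<String> optimizerRules() {
        return adapter.pushdownRules();
    }
}

// Toy adapter with invented schema data, standing in for a wrapped
// Cassandra adapter.
final class DemoCassandraAdapter implements AdapterWrapper {
    @Override
    public Map<String, Map<String, String>> discoverSchema() {
        return Map.of("users", Map.of("id", "INT", "name", "VARCHAR"));
    }

    @Override
    public List<String> pushdownRules() {
        return List.of("Filter", "Projection", "Aggregation", "Limit");
    }
}

class Demo {
    public static void main(String[] args) {
        CalciteAdapterStoragePlugin plugin =
            new CalciteAdapterStoragePlugin("cassandra", new DemoCassandraAdapter());
        System.out.println(plugin.name() + " tables: " + plugin.tableNames());
        System.out.println("rules: " + plugin.optimizerRules());
    }
}
```

With this shape, supporting another data source would mostly mean supplying 
a different AdapterWrapper implementation, while the plugin class stays the 
same.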

Once these steps are done, it should in theory be relatively easy to extend 
the CalciteAdapterStoragePlugin for every data source for which there is an 
adapter.  The list of adapters is really long and includes many data stores 
that we don't currently support, including:
Cassandra
Elasticsearch
Druid
Geode
Solr
Spark 
Splunk 

I'm not suggesting this as a replacement for the Base Storage Plugin.  My 
thought is that we can use the work that the Calcite community has done to 
radically improve Drill's versatility.  I looked at the code for the 
Elasticsearch adapter, and the query planning and pushdowns are really well 
done.  It supports every version of ES from 2 to current.

The downside is that the documentation is, how shall we say, lacking.  I'd be 
happy to work on this, but would definitely need some assistance. 
Thoughts?
-- C


