Hello all,

I wanted to continue the discussion around storage plugins. Firstly, I'm really glad that Paul has developed the Base Storage plugin PR, as that can greatly simplify one of the most complex issues around storage plugins: filter pushdowns.

With that said, I've been working on a storage plugin for Cassandra and have been looking into the Calcite adapters (https://calcite.apache.org/docs/adapter.html). These schema adapters seem to do three things:

1. Discover and map the schema of a data source
2. Handle all the creation of Calcite rules (Filter, Projection, Aggregation, Limit)
3. Parse the query results

Of these, I don't think we can use the results parser, but the other two parts could potentially be very useful to Drill.

Anyway, here's what I'm wondering. Would it be possible to:

1. Create some sort of wrapper for the Calcite adapters. I believe for the most part they all have the same classes, but the implementations differ depending on the data source.
2. Write a sort of generic storage plugin that accepts a Calcite adapter (CalciteAdapterStoragePlugin).

Once these steps are done, in theory it would be relatively easy to extend the CalciteAdapterStoragePlugin for every data source for which there is an adapter. This list is really long and includes many data stores that we don't currently support, including:

- Cassandra
- Elasticsearch
- Druid
- Geode
- Solr
- Spark
- Splunk

I'm not suggesting this as a replacement for the Base Storage Plugin. My thought is that we can use the work that the Calcite community has done to radically improve Drill's versatility. I looked at the code for Elasticsearch, and the query planning and pushdowns are really well done. It supports every version of ES from 2 to current. The downside is that the documentation is, how shall we say, lacking.

I'd be happy to work on this, but would definitely need some assistance.

Thoughts?

-- C
