Did you consider Avatica? Identical goals, it works already, and there are clients in several languages.
Julian

On Wed, Jul 20, 2016 at 10:35 AM, Chunhui Shi <[email protected]> wrote:

> Cool. And we know of many 'lightweight' APIs that soon became the
> mainstream APIs.
>
> On Tue, Jul 19, 2016 at 10:56 PM, Paul Rogers <[email protected]> wrote:
>
>> Hi All,
>>
>> As I’ve been playing with and learning about Drill, it struck me that
>> Drill is a wonderful “industrial strength” query engine, but that the
>> client API is a bit complex if all an app wants to do is execute a few
>> queries. I wondered if we need an adapter between the full-blown
>> columnar, asynchronous RPC that Drill uses internally, and the row-based,
>> synchronous API that most apps know and love.
>>
>> In thinking about a simpler client API, a few items came to mind:
>>
>> - We have the JDBC API for Java apps, but the internals of the current
>> JDBC use the Drill client and so the JDBC jar is quite big (20MB).
>>
>> - The current client API is not versioned, requiring clients to be
>> upgraded in lock-step with servers. Many admins, however, find it necessary
>> to upgrade clients on a schedule different from that of the server.
>> (Imagine upgrading dozens of desktop users at the same time as the Drill
>> cluster.) Many of the traditional DB products version their interfaces to
>> simplify this task.
>>
>> - A cool feature of Drill is schema-on-read, which means Drill may
>> encounter different schemas as data is read. At present, it is a bit hard
>> for clients to consume different schemas. It turns out, however, that
>> stored procedures provide something similar (multiple result sets) that we
>> could leverage to make schema changes into a first-class feature
>> of the API.
>>
>> Playing around a bit in my spare time, I found that we can grab lots of
>> ideas from “traditional” DB APIs to solve the above problems (and more):
>>
>> - A simplified client API provides a row-based view of results, with
>> schema changes as a first-class API concept.
>> - A “direct” version of the client can sit directly on top of the Drill
>> Client, much like the current JDBC driver.
>> - Because the client API is simple, it is easy to create a new wire
>> protocol to carry the required row-based client messages.
>> - That wire protocol enables a very light-weight remote version of the
>> client API.
>> - A new server implements the server-side of the new wire protocol. The
>> server is an adapter: it converts the “retail” row-based API into the
>> “wholesale” columnar API of Drill.
>> - A new JDBC implementation uses the remote API instead of directly using
>> the Drill Client API.
>>
>> Because the remote client has no dependencies on Drill (or, indeed,
>> anything other than the JDK), it is very small. Indeed, the revised JDBC
>> jar is about 1% of the size of the existing JDBC driver. (200KB instead of
>> 20MB.)
>>
>> The result is a little prototype project called “Jig”. I’d like to toss it
>> out to the community to see if this is something of interest to others. The
>> code works just well enough to prove the concept, though I’ve left off the
>> more “advanced” data types, multiple cursors per connection, and other
>> details.
>>
>> The advantage for Java users is a simpler API, smaller JDBC driver, fewer
>> dependencies and cross-version compatibility.
>>
>> If we add clients in other languages, then just about any language can
>> easily query Drill without a Java or ODBC bridge. This would be handy for
>> that Caravel integration project discussed here a month or so back. Also
>> for data scientists who prefer Python or R.
>>
>> In case there is interest in this idea, a more detailed proposal is
>> available:
>> https://docs.google.com/document/d/1TpJOEUO-DBDGIidOML2_InpJ-fK4yHmsbV5ncqXT6pM
>>
>> The code is in a GitHub repo: https://github.com/paul-rogers/drill-jig
>>
>> The JIRA for this enhancement: DRILL-4791:
>> https://issues.apache.org/jira/browse/DRILL-4791
>>
>> This has been a great little learning exercise. Is this something that
>> we might want to take further? Thoughts on the approach taken?
>>
>> Thanks,
>>
>> - Paul
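
For readers following the thread, the adapter idea in Paul's message (turning columnar batches into a row-based view, with a schema change surfacing as a new result set) could be sketched roughly as below. This is only an illustration under assumptions, not code from Jig or from Drill: `ColumnBatch` and `result_sets` are hypothetical names, and a real batch would carry typed vectors rather than plain Python lists.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class ColumnBatch:
    """Hypothetical columnar batch: one list of values per column name."""
    columns: Dict[str, list]

    @property
    def schema(self) -> Tuple[str, ...]:
        # Column names in insertion order stand in for a real schema.
        return tuple(self.columns)

    def rows(self) -> Iterator[tuple]:
        # zip(*columns) pivots the column lists into row tuples.
        return zip(*self.columns.values())


def result_sets(
    batches: Iterable[ColumnBatch],
) -> Iterator[Tuple[Tuple[str, ...], List[tuple]]]:
    """Group consecutive same-schema batches into one row-based result set.

    A schema change closes the current result set and opens a new one,
    mimicking the "multiple result sets" idea from the message above.
    """
    current_schema = None
    current_rows: List[tuple] = []
    for batch in batches:
        if current_schema is not None and batch.schema != current_schema:
            yield current_schema, current_rows
            current_rows = []
        current_schema = batch.schema
        current_rows.extend(batch.rows())
    if current_schema is not None:
        yield current_schema, current_rows


# Made-up sample data: the third batch adds a "city" column.
batches = [
    ColumnBatch({"id": [1, 2], "name": ["a", "b"]}),
    ColumnBatch({"id": [3], "name": ["c"]}),
    ColumnBatch({"id": [4], "name": ["d"], "city": ["x"]}),
]

for schema, rows in result_sets(batches):
    print(schema, rows)
```

With the sample data, the first two batches share a schema and are merged into one result set, while the third batch starts a second one. A client in any language would then only need to iterate rows and react to a "next result set" event, which is the part a thin wire protocol would have to carry.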
