Hey all,

As you may have seen from the long thread that evolved from Neeraja bringing
Caravel to the Drill community, there has been a lot of back and forth on
the subject.  I thought I'd give a little update, as well as a call to arms,
so to speak, for anyone who wants to play with Drill/Caravel and find
issues.

First of all, mostly due to the hard work of PythonicNinja (Wojtek Nowak)
there is a working demo of Drill being used as the backend for Caravel
visualizations.  This to me is a great sign that the work being done is not
only worth it, but something that could really be a benefit to the Drill
community. Thanks again Wojtek for your great work.

*Current Status*

We have two repos we are maintaining. One is a "dev" environment that
includes a Dockerfile to show how to get things set up and working with a
number of components: Caravel, PyODBC, UnixODBC, MapR ODBC Connector, etc.
This is:

https://github.com/JohnOmernik/caraveldrill

This basically has everything you may need to connect to Drill.  It does
NOT include Drill; it assumes you have a Drill instance to connect to.
This allows you to run Caravel with the test data, and includes a test
Python script for making a connection to Drill. If the script succeeds in
showing schemas, you can then see in the output the exact connection
string you'd need for Caravel. (Thanks to Chris Matta from MapR for sharing
his repo using PyODBC; I used that quite a bit in proving this out.  Source
Repo: https://github.com/cjmatta/DrillPandasReddit)
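
For anyone who wants a feel for what such a connection test involves before
pulling the repo, here is a minimal sketch. It is not the script from the
repo. The function names are made up for illustration, and the DSN name in
the usage note is a placeholder for whatever DSN your ODBC setup defines.
It assumes pyodbc and a working Drill ODBC driver are installed.

```python
def build_odbc_connection_string(dsn, user=None, password=None):
    """Assemble an ODBC connection string from a DSN and optional credentials.

    Hypothetical helper for illustration; the DSN must already be defined
    in your ODBC configuration.
    """
    parts = ["DSN=" + dsn]
    if user is not None:
        parts.append("UID=" + user)
    if password is not None:
        parts.append("PWD=" + password)
    return ";".join(parts)


def list_drill_schemas(connection_string):
    """Connect through pyodbc and return the schema names Drill reports."""
    import pyodbc  # imported lazily: only needed when actually connecting

    # Drill queries are not transactional, so autocommit is requested.
    connection = pyodbc.connect(connection_string, autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute("SHOW SCHEMAS")
        # The first column of each row is taken to be the schema name.
        return [row[0] for row in cursor.fetchall()]
    finally:
        connection.close()
```

With a DSN called drill64 (a placeholder), calling
list_drill_schemas(build_odbc_connection_string("drill64")) should return the
list of schemas if the ODBC side is wired up correctly.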

The other component is the actual SQLAlchemy dialect for Drill.  This is
in rougher shape, and it will need a crowd-sourced effort.  Basically, it
works. To install it, take a running container of the caraveldrill repo and
connect to it via docker exec -it %containerid% /bin/bash  Now, from
within the container, you can install the dialect.   Once you run the
install on the dialect, it will work and you can connect via Caravel!

This is sort of a hack: we started with an MS Access dialect and are slowly
removing the obviously Access-only parts (functions related to primary/foreign
keys, indexes, etc.) and adding/replacing parts that have to do with Drill.
It's a learning process for us.  If you think you can address the issues in
Drill, please submit PRs; this will be highly iterative and things will move
fast. The end result here should be a nice dialect for Drill that handles
all the Caravel functionality, and potentially could be used in other
projects.
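
To make the "removing Access-only parts" idea a bit more concrete, here is a
rough sketch of what neutral reflection hooks could look like for a backend
with no keys or indexes. This is not code from the dialect repo. The class
name is hypothetical, and the method names and return shapes follow
SQLAlchemy's dialect reflection interface as I understand it. In a real
dialect these methods would sit on the dialect class itself.

```python
class DrillReflectionMixin:
    """Hypothetical mixin: reflection hooks for a backend without keys or indexes."""

    def get_pk_constraint(self, connection, table_name, schema=None, **kw):
        # No primary keys to report, so return an empty constraint.
        return {"constrained_columns": [], "name": None}

    def get_foreign_keys(self, connection, table_name, schema=None, **kw):
        # No foreign keys to report.
        return []

    def get_indexes(self, connection, table_name, schema=None, **kw):
        # No indexes to report.
        return []
```

Returning empty results rather than raising lets tools that introspect
tables keep going when the backend has nothing to say about keys or indexes.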

The dialect Repo is located here:

https://github.com/JohnOmernik/sqlalchemy-drill/

Once again, thanks to all who have contributed thus far, especially
Wojtek, and I am looking forward to seeing this evolve!

John Omernik
