Hey Matthew,

Welcome to Drill. It's great to have you volunteering. Thanks.
Which parts would you be interested in? Would you like to explore the
JsonRecordReader and the HBase/Mongo plugins? These components are in a good
state and there is a lot to learn from them. I would love to work with you on
any issues or doubts you have, and we can continue on to new development.

Regards

On Fri, Feb 27, 2015 at 10:25 PM, Matthew Burgess <[email protected]> wrote:

> I am interested in participating w.r.t. JDBC topics, both the storage
> plugin and Drill's JDBC driver. Unfortunately I haven't had much time to
> contribute, but if there are smaller tasks I could take, I would definitely
> be interested.
>
> Regards,
> Matt
>
> From: Yash Sharma <[email protected]>
> Reply-To: <[email protected]>
> Date: Friday, February 27, 2015 at 11:45 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: drill support mysql?
>
> Hi All/Linfeng,
> I would like to explore the options here and am reopening the thread for
> more information on the work done so far and to get more fellow
> contributors interested in the same.
>
> Thanks,
> Yash
>
> On Fri, Jan 16, 2015 at 12:57 AM, Jason Altekruse <[email protected]>
> wrote:
>
> > That's what we were thinking. This would mean the only work that needs
> > to be done is to implement a mapping between Enumerable and value
> > vectors for getting the data into Drill.
> >
> > Linfeng, does this sound like something you would be interested in
> > trying to contribute to Drill?
> >
> > -Jason
> >
> > On Thu, Jan 15, 2015 at 10:59 AM, Julian Hyde <[email protected]>
> > wrote:
> >
> > > Calcite has a JDBC adapter, so it can convert parts of the query plan
> > > to SQL and execute it in the JDBC source. Generally you want to push
> > > down as much as possible. It works with MySQL and pretty much any JDBC
> > > data source.
> > >
> > > Drill uses Calcite internally, so if Drill exposed Calcite adapters as
> > > data sources this problem would be solved.
> > >
> > > Julian
> > >
> > > On Jan 15, 2015, at 10:30 AM, Jason Altekruse <[email protected]>
> > > wrote:
> > >
> > > > Hello Linfeng!
> > > >
> > > > Welcome to the Drill community!
> > > >
> > > > Currently Drill does not support querying traditional databases like
> > > > MySQL, but it is a feature we have discussed adding for some time. If
> > > > you are interested in trying to add your own, you can start by taking
> > > > a look at some of the existing storage plugins. For a basic overview
> > > > of the process of populating the in-memory data structure for records
> > > > in Drill, I would recommend reading through the JSON reader
> > > > implementation. While the in-memory format for Drill is columnar, to
> > > > allow for various execution optimizations, a simple row-by-row
> > > > interface is provided for writing into the data structure.
> > > >
> > > > https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java
> > > >
> > > > Another important aspect of creating a storage plugin is leveraging
> > > > the underlying system's capabilities to avoid reading an entire
> > > > dataset for each query. We have a system in place for pushing down
> > > > filters and selected columns to the underlying storage layer. This is
> > > > not necessary to be able to run queries in Drill, as you can simply
> > > > read all of the data out of the storage system and let Drill filter it
> > > > as necessary, but for many workloads (just about anything but a
> > > > select *) this will obviously be sub-optimal. For an example of some
> > > > of this filter rewriting you can see the work done on the Mongo
> > > > plugin.
> > > >
> > > > https://github.com/apache/drill/tree/master/contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo
> > > >
> > > > An important thing to note here is that the filters have to be
> > > > rewritten from Drill convention to Mongo convention. In the case of a
> > > > MySQL plugin, the storage system really could perform any SQL
> > > > operations that require only the tables within your MySQL instance,
> > > > and likely further optimize those subsets of the queries. To implement
> > > > this functionality, we could go far beyond what we have implemented
> > > > with other storage plugins in terms of pushdown, which requires more
> > > > complex rewriting. Thankfully, Calcite, a dependency that Drill
> > > > already leverages for its planning, has the capabilities to do these
> > > > types of more complex pushdowns already. It uses an Enumerable
> > > > interface (a fancy iterator) to expose data from one of its storage
> > > > engines, such as a relational database. Calcite includes a simpler
> > > > single-node execution engine that can actually act much like Drill for
> > > > smaller workloads. Likely the best first step for getting MySQL
> > > > working would be to write a Drill storage plugin that takes an
> > > > Enumerable as input; then we should be able to leverage all of
> > > > Calcite's existing functionality for doing the query rewrites and
> > > > pushdowns relatively easily.
> > > >
> > > > -Jason Altekruse
> > > >
> > > > On Wed, Jan 7, 2015 at 7:39 PM, [email protected] <[email protected]>
> > > > wrote:
> > > >
> > > > > Hello:
> > > > > Does Drill currently support MySQL?
> > > > > If I want to develop my own plugin for connecting to MySQL, what
> > > > > should I do?
> > > > >
> > > > > [email protected]
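
To make the "mapping between Enumerable and value vectors" discussed in this thread a bit more concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions and does not use Drill or Calcite classes: ColumnBatch, writeRow, and fromRows are hypothetical names, plain Lists stand in for Drill's value vectors, and an Iterable<Object[]> stands in for a Calcite Enumerable (which, as far as I know, can also be iterated as an Iterable). The idea it shows is the one described above: rows arrive one at a time, and each value is appended to the buffer for its column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch: transposes row-oriented input into per-column buffers.
 * ColumnBatch is an illustrative name, not a Drill or Calcite class.
 */
final class ColumnBatch {
    final List<String> columnNames;
    // One list per column; each list stands in for a Drill value vector.
    final List<List<Object>> columns;

    ColumnBatch(List<String> columnNames) {
        this.columnNames = columnNames;
        this.columns = new ArrayList<>();
        for (int i = 0; i < columnNames.size(); i++) {
            columns.add(new ArrayList<>());
        }
    }

    /** Row-by-row write interface: appends one value to every column. */
    void writeRow(Object[] row) {
        if (row.length != columns.size()) {
            throw new IllegalArgumentException(
                "expected " + columns.size() + " values but got " + row.length);
        }
        for (int c = 0; c < row.length; c++) {
            columns.get(c).add(row[c]);
        }
    }

    /** Number of rows written so far. */
    int rowCount() {
        return columns.isEmpty() ? 0 : columns.get(0).size();
    }

    /** Drains any Iterable of rows (the stand-in for an Enumerable) into a batch. */
    static ColumnBatch fromRows(List<String> columnNames, Iterable<Object[]> rows) {
        ColumnBatch batch = new ColumnBatch(columnNames);
        for (Object[] row : rows) {
            batch.writeRow(row);
        }
        return batch;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[] {1, "alice"},
            new Object[] {2, "bob"});
        ColumnBatch batch = fromRows(Arrays.asList("id", "name"), rows);
        System.out.println(batch.columns.get(0)); // prints [1, 2]
        System.out.println(batch.columns.get(1)); // prints [alice, bob]
    }
}
```

A real plugin would presumably write typed values into Drill's own vectors and emit batches of bounded size rather than buffering everything in Lists; the sketch leaves both out to keep the row-to-column transposition visible.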
