Re: Looking for advice on integrating with a custom data source

2020-01-16 Thread Andy Grove
Hi Charles, I would like to be able to contribute something out of this effort. The PoC I am working on is quite fluid at the moment but one possible outcome is that this storage engine ends up supporting Arrow Flight, but I'm not sure yet. Andy. On Wed, Jan 15, 2020 at 7:19 AM Charles Givre

Re: Looking for advice on integrating with a custom data source

2020-01-15 Thread Charles Givre
Andy, Glad to hear you got it working!! Can you share what data source you are working with? Is it completely custom to your organization? If not, would you consider submitting this as a pull request? Best, -- C > On Jan 15, 2020, at 9:07 AM, Andy Grove wrote: > > And boom! With just 3

Re: Looking for advice on integrating with a custom data source

2020-01-15 Thread Andy Grove
And boom! With just 3 extra lines of code to adjust the CBO to make the row count inversely proportional to the number of predicates, my little PoC works :-) Now that I've achieved the instant gratification (relatively speaking!) of making something work, I think it's time to step back and start
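
A minimal sketch of the row-count idea described above, in plain Java with no Drill classes. The class name, method name, and the `1 + predicateCount` divisor are assumptions made for illustration (the divisor keeps the estimate defined when no predicates are pushed down); the thread does not show the actual three lines.

```java
// Illustrative only: hypothetical names, not Drill's planner API.
final class RowCountEstimate {
    private RowCountEstimate() {}

    /**
     * Returns a row-count estimate that shrinks as more predicates are
     * pushed into the scan, so a cost-based optimizer that compares row
     * counts prefers the scan carrying the filter.
     * Assumed formula: baseRowCount / (1 + predicateCount).
     */
    static double estimate(double baseRowCount, int predicateCount) {
        if (baseRowCount < 0 || predicateCount < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        return baseRowCount / (1 + predicateCount);
    }

    public static void main(String[] args) {
        // Print the estimate for 0..3 pushed-down predicates.
        for (int n = 0; n <= 3; n++) {
            System.out.println(n + " predicates -> " + estimate(1_000_000, n));
        }
    }
}
```

The point of the design is only that the plan with pushed-down predicates must look cheaper than the plan without them; any monotonically decreasing function of the predicate count would serve the same purpose.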

Re: Looking for advice on integrating with a custom data source

2020-01-14 Thread Paul Rogers
Hi Andy, Congratulations on making such fast progress! The code to do filter pushdowns is rather complex and, it seems, most plugins copy/paste the same wad of code (with the same bugs). PR 1914 provides a layer that converts the messy Drill logical plan into a nice, simple set of predicates.

Re: Looking for advice on integrating with a custom data source

2020-01-14 Thread Andy Grove
With some extra debugging I can see that the getNewWithChildren call is made to an earlier instance of GroupScan and not the instance created by the filter push-down rule. I'm wondering if this is some kind of hashCode/equals/toString/getDigest issue? On Tue, Jan 14, 2020 at 7:52 PM Andy Grove
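
The suspicion above can be illustrated with a toy model in plain Java. It assumes a planner that identifies nodes by a digest string and reuses the first node registered for each digest; `Scan`, `Registry`, and the digest format are hypothetical, and the thread does not confirm that this was the actual cause.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical classes, not Drill or Calcite types.
final class DigestDemo {
    private DigestDemo() {}

    /** A scan description whose digest may or may not include its predicates. */
    static final class Scan {
        final String table;
        final List<String> predicates;
        final boolean digestIncludesPredicates;

        Scan(String table, List<String> predicates, boolean digestIncludesPredicates) {
            this.table = table;
            this.predicates = predicates;
            this.digestIncludesPredicates = digestIncludesPredicates;
        }

        String digest() {
            return digestIncludesPredicates
                    ? "Scan(table=" + table + ", predicates=" + predicates + ")"
                    : "Scan(table=" + table + ")";
        }
    }

    /** Toy planner cache: keeps the first node seen for each digest. */
    static final class Registry {
        private final Map<String, Scan> byDigest = new HashMap<>();

        Scan register(Scan scan) {
            Scan existing = byDigest.putIfAbsent(scan.digest(), scan);
            return existing != null ? existing : scan;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Scan original = new Scan("t", List.of(), false);
        Scan filtered = new Scan("t", List.of("x > 1"), false);
        registry.register(original);
        // Same digest as the original, so the earlier instance is returned.
        System.out.println(registry.register(filtered) == original);
    }
}
```

In this model, the scan created by the push-down rule is silently replaced by the earlier one unless the pushed predicates are part of the digest.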

Re: Looking for advice on integrating with a custom data source

2020-01-14 Thread Andy Grove
I'm now working on predicate push down ... I have a filter rule that is correctly extracting the predicates that the backend database supports and I am creating a new GroupScan containing these predicates, using the Kafka plugin as a reference. I see the GroupScan constructor being called after

Re: Looking for advice on integrating with a custom data source

2020-01-12 Thread Paul Rogers
Hi Andy, Congrats! You are making good progress. Yes, the BatchCreator is a bit of magic: Drill looks for a subclass that has your SubScan subclass as the second parameter. Looks like you figured that out. Thanks, - Paul On Sunday, January 12, 2020, 1:45:16 PM PST, Andy Grove wrote:
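
The "magic" described above, finding a creator class by the type of its second parameter, can be sketched with plain Java reflection. Everything here is hypothetical (the `Creator` interface, the `create` method, the candidate list); it is not Drill's lookup code, only an illustration of the matching rule.

```java
import java.lang.reflect.Method;
import java.util.List;
import java.util.Optional;

// Illustrative only: hypothetical types standing in for a plugin's classes.
final class CreatorLookup {
    private CreatorLookup() {}

    interface Creator<T> {
        String create(String context, T config);
    }

    static class MySubScan {}
    static class OtherSubScan {}

    static class MyCreator implements Creator<MySubScan> {
        @Override
        public String create(String context, MySubScan config) {
            return "my-reader";
        }
    }

    static class OtherCreator implements Creator<OtherSubScan> {
        @Override
        public String create(String context, OtherSubScan config) {
            return "other-reader";
        }
    }

    /** Finds the first candidate whose create method takes configType as its second parameter. */
    static Optional<Class<?>> findCreator(List<Class<?>> candidates, Class<?> configType) {
        for (Class<?> candidate : candidates) {
            for (Method m : candidate.getDeclaredMethods()) {
                // Skip the compiler-generated bridge method create(String, Object).
                if (m.isBridge() || !m.getName().equals("create")) {
                    continue;
                }
                Class<?>[] params = m.getParameterTypes();
                if (params.length == 2 && params[1] == configType) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Class<?>> candidates = List.<Class<?>>of(MyCreator.class, OtherCreator.class);
        System.out.println(findCreator(candidates, MySubScan.class));
    }
}
```

A lookup of this kind explains why simply adding a correctly typed creator class is enough: nothing has to register it explicitly, and a missing class surfaces only as a "failure finding constructor" style error at execution time.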

Re: Looking for advice on integrating with a custom data source

2020-01-12 Thread Andy Grove
Actually I managed to get past that error with an educated guess that if I created a BatchCreator class, it would automagically be picked up somehow. I'm now at the point where my RecordReader is being invoked! On Sun, Jan 12, 2020 at 2:03 PM Andy Grove wrote: > Between reading the tutorial and

Re: Looking for advice on integrating with a custom data source

2020-01-12 Thread Andy Grove
Between reading the tutorial and copying and pasting code from the Kudu storage plugin, I've been making reasonable progress with this but I am a bit confused by one error I'm now hitting. ExecutionSetupException: Failure finding OperatorCreator constructor for config com.mydb.MyDbSubScan Prior to

Re: Looking for advice on integrating with a custom data source

2020-01-11 Thread Andy Grove
Thank you both for those responses. This is very helpful. I have ordered a copy of the book too. I'm using Drill 1.17.0. I'll take a look at the Jdbc Storage Plugin code and see if it would be feasible to add the logic I need there. In parallel, I've started implementing a new storage plugin.

Re: Looking for advice on integrating with a custom data source

2020-01-11 Thread Charles Givre
Hi Andy, Thanks for your interest in Drill. I'm glad to see that Paul wrote you back as well. I was going to say I thought the JDBC storage plugin did in fact push down columns and filters to the source system. Also, what version of Drill are you using? Writing a storage plugin for Drill

Re: Looking for advice on integrating with a custom data source

2020-01-11 Thread Paul Rogers
Hi Andy, There are likely multiple approaches; here are two. Some bit of code has to decide what can be pushed to your data source and what must remain in Drill. At present, there is no declarative way to say, "OK to push such-and-so expression, but keep this-and-that." Instead, the current
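
The decision described above, which expressions go to the data source and which stay in Drill, can be sketched as a simple partition. This is plain Java with hypothetical names (`Predicate`, `Split`, `supportedOps`), and it assumes the filter is a conjunction (AND) of simple terms so that each term can be routed independently; it is not the mechanism any existing plugin uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only: hypothetical types, not Drill's expression classes.
final class PushdownSplit {
    private PushdownSplit() {}

    /** One term of an AND-ed filter, for example: age > 30. */
    static final class Predicate {
        final String column;
        final String op;
        final String value;

        Predicate(String column, String op, String value) {
            this.column = column;
            this.op = op;
            this.value = value;
        }
    }

    /** Result of the split: terms sent to the source and terms kept in the engine. */
    static final class Split {
        final List<Predicate> pushed = new ArrayList<>();
        final List<Predicate> retained = new ArrayList<>();
    }

    /** Routes each conjunct by whether the source supports its operator. */
    static Split split(List<Predicate> conjuncts, Set<String> supportedOps) {
        Split result = new Split();
        for (Predicate p : conjuncts) {
            if (supportedOps.contains(p.op)) {
                result.pushed.add(p);
            } else {
                result.retained.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Split s = split(
                List.of(new Predicate("age", ">", "30"), new Predicate("name", "LIKE", "A%")),
                Set.of("=", ">", "<"));
        System.out.println(s.pushed.size() + " pushed, " + s.retained.size() + " retained");
    }
}
```

Splitting per conjunct is only safe for AND; a term under OR cannot be pushed on its own without changing the result, so a real rule would need to handle that case separately.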

Looking for advice on integrating with a custom data source

2020-01-11 Thread Andy Grove
Hi, I'd like to use Apache Drill with a custom data source that supports a subset of SQL. My goal is to have Drill push selection and predicates down to my data source but the rest of the query processing should take place in Drill. I started out by writing a JDBC driver for the data source and