With some extra debugging I can see that the getNewWithChildren call is made to an earlier instance of GroupScan and not the instance created by the filter push-down rule. I'm wondering if this is some kind of hashCode/equals/toString/getDigest issue?
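A minimal sketch of how a digest problem could produce this symptom, assuming the planner identifies plan nodes by their digest string and keeps the first node registered for each digest. Everything here (DigestSketch, SketchScan, register, and the two digest methods) is a hypothetical stand-in and not Drill or Calcite API. It only illustrates why a scan whose digest omits the pushed-down predicates could be replaced by the earlier, unfiltered instance.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Drill or Calcite APIs. */
final class DigestSketch {

    /** Stand-in for a GroupScan: a table plus the predicates pushed into it. */
    static final class SketchScan {
        final String table;
        final List<String> predicates;

        SketchScan(String table, List<String> predicates) {
            this.table = table;
            this.predicates = List.copyOf(predicates);
        }

        /** Digest that ignores the predicates (the suspected problem). */
        String digestWithoutPredicates() {
            return "SketchScan[table=" + table + "]";
        }

        /** Digest that includes the predicates, so a filtered scan is distinct. */
        String digestWithPredicates() {
            return "SketchScan[table=" + table + ", predicates=" + predicates + "]";
        }
    }

    /** Toy digest-keyed registry (an assumption): the first node seen for a digest wins. */
    static SketchScan register(Map<String, SketchScan> registry, String digest, SketchScan scan) {
        SketchScan existing = registry.putIfAbsent(digest, scan);
        return existing != null ? existing : scan;
    }

    public static void main(String[] args) {
        SketchScan original = new SketchScan("t", List.of());
        SketchScan filtered = new SketchScan("t", List.of("a = 1"));

        // Digest omits predicates: both scans collide and the earlier one is returned.
        Map<String, SketchScan> weak = new HashMap<>();
        register(weak, original.digestWithoutPredicates(), original);
        SketchScan kept = register(weak, filtered.digestWithoutPredicates(), filtered);
        System.out.println("digest ignores predicates, filtered scan kept: " + (kept == filtered));

        // Digest includes predicates: the filtered scan is registered as its own node.
        Map<String, SketchScan> strong = new HashMap<>();
        register(strong, original.digestWithPredicates(), original);
        kept = register(strong, filtered.digestWithPredicates(), filtered);
        System.out.println("digest includes predicates, filtered scan kept: " + (kept == filtered));
    }
}
```

If this is what is happening, comparing the digest strings of the original and the filtered GroupScan would show whether the two are distinguishable.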

On Tue, Jan 14, 2020 at 7:52 PM Andy Grove <[email protected]> wrote:

> I'm now working on predicate push down ... I have a filter rule that is
> correctly extracting the predicates that the backend database supports and
> I am creating a new GroupScan containing these predicates, using the Kafka
> plugin as a reference. I see the GroupScan constructor being called after
> this, with the predicates populated. So far so good ... but then I see calls
> to getDigest, getScanStats, and getNewWithChildren, and then I see calls to
> the GroupScan constructor with the predicates missing.
>
> Any pointers on what I might be missing? Is there more magic I need to
> know?
>
> Thanks!
>
> On Sun, Jan 12, 2020 at 5:34 PM Paul Rogers <[email protected]> wrote:
>
>> Hi Andy,
>>
>> Congrats! You are making good progress. Yes, the BatchCreator is a bit of
>> magic: Drill looks for a subclass that has your SubScan subclass as the
>> second parameter. Looks like you figured that out.
>>
>> Thanks,
>> - Paul
>>
>> On Sunday, January 12, 2020, 1:45:16 PM PST, Andy Grove <[email protected]> wrote:
>>
>> Actually I managed to get past that error with an educated guess that if I
>> created a BatchCreator class, it would automagically be picked up somehow.
>> I'm now at the point where my RecordReader is being invoked!
>>
>> On Sun, Jan 12, 2020 at 2:03 PM Andy Grove <[email protected]> wrote:
>>
>> > Between reading the tutorial and copying and pasting code from the Kudu
>> > storage plugin, I've been making reasonable progress with this, but I am
>> > confused by one error I'm now hitting.
>> >
>> > ExecutionSetupException: Failure finding OperatorCreator constructor for
>> > config com.mydb.MyDbSubScan
>> >
>> > Prior to this, Drill had called getSpecificScan and then called a few of
>> > the methods on my subscan object. I wasn't sure what to return for
>> > getOperatorType so just returned the Kudu subscan operator type and I'm
>> > wondering if the issue is related to that somehow?
>> >
>> > Thanks.
>> >
>> > On Sat, Jan 11, 2020 at 10:13 PM Andy Grove <[email protected]> wrote:
>> >
>> >> Thank you both for those responses. This is very helpful. I have
>> >> ordered a copy of the book too. I'm using Drill 1.17.0.
>> >>
>> >> I'll take a look at the Jdbc Storage Plugin code and see if it would be
>> >> feasible to add the logic I need there. In parallel, I've started
>> >> implementing a new storage plugin. I'll be working on this more tomorrow
>> >> and I'm sure I'll be back with more questions soon.
>> >>
>> >> Thanks again for your help!
>> >>
>> >> Andy.
>> >>
>> >> On Sat, Jan 11, 2020 at 6:03 PM Charles Givre <[email protected]> wrote:
>> >>
>> >>> Hi Andy,
>> >>> Thanks for your interest in Drill. I'm glad to see that Paul wrote you
>> >>> back as well. I was going to say I thought the JDBC storage plugin did in
>> >>> fact push down columns and filters to the source system.
>> >>>
>> >>> Also, what version of Drill are you using?
>> >>>
>> >>> Writing a storage plugin for Drill is not trivial and I'd definitely
>> >>> recommend using the code from Paul's PR as that greatly simplifies things.
>> >>> Here is a tutorial as well:
>> >>> https://github.com/paul-rogers/drill/wiki/Create-a-Storage-Plugin
>> >>>
>> >>> If you need additional help, please let us know.
>> >>> -- C
>> >>>
>> >>> On Jan 11, 2020, at 5:57 PM, Andy Grove <[email protected]> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I'd like to use Apache Drill with a custom data source that supports a
>> >>> subset of SQL.
>> >>>
>> >>> My goal is to have Drill push selection and predicates down to my data
>> >>> source but the rest of the query processing should take place in Drill.
>> >>>
>> >>> I started out by writing a JDBC driver for the data source and registering
>> >>> that with Drill using the Jdbc Storage Plugin but it seems to just pass the
>> >>> whole query through to my data source, so that approach isn't going to work
>> >>> unless I'm missing something?
>> >>>
>> >>> Is there any way to configure the JDBC storage plugin to only push certain
>> >>> parts of the query to the data source?
>> >>>
>> >>> If this isn't a good approach, do I need to write a custom storage plugin?
>> >>> Can these be added on the classpath or would that require me maintaining a
>> >>> fork of the project?
>> >>>
>> >>> I appreciate any pointers anyone can give me.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Andy.
