This sounds great, Julian. Thanks for the help,
-Dan

On Thu, Oct 9, 2014 at 6:27 AM, Julian Hyde <[email protected]> wrote:
> I was wondering whether a simpler SPI, one that allows you to get
> results from a Table without generating code, would help. I just logged
> https://issues.apache.org/jira/browse/OPTIQ-436:
>
> * CursorableTable is an optional interface that can be implemented by
> any Table; it allows you to get the results directly, without code
> generation and without creating a TableAccessRel or similar. It returns
> a Cursor, which is similar to a JDBC ResultSet but much simpler to
> implement, and is more efficient than an Iterator or Enumerable.
>
> * ProjectableCursorableTable goes further, and allows Calcite to
> specify a list of projected fields and a list of filters. The cursor
> must implement the projects, but it can choose which filters it is
> able to implement.
>
> Would these interfaces make it easier to implement your RocksDB
> interface?
>
> Julian
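(For reference, the proposal might look something like the two
interfaces below. This is only a sketch inferred from Julian's
description; the names, signatures, and the use of RexNode as the
filter type are guesses, and OPTIQ-436 is the authoritative reference.
Table, Cursor, and RexNode stand for the existing Optiq/Avatica types.)

    import java.util.List;

    // Hypothetical shapes, inferred from the description above; see
    // OPTIQ-436 for the actual proposal.
    public interface CursorableTable extends Table {
      /** Returns the rows directly: no code generation, and no
       * TableAccessRel. A Cursor is similar to a JDBC ResultSet, but
       * much simpler to implement. */
      Cursor cursor();
    }

    public interface ProjectableCursorableTable extends Table {
      /** Calcite passes the projected fields and the filters. The
       * cursor must implement the projects; it may decline filters it
       * cannot handle, and Calcite then applies those on top. */
      Cursor cursor(int[] projects, List<RexNode> filters);
    }

(For an adapter like the RocksDB one discussed below, the second form
would be the natural hook: honor the projection, claim any filter that
bounds the primary key, and leave the rest for Calcite to evaluate.)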
>
> On Tue, Oct 7, 2014 at 6:27 PM, Dan Di Spaltro <[email protected]> wrote:
> > Thanks for the response. Here is my attempt to clearly explain the
> > only push-down/optimization/shortcut (whatever it is) I am trying
> > to do.
> >
> > I have two physical operations that the db api can do: get, and scan
> > (specifying a start). Since it is a simple key-value store, I am
> > storing keys in a hierarchical fashion, 1 level deep, as mentioned in
> > the previous thread.
> >
> > I want to do one simple optimization, and that is: if you specify
> > what I deem is a "primary key" in the filter, either through a
> > BETWEEN, an IN, ORs, or whatever, I want to tell the physical db scan
> > to seek. That's really all I am trying to do outside of all the stuff
> > Optiq gives me.
> >
> > I have a table scan that takes a start key and an end key, and a list
> > of projected columns (since that is only known at read time). That
> > produces an enumerable which maps directly to the physical iterator.
> > I can't quite work out in my head how to introspect the columns,
> > figure out if one of them is a primary column, add more metadata to
> > the Scan call, and then perform the normal operation. That's where I
> > am getting most tripped up.
> >
> > On Tue, Oct 7, 2014 at 10:21 AM, Vladimir Sitnikov
> > <[email protected]> wrote:
> >
> >> Dan,
> >>
> >> > As always, a good example helps
> >>
> >> Did you succeed with a working "select * from rocksdb_table"?
> >> Can you share your code so the conversation can become more
> >> specific?
> >
> > Yes I did, in a couple of different increments: following the
> > CSV-type model, minus any filter push-down, but with projection; and
> > the more Mongo-like structure where I define my own convention, but
> > that didn't really get to what I wanted.
> >
> > It's just tough since I am not writing something that is generally
> > useful, but I can try to put something up.
> >
> >> The calcite.debug output that you've posted recently has no rocksdb
> >> calls, thus it looks wrong.
> >
> > I might have posted the wrong one; I've been playing with a lot of
> > examples...
> >
> >> > Do you think it would make more sense to follow in the footsteps
> >> > of the Spark model, since it's more about generating code that is
> >> > run via Spark RDDs vs translating queries from one language to
> >> > another (as in the case of Mongo/Splunk)?
> >>
> >> Mongo/Spark have their own query languages, thus those adapters do
> >> the "translating queries from one language to another" work to push
> >> more conditions/expressions down to the database engine.
> >
> > I guess I equated that to Spark being "normal" code vs string
> > translation; e.g. filter conditions operate per row in much the same
> > way as in ReflectiveSchema.
> >
> >> As far as I understand, rocksdb speaks just Java (there is no such
> >> thing as a rocksdb language), thus I would suggest going with the
> >> "translate to Java calls (rocksdb API)" approach.
> >
> > I tried to address that above.
> >
> >> You should have some good kind of aim.
> >> "Push down filters to rocksdb" is a wrong aim. Well, it might be a
> >> good aim if you are Julian and you know what you are doing, but that
> >> does not seem to be the case.
> >> "Make Calcite use the rocks.get() API to fetch a row by the key
> >> given in this kind of SQL" is a good one.
> >> "Display all rows from rocksdb as a table" is also a good aim.
> >>
> >> The easiest approach, from my point of view, is to use Calcite as an
> >> intermediate framework that translates SQL into _appropriate_ calls
> >> to your storage engine (see Julian's approach earlier in this
> >> thread). Calcite can glue together the iterations and fill in the
> >> missing parts; for instance, you can have "group by" implemented
> >> for free.
> >>
> >> Does that make sense?
> >>
> >> --
> >> Vladimir
> >
> > --
> > Dan Di Spaltro

--
Dan Di Spaltro
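(To make the seek shortcut concrete, here is a minimal sketch of the
physical side, in the spirit of Vladimir's "translate to Java calls
(rocksdb API)" suggestion: derive [start, end) key bounds from a filter
on the primary key, then seek the RocksDB iterator instead of scanning
from the first key. Only the org.rocksdb iterator calls are the real
RocksDB Java API; the class name, method names, and byte[][] row shape
are invented for illustration, and turning Calcite's filter expressions
into the byte[] bounds is exactly the introspection step Dan describes
as the hard part.)

    import java.util.ArrayList;
    import java.util.List;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksIterator;

    class RocksScan {
      /** Scans [start, end); a null start means "from the first key"
       * and a null end means "to the last". The bounds would come from
       * a BETWEEN, IN, or OR'd equality filter on the "primary key"
       * column; with no such filter, pass (null, null) for a full
       * scan. */
      static List<byte[][]> scan(RocksDB db, byte[] start, byte[] end) {
        List<byte[][]> rows = new ArrayList<>();
        // Recent RocksJava makes RocksIterator AutoCloseable; older
        // versions used dispose() instead.
        try (RocksIterator it = db.newIterator()) {
          if (start != null) {
            it.seek(start);       // the optimization: seek, don't scan
          } else {
            it.seekToFirst();
          }
          for (; it.isValid(); it.next()) {
            if (end != null && unsignedCompare(it.key(), end) >= 0) {
              break;              // past the end of the key range
            }
            rows.add(new byte[][] {it.key(), it.value()});
          }
        }
        return rows;
      }

      /** Lexicographic comparison treating bytes as unsigned, to match
       * RocksDB's default key ordering. */
      static int unsignedCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
          int cmp = (a[i] & 0xff) - (b[i] & 0xff);
          if (cmp != 0) {
            return cmp;
          }
        }
        return a.length - b.length;
      }
    }

(Everything else, projection, residual filters, and "group by", could
then be left to Calcite, per Vladimir's point about gluing the
iterations together.)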
