Answers in-line.


On Thu, May 28, 2015 at 8:08 AM, Andrew Brust <
[email protected]> wrote:

> Absolutely nothing to apologize for, and the below explanation is very
> helpful.
>

You are too kind.



> FWIW, I certainly understood that Hive's use of Calcite offered relatively
> little in the way of type flexibility/late binding, compared to Drill.  I
> get that Drill's entire raison d'etre is around this and never thought that
> Hive "had it too."  It was more a question of my being surprised that the
> query planners had any common technology at all.  I have never coded in
> Scala or Haskell, but I have coded plenty in C#, Pascal and VB, and I can
> appreciate the analogy just by having experience with one half of it.
>

ahh...

Well, the problem that Calcite solves is really quite general.  The query as
you state it in SQL describes a data-flow graph (well, it does after
Calcite parses it).  That flow graph describes the operations you want done
in terms of logical operations, some of which might not have any actual
implementation.
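
To make the "data-flow graph" idea concrete, here is a minimal sketch of a
logical plan as a tree of operators. This is not Calcite's actual API; the
Node class, the explain helper, and the operator names are all hypothetical
and only illustrate the shape of the thing.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """One logical operator in the data-flow graph (hypothetical type)."""
    op: str                          # e.g. "Scan", "Filter", "Project"
    arg: str = ""                    # table name, predicate, or column list
    inputs: Tuple["Node", ...] = ()  # upstream operators feeding this one


def explain(node, depth=0):
    """Render the plan as indented text, one operator per line."""
    line = "  " * depth + node.op + (f"({node.arg})" if node.arg else "")
    return "\n".join([line] + [explain(i, depth + 1) for i in node.inputs])


# Roughly what a parser might produce for:
#   SELECT name FROM emp WHERE dept = 'eng'
plan = Node("Project", "name",
            (Node("Filter", "dept = 'eng'",
                  (Node("Scan", "emp"),)),))

print(explain(plan))
```

Nothing in this tree says how to scan or filter. It only says what is
wanted, which is what leaves the optimizer free to choose.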

The query optimization process involves the replacement of parts of the
data-flow with alternative forms.  Ultimately, the entire graph of logical
operations should be replaced by physical operators which have specific
implementations. Typically there are many (possibly billions of)
alternative representations of a query that can be derived by plausible
rewrites of the logical form.  The optimizer can estimate the cost of
different forms and the problem is to find a physical plan with a good
(low) cost even if finding the absolute best is very hard.

This is what Calcite does.  The transformations that you allow determine
what kind of query optimization you might do and determine what kinds of
queries you might accept in the first place.  If you replace the rules, you
change the goes-ins and goes-outs quite dramatically.  Changing the rules
does not, however, imply that the mechanism of transforming and optimizing
has to change.  It only changes the results.
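
A toy sketch of that separation, assuming a deliberately naive search: the
engine below enumerates every plan the rules can reach and keeps the
cheapest fully physical one. The operator names, the rule functions, and the
cost numbers are all invented for illustration and are not Calcite's. A real
planner would also avoid enumerating everything, since the space of
alternatives can be enormous.

```python
# Plans are nested tuples: (operator, argument, *children).
LOGICAL = {"Scan", "Filter"}
COST = {"TableScan": 100, "RowFilter": 10, "IndexScan": 20}  # made-up costs


def rewrites(plan, rules):
    """Yield every plan reachable by applying one rule at one node."""
    for rule in rules:
        yield from rule(plan)
    op, arg, *kids = plan
    for i, kid in enumerate(kids):
        for new_kid in rewrites(kid, rules):
            yield (op, arg, *kids[:i], new_kid, *kids[i + 1:])


def explore(plan, rules):
    """Collect all plans reachable through repeated rewrites."""
    seen, todo = {plan}, [plan]
    while todo:
        for alt in rewrites(todo.pop(), rules):
            if alt not in seen:
                seen.add(alt)
                todo.append(alt)
    return seen


def is_physical(plan):
    op, _arg, *kids = plan
    return op not in LOGICAL and all(is_physical(k) for k in kids)


def cost(plan):
    op, _arg, *kids = plan
    return COST[op] + sum(cost(k) for k in kids)


def optimize(plan, rules):
    """Return the cheapest fully physical plan, or None if there is none."""
    candidates = [p for p in explore(plan, rules) if is_physical(p)]
    return min(candidates, key=cost) if candidates else None


# Rules are plain functions: given a node, yield equivalent replacements.
def scan_to_table_scan(plan):
    if plan[0] == "Scan":
        yield ("TableScan", plan[1])


def filter_to_row_filter(plan):
    if plan[0] == "Filter":
        yield ("RowFilter", *plan[1:])


def filter_into_index_scan(plan):
    if plan[0] == "Filter" and plan[2][0] == "Scan":
        yield ("IndexScan", f"{plan[2][1]} where {plan[1]}")


BASIC = [scan_to_table_scan, filter_to_row_filter]
query = ("Filter", "dept = 'eng'", ("Scan", "emp"))

print(optimize(query, BASIC))
print(optimize(query, BASIC + [filter_into_index_scan]))
```

The only difference between the two calls is the rule list. With the basic
rules the result is a row filter over a full table scan; adding the third
rule lets the same engine pick the cheaper index scan. With no rules at all,
nothing physical is reachable and the query cannot be planned.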


> It's part of the reason I think Drill is so cool, and part of the reason
> why MapR did so well in one of Gigaom's last Sector Roadmaps.
>
> My "ponder question" is whether mainstream RDBMSes like Oracle and SQL
> Server will one day add Drill-like late binding functionality.
>

They might. It is a lot of work to do that, but they might well do that.

If they do, Drill should be considered a wild success. Within Apache,
copying, borrowing or flat-out imitation is considered very high praise
indeed.
