The co-routine model sounds like a good fit for streaming cases.

I was thinking about how the Enumerable interface should work with streaming
cases, but now I should also look at the Interpreter.


-Rui

On Tue, Dec 10, 2019 at 1:33 PM Julian Hyde <jh...@apache.org> wrote:

> The goal (or rather my goal) for the interpreter is to replace
> Enumerable as the quick, easy default convention.
>
> Enumerable is efficient but not that efficient (compared to engines
> that work on off-heap data representing batches of records). And
> because it generates Java bytecode there is a certain latency to
> getting a query prepared and ready to run.
>
> It basically implements the old Volcano query evaluation model. It is
> single-threaded (because all work happens as a result of a call to
> 'next()' on the root node) and cannot handle branching data-flow
> graphs (DAGs).
>
> The Interpreter uses a co-routine model (reading from queues,
> writing to queues, and yielding when there is no work to be done) and
> therefore could be more efficient than Enumerable in a single-node
> multi-core system. Also, there is little start-up time, which is
> important for small queries.
>
> I would love to add another built-in convention that uses Arrow as
> data format and generates co-routines for each operator. Those
> co-routines could be deployed in a parallel and/or distributed data
> engine.
>
> Julian
>
> On Tue, Dec 10, 2019 at 3:47 AM Zoltan Farkas
> <zolyfar...@yahoo.com.invalid> wrote:
> >
> > What is the ultimate goal of the Calcite Interpreter?
> >
> > To provide some context, I have been playing around with Calcite + REST
> > (see https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroCalciteRest
> > for details of my experiments).
> >
> >
> > —Z
> >
> > > On Dec 9, 2019, at 9:05 PM, Julian Hyde <jh...@apache.org> wrote:
> > >
> > > Yes, virtualization is one of Calcite’s goals. In fact, when I created
> > > Calcite I was thinking about virtualization + in-memory materialized views.
> > > Not only the Spark convention but any of the “engine” conventions (Drill,
> > > Flink, Beam, Enumerable) could be used to create a virtual query engine.
> > >
> > > See e.g. a talk I gave in 2013 about Optiq (precursor to Calcite):
> > > https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework
> > >
> > > Julian
> > >
> > >
> > >
> > >> On Dec 9, 2019, at 2:29 PM, Muhammad Gelbana <mgelb...@apache.org> wrote:
> > >>
> > >> I recently contacted one of the active contributors asking about the
> > >> purpose of the project and here's his reply:
> > >>
> > >>> From my understanding, Quicksql is a data virtualization platform. It can
> > >>> query multiple data sources altogether and in a distributed way; Say, you
> > >>> can write a SQL with a MySql table join with an Elasticsearch table.
> > >>> Quicksql can recognize that, and then generate Spark code, in which it will
> > >>> fetch the MySQL/ES data as a temporary table separately, and then join them
> > >>> in Spark. The execution is in Spark so it is totally distributed. The user
> > >>> doesn't need to be aware of where the table is from.
> > >>>
> > >>
> > >> I understand that the Spark convention Calcite has attempts to achieve the
> > >> same goal, but it isn't fully implemented yet.
> > >>
> > >>
> > >> On Tue, Oct 29, 2019 at 9:43 PM Julian Hyde <jh...@apache.org> wrote:
> > >>
> > >>> Anyone know anything about Quicksql? It seems to be quite a popular
> > >>> project, and they have an internal fork of Calcite.
> > >>>
> > >>> https://github.com/Qihoo360/
> > >>>
> > >>> https://github.com/Qihoo360/Quicksql/tree/master/analysis/src/main/java/org/apache/calcite
> > >>>
> > >>> Julian
> > >>>
> > >>>
> > >
> >
>
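
The contrast drawn in the thread between the Volcano-style pull model (all work driven by 'next()' on the root) and a queue-based model (nodes read from queues, write to queues, and yield when idle) can be illustrated with a small sketch. This is not Calcite's Enumerable or Interpreter code: the class ExecutionModels, the Node interface, and the methods pullFilter, filterNode, runUntilIdle, evensByPull and fanOut are all hypothetical names invented for illustration, and a single-threaded round-robin loop stands in for real co-routines.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names come from Calcite.
final class ExecutionModels {

  /** Pull model: the filter does work only when its consumer asks for a row. */
  static <T> Iterator<T> pullFilter(Iterator<T> input, Predicate<T> condition) {
    return new Iterator<T>() {
      private T buffered;
      private boolean ready;

      @Override public boolean hasNext() {
        // Pull from the child until a row passes or the input is exhausted.
        while (!ready && input.hasNext()) {
          T row = input.next();
          if (condition.test(row)) {
            buffered = row;
            ready = true;
          }
        }
        return ready;
      }

      @Override public T next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        ready = false;
        return buffered;
      }
    };
  }

  /** Queue model: a node does one small unit of work, then yields. */
  interface Node {
    /** Processes at most one row; returns false if there was nothing to do. */
    boolean step();
  }

  /** Reads one row from the input queue and copies it to every sink if it passes. */
  static <T> Node filterNode(Queue<T> in, Predicate<T> condition, List<Queue<T>> sinks) {
    return () -> {
      T row = in.poll();
      if (row == null) {
        return false; // no work available: yield to the scheduler
      }
      if (condition.test(row)) {
        for (Queue<T> sink : sinks) {
          sink.add(row);
        }
      }
      return true;
    };
  }

  /** Round-robin scheduler: keeps stepping nodes until none of them has work. */
  static void runUntilIdle(List<Node> nodes) {
    boolean progressed = true;
    while (progressed) {
      progressed = false;
      for (Node node : nodes) {
        progressed |= node.step();
      }
    }
  }

  /** Pull-model usage: a linear chain driven from the consumer side. */
  static List<Integer> evensByPull(List<Integer> rows) {
    List<Integer> out = new ArrayList<>();
    pullFilter(rows.iterator(), x -> x % 2 == 0).forEachRemaining(out::add);
    return out;
  }

  /** Queue-model usage: one source fans out to two branches (a small DAG). */
  static List<List<Integer>> fanOut(List<Integer> rows) {
    Queue<Integer> source = new ArrayDeque<>(rows);
    Queue<Integer> toSmall = new ArrayDeque<>();
    Queue<Integer> toLarge = new ArrayDeque<>();
    Queue<Integer> small = new ArrayDeque<>();
    Queue<Integer> large = new ArrayDeque<>();
    List<Node> nodes = List.of(
        filterNode(source, x -> true, List.of(toSmall, toLarge)),
        filterNode(toSmall, x -> x < 3, List.of(small)),
        filterNode(toLarge, x -> x >= 3, List.of(large)));
    runUntilIdle(nodes);
    List<Integer> smallOut = new ArrayList<>(small);
    List<Integer> largeOut = new ArrayList<>(large);
    return List.of(smallOut, largeOut);
  }

  public static void main(String[] args) {
    List<Integer> rows = List.of(1, 2, 3, 4, 5);
    System.out.println(evensByPull(rows));
    System.out.println(fanOut(rows));
  }
}
```

In the pull version nothing runs until the consumer iterates, and each operator has exactly one consumer. In the queue version the scan-like first node writes each row to two queues, so two downstream branches make progress independently; in a real engine the scheduler loop could be replaced by multiple worker threads.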
