The co-routine model sounds like a good fit for streaming cases. I was thinking about how the Enumerable interface should work with streaming cases, but now I should also look at the Interpreter.
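
To make the queue-and-yield idea concrete, here is a minimal Python sketch of operators that read from a queue, write to a queue, and yield when there is no work to be done. It is an illustration only and not Calcite's Interpreter API: the names DONE, source, operator, run and drain are all hypothetical, and the round-robin scheduler is single-threaded for simplicity. Note that the source feeds two branches, which is the kind of branching data-flow graph a pull-based 'next()' model does not handle directly.

```python
from collections import deque

DONE = object()  # hypothetical end-of-stream marker


def source(rows, outputs):
    """Push each row to every downstream queue, yielding after each row."""
    for row in rows:
        for q in outputs:
            q.append(row)
        yield  # hand control back to the scheduler
    for q in outputs:
        q.append(DONE)


def operator(inp, out, fn):
    """Apply fn (row -> list of output rows) to rows taken from inp."""
    while True:
        if not inp:
            yield  # no work available; let other operators run
            continue
        row = inp.popleft()
        if row is DONE:
            out.append(DONE)
            return
        out.extend(fn(row))


def run(tasks):
    """Round-robin scheduler: resume each operator until all have finished."""
    pending = deque(tasks)
    while pending:
        task = pending.popleft()
        try:
            next(task)
            pending.append(task)
        except StopIteration:
            pass  # operator finished; drop it from the schedule


def drain(q):
    """Return the rows in a queue, without the end-of-stream marker."""
    return [r for r in q if r is not DONE]


# One source feeding two branches: a filter and a projection.
a, b, evens, squares = deque(), deque(), deque(), deque()
run([
    source(range(6), [a, b]),
    operator(a, evens, lambda x: [x] if x % 2 == 0 else []),
    operator(b, squares, lambda x: [x * x]),
])
print(drain(evens), drain(squares))
```

For a streaming source, the same operators would simply keep yielding while their input queue is empty, instead of waiting for an end-of-stream marker.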
-Rui

On Tue, Dec 10, 2019 at 1:33 PM Julian Hyde <jh...@apache.org> wrote:

> The goal (or rather my goal) for the interpreter is to replace
> Enumerable as the quick, easy default convention.
>
> Enumerable is efficient but not that efficient (compared to engines
> that work on off-heap data representing batches of records). And
> because it generates Java bytecode there is a certain latency to
> getting a query prepared and ready to run.
>
> It basically implements the old Volcano query evaluation model. It is
> single-threaded (because all work happens as a result of a call to
> 'next()' on the root node) and cannot handle branching data-flow
> graphs (DAGs).
>
> The Interpreter uses a co-routine model (reading from queues,
> writing to queues, and yielding when there is no work to be done) and
> therefore could be more efficient than Enumerable in a single-node
> multi-core system. Also, there is little start-up time, which is
> important for small queries.
>
> I would love to add another built-in convention that uses Arrow as
> data format and generates co-routines for each operator. Those
> co-routines could be deployed in a parallel and/or distributed data
> engine.
>
> Julian
>
> On Tue, Dec 10, 2019 at 3:47 AM Zoltan Farkas
> <zolyfar...@yahoo.com.invalid> wrote:
> >
> > What is the ultimate goal of the Calcite Interpreter?
> >
> > To provide some context, I have been playing around with calcite + REST
> > (see https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroCalciteRest
> > for details of my experiments)
> >
> > —Z
> >
> > > On Dec 9, 2019, at 9:05 PM, Julian Hyde <jh...@apache.org> wrote:
> > >
> > > Yes, virtualization is one of Calcite’s goals. In fact, when I created
> > > Calcite I was thinking about virtualization + in-memory materialized
> > > views. Not only the Spark convention but any of the “engine” conventions
> > > (Drill, Flink, Beam, Enumerable) could be used to create a virtual
> > > query engine.
> > >
> > > See e.g. a talk I gave in 2013 about Optiq (precursor to Calcite):
> > > https://www.slideshare.net/julianhyde/optiq-a-dynamic-data-management-framework
> > >
> > > Julian
> > >
> > >> On Dec 9, 2019, at 2:29 PM, Muhammad Gelbana <mgelb...@apache.org> wrote:
> > >>
> > >> I recently contacted one of the active contributors asking about the
> > >> purpose of the project and here's his reply:
> > >>
> > >>> From my understanding, Quicksql is a data virtualization platform. It can
> > >>> query multiple data sources altogether and in a distributed way; Say, you
> > >>> can write a SQL with a MySql table join with an Elasticsearch table.
> > >>> Quicksql can recognize that, and then generate Spark code, in which it will
> > >>> fetch the MySQL/ES data as a temporary table separately, and then join them
> > >>> in Spark. The execution is in Spark so it is totally distributed. The user
> > >>> doesn't need to be aware of where the table is from.
> > >>
> > >> I understand that the Spark convention Calcite has attempts to achieve the
> > >> same goal, but it isn't fully implemented yet.
> > >>
> > >> On Tue, Oct 29, 2019 at 9:43 PM Julian Hyde <jh...@apache.org> wrote:
> > >>
> > >>> Anyone know anything about Quicksql? It seems to be quite a popular
> > >>> project, and they have an internal fork of Calcite.
> > >>>
> > >>> https://github.com/Qihoo360/
> > >>>
> > >>> https://github.com/Qihoo360/Quicksql/tree/master/analysis/src/main/java/org/apache/calcite
> > >>>
> > >>> Julian