Sounds like a promising technology to prototype, Martin. Thanks for the note.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

On 5/10/16, 12:10 AM, "Martin Desruisseaux" <[email protected]> wrote:

>Hello all
>
>Today was the first day at ApacheCon North America. Among the various
>presentations, one especially attracted my attention:
>
>    Streaming SQL with Apache Calcite
>
>We mentioned in previous emails the possibility of using Calcite as the
>SQL parser for our DataStores like ShapeFiles. The presentation that I
>saw today increases Calcite's attractiveness by opening possibilities
>to couple such DataStores with e.g. SensorML.
>
>The presentation recalled some advantages of SQL, including: we state
>what we want rather than how to get it (we let the query optimizer
>figure out the "how"), and some changes to data structures and indexes
>are possible without impacting the SQL statements.
>
>Apache Calcite proposes an extension to the SQL language: the "SELECT
>STREAM" statement. Compare a classical statement in which "Sensors" is a
>table:
>
>    SELECT * FROM "Sensors" WHERE altitude < 20;
>
>Now consider a case where "Temperature" is a stream. Unlike the
>classical case above, the query below never terminates if new
>temperature data are continuously arriving:
>
>    SELECT STREAM * FROM "Temperature" WHERE value > 20;
>
>(we can see streams as "Data in flight" and tables as "Data at rest")
>
>Calcite can use a stream as a table and a table as a stream.
>Actually, "Temperature" is both: where to find the data is up to the
>system. An example of the advantage of using it both as a stream and as
>a table is getting the temperatures that are greater than the average
>temperature of the previous year.
>
>It is possible to use JOIN between a stream and a table (e.g. between
>"Temperature" and "Sensors"); the result is a stream. The table may
>change during stream execution. But a JOIN between two streams is more
>challenging.
>
>Calcite provides window functions that can be used together with GROUP
>BY for computing values based on neighbouring rows. Example: "For every
>record, emit the average for the surrounding T seconds".
>
>    Martin
>
>
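
To make the semantics in the quoted message concrete, here is a minimal Python sketch. It does not use Calcite or any SQL engine; it only mimics, with generators, the three ideas described above: a filter over a stream that keeps producing rows as long as data arrive, a stream-to-table join whose result is again a stream, and a time window average. All names (Reading, select_stream, join_with_table, trailing_average, the "sensor" and "time" fields) are hypothetical and chosen for illustration. One deliberate simplification: the window is trailing (the preceding T seconds) rather than "surrounding", since a surrounding window would need rows that have not arrived yet.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Reading(NamedTuple):
    """One row of the hypothetical "Temperature" stream."""
    time: float    # arrival time in seconds (assumed field)
    sensor: str    # sensor identifier (assumed field)
    value: float   # measured temperature


def select_stream(readings: Iterable[Reading],
                  threshold: float) -> Iterator[Reading]:
    """Rough analogue of SELECT STREAM * FROM "Temperature" WHERE value > threshold.

    Rows are emitted lazily, so the loop only ends if the input ends.
    """
    for r in readings:
        if r.value > threshold:
            yield r


def join_with_table(readings: Iterable[Reading],
                    sensors: Dict[str, float]) -> Iterator[Tuple[Reading, float]]:
    """Stream-to-table join: each arriving row is matched against the table
    as it is at that moment, so the table may change while the stream runs.
    Here the table maps a sensor id to its altitude."""
    for r in readings:
        altitude = sensors.get(r.sensor)
        if altitude is not None:
            yield r, altitude


def trailing_average(readings: Iterable[Reading],
                     seconds: float) -> Iterator[Tuple[Reading, float]]:
    """For every row, emit the average value over the preceding `seconds`
    (inclusive). Assumes rows arrive ordered by time."""
    window = deque()
    for r in readings:
        window.append(r)
        # Drop rows that fell out of the time window.
        while window[0].time < r.time - seconds:
            window.popleft()
        yield r, sum(x.value for x in window) / len(window)
```

Because everything is a generator, the same functions accept a finite list ("data at rest") or an endless source ("data in flight"); with an endless source, the caller decides when to stop reading, for example with itertools.islice.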
