Hi Julian Feinauer,

Do share your thoughts on MATCH_RECOGNIZE operator suitability, 
http://mail-archives.apache.org/mod_mbox/calcite-dev/201807.mbox/%[email protected]%3e

Cheers,
Mark.

> On 22 Oct 2018, at 02:24, Julian Feinauer <[email protected]> 
> wrote:
> 
> Dear calcite devs,
> 
> I follow the project for a long time and love how calcite made it possible to 
> use SQL everywhere (have done several sql interfaces on top of specific file 
> formats myself). I also like the strong support for streaming SQL.
> 
> The reason I'm writing this email is not only to give the project some love 
> but because we are thinking about a SQL "extension" which I think is not so 
> specific but could serve others as well in different use cases.
> 
> In detail, we are working with Streams of Data from Devices (think of 
> industry 4.0). We read data, e.g., from PLCs (using the (incubating) Apache 
> PLC4X project where I contribute) and do analytics on them. The analysis 
> which are done there are pretty similar when working with traces from tests, 
> e.g., automotive test drives or from related industries. What all these 
> streams have in  common is
> * usually ordered by time
> * elements of different groups of signals ("rows" from "tables") arrive 
> ordered by time but not with equal timestamps, e.g., time each second, other 
> quantities much more frequent
> * "natural" join for these signal groups ("tables") is some kind of 
> interpolation (sample and hold, linear interpolation, splinces, ...) with 
> respect to (event-)time
> * In some cases signal types are not known and can only be guessed based on 
> first value, e.g., on CAN there is no strict notion of "double" or "integer" 
> channels but rather there are integer base values + a conversion formula 
> (like a x + b) + possible lookup tables for "other" values (SNA, NULL, 
> DISABLED, ...)
> 
> On the other hand the analysis we like to perform are often timestamps
> * get timestamps where a condition becomes true
>  * boolean value toggled
>  * numeric value is above / below threshold
>  * signal change rate is above / below threshold
>  * ...
> * get the values of certain signals at the point in time when a condition 
> becomes true (see above)
> * get windows based on conditions
>  * while signal is true
>  * while value above ...
>  * ...
> * Do aggregations on signals in the mentioned windows
> 
> Parts of this could done in most SQL dialects (I'm no expert for the standard 
> but in Postgres one could use LAG and partitions) but this is not efficient 
> and not all of the above could be done with that.
> So we think about an extension (or a dialect) for "traces" or "time series" 
> which has a syntax that is slightly extended to allow such queries as stated 
> above.
> 
> To give you an example of what such an extension could look like:
> 
> ```
> SELECT start(), end(), MAX(current) FROM s7://127.0.0.1/0/0 WHILE 
> cycle_in_progress = TRUE
> SELECT timestamp, current AS start_current FROM s7://127.0.0.1/0/0 WHERE 
> cycle_in_progress = TRUE TRIGGER ON_BECOME_TRUE
> SELECT timestamp, current AS start_current FROM s7://127.0.0.1/0/0 WHERE 
> cycle_in_progress = TRUE TRIGGER ON_BECOME_TRUE
> ```
> 
> Why am I bothering you with this?
> Well, first, you are experts and I would love to get some feedback on 
> thoughts of that.
> But, most important, I am thinking about writing (yet another) SQL parser 
> with slight extensions and would then have to care for a "runtime" which 
> would be partially similar (in functionality, not in maturity or 
> sophistication) to Calcites Enumerable-Trait. So I was thinking whether there 
> is a way to make all of this work "on top" of Calcite (custom RelNodes and an 
> extension to the parser) but I'm unsure about that as some of the internals 
> of Calcite are tied very specifically to Sql... like, e.g., SqlToRelConverter.
> Do you have any ideas on how one would be able to implement this "minimaly 
> invasive" on top of Calcite and whether this is possible "ex-situ" or if this 
> should then be done in the same codebase (e.g. a subproject) as it would need 
> some changes near Calcites core?
> 
> Please excuse this rather long email but I would really appreciate any 
> answers, comments or suggestions.
> 
> Best
> Julian

Reply via email to