Hi Cody,

The monotone (or quasi-monotone) attribute required for grouping a stream
in Calcite is a generalization of the timestamp/watermark concept in Flink.
The timestamps in Flink are quasi-monotone, i.e., they are increasing but
might be slightly out of order. This out-of-orderness is controlled by the
watermarks.
To compute consistent results in event time, Flink needs a timestamp as
well.
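
To make "quasi-monotone" concrete, here is a minimal sketch in plain Java
(no Flink dependency). It assumes the simplest possible scheme: the
watermark trails the largest timestamp seen so far by a fixed bound, and a
record counts as late if its timestamp is behind the current watermark. The
class and method names are made up for illustration; this is not Flink's
implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class WatermarkSketch {

    /** Hypothetical helper: a watermark that trails the max timestamp by a fixed bound. */
    static final class BoundedOutOfOrderness {
        private final long maxOutOfOrderness;
        private long maxTimestamp = Long.MIN_VALUE;

        BoundedOutOfOrderness(long maxOutOfOrderness) {
            this.maxOutOfOrderness = maxOutOfOrderness;
        }

        /** Watermark before any record was seen is Long.MIN_VALUE (nothing can be late). */
        long currentWatermark() {
            return maxTimestamp == Long.MIN_VALUE
                    ? Long.MIN_VALUE
                    : maxTimestamp - maxOutOfOrderness;
        }

        /** Returns true if the record is on time, false if it is behind the watermark. */
        boolean observe(long timestamp) {
            boolean onTime = timestamp >= currentWatermark();
            maxTimestamp = Math.max(maxTimestamp, timestamp);
            return onTime;
        }
    }

    /** Collects the timestamps that violate the out-of-orderness bound. */
    static List<Long> lateTimestamps(long[] timestamps, long bound) {
        BoundedOutOfOrderness tracker = new BoundedOutOfOrderness(bound);
        List<Long> late = new ArrayList<>();
        for (long ts : timestamps) {
            if (!tracker.observe(ts)) {
                late.add(ts);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        // 1100 arrives after 1200 but within the bound of 300; 900 arrives too late.
        long[] timestamps = {1000, 1200, 1100, 1500, 900, 1600};
        System.out.println(lateTimestamps(timestamps, 300)); // prints [900]
    }
}
```

With a bound of 300, the record with timestamp 1100 is slightly out of
order but still accepted, while 900 arrives after the watermark has already
advanced to 1200.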

Since Calcite envisions a StreamSQL that is fully compatible and integrated
with SQL on static data sets, it generalizes the timestamp concept to
(quasi-)monotone attributes, i.e., you could also compute a windowed query
over a static table that is sorted on an arbitrary field.

The keyBy attribute in Flink's DataStream API refers to an additional
attribute by which the stream is grouped. Consider the query:

SELECT STREAM TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS rowtime,
  productId,
  COUNT(*) AS c,
  SUM(units) AS units
FROM Orders
GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR), productId;

The grouping would be expressed in Flink as

stream.keyBy("productId").timeWindow(Time.hours(1)).apply(...)

Here, productId is the partitioning key, and rowtime is the implicit
timestamp in Flink (timestamps are metadata in Flink but actual
columns in Calcite's model).
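
To illustrate what this grouping computes, here is a rough sketch in plain
Java over a static list of orders, i.e., the "static table" case mentioned
above. It ignores watermarks and late data entirely. The Order class, the
tumbleEnd helper, and the sample values are assumptions made for the
example and are not part of Calcite or Flink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class TumbleSketch {

    /** Hypothetical row type: rowtime is an ordinary column, as in Calcite's model. */
    static final class Order {
        final long rowtime;   // event time in milliseconds
        final int productId;
        final int units;

        Order(long rowtime, int productId, int units) {
            this.rowtime = rowtime;
            this.productId = productId;
            this.units = units;
        }
    }

    /** Exclusive end of the tumbling window [start, start + size) that contains rowtime. */
    static long tumbleEnd(long rowtime, long sizeMillis) {
        return rowtime - Math.floorMod(rowtime, sizeMillis) + sizeMillis;
    }

    /** Groups by (window end, productId); each value holds {COUNT(*), SUM(units)}. */
    static TreeMap<Long, TreeMap<Integer, long[]>> aggregate(List<Order> orders, long sizeMillis) {
        TreeMap<Long, TreeMap<Integer, long[]>> result = new TreeMap<>();
        for (Order o : orders) {
            long[] acc = result
                    .computeIfAbsent(tumbleEnd(o.rowtime, sizeMillis), k -> new TreeMap<>())
                    .computeIfAbsent(o.productId, k -> new long[2]);
            acc[0] += 1;        // COUNT(*)
            acc[1] += o.units;  // SUM(units)
        }
        return result;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long hour = 60 * minute;

        List<Order> orders = new ArrayList<>();
        orders.add(new Order(10 * minute, 1, 2));
        orders.add(new Order(50 * minute, 1, 3));
        orders.add(new Order(20 * minute, 2, 1));
        orders.add(new Order(70 * minute, 1, 4));

        aggregate(orders, hour).forEach((end, byProduct) ->
                byProduct.forEach((productId, acc) ->
                        System.out.println("windowEnd=" + end
                                + " productId=" + productId
                                + " c=" + acc[0]
                                + " units=" + acc[1])));
    }
}
```

The outer map key plays the role of the (quasi-)monotone attribute, and the
inner key plays the role of the keyBy attribute.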

I hope that clarifies the relationship between the monotone attributes and
Flink's timestamp / watermark concept.

Best, Fabian




2016-06-14 4:43 GMT+02:00 Cody Innowhere <[email protected]>:

> Hi guys,
> I went through Stream SQL doc on calcite website and have a little question
> about grouping. calcite's grouping requires that a table column must be
> monotonic or quasi-monotonic while in real world cases we don't necessarily
> have such fields in streams, unless we use a virtual field, say, the emit
> timestamp of each stream msg. A similar case would be flink stream API, it
> has a KeyedStream, which is kind of groupBy, but it does not actually
> require the keyed field to be monotonic. So in such cases, how do you plan
> to implement this?
>
> Also I noticed that calcite 1.8.0 has been released, there seems to be no
> updates regarding Stream SQL in this release, do you have a plan or roadmap
> on Stream SQL?
>
> Thanks~
>
