[ 
https://issues.apache.org/jira/browse/CALCITE-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-1237:
---------------------------------
    Description: 
A session window is a collection of rows whose key values, when sorted, have a 
gap of at most N.

Q1. Should "at most" be "less than"?

The key type can be any type that has a minus operator, that is, numeric and 
date-time.

I propose the following syntax: {{session(key [, ...]*, interval)}}. For 
example:

{code}
select stream session(rowtime, productId, interval '5' second),
  productId, count(*) as c
from Orders
group by session(rowtime, productId, interval '5' second),
  productId
{code}

to find bursts of orders for the same product where consecutive orders are no 
more than 5 seconds apart.

The first key column {{rowtime}} defines the session and must be of 
numeric/date-time type, and must have monotonicity or similar in order for the 
query to make progress; the other key columns (in this case {{productId}}) can 
be of any type; the last column is the interval, and must be constant.

The {{session}} function returns the key value at the start of the window. 
Unlike the {{tumble}} function, each row belongs to precisely one window. But 
{{session}} is not a true function, because its value depends on the records 
flowing in the stream.

Q2. If {{session}} is used, should we allow order-dependent aggregate functions 
such as {{first_value}}?

Q3. Should we allow {{session}} as a windowed aggregate function?

  was:
A session window is a collection of rows whose key values, when sorted, have a 
gap of at most N.

(Should "at most" be "less than"?)

The key type can be any type that has a minus operator, that is, numeric and 
date-time.

I propose the following syntax: {{session(key [, ...]*, interval)}}. For 
example:

{code}
select stream session(rowtime, productId, interval '5' second),
  productId, count(*) as c
from Orders
group by session(rowtime, productId, interval '5' second),
  productId
{code}

to find bursts of orders for the same product where consecutive orders are no 
more than 5 seconds apart.


> Session windows for streaming SQL
> ---------------------------------
>
>                 Key: CALCITE-1237
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1237
>             Project: Calcite
>          Issue Type: Bug
>          Components: stream
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>
> A session window is a collection of rows whose key values, when sorted, have 
> a gap of at most N.
> Q1. Should "at most" be "less than"?
> The key type can be any type that has a minus operator, that is, numeric and 
> date-time.
> I propose the following syntax: {{session(key [, ...]*, interval)}}. For 
> example:
> {code}
> select stream session(rowtime, productId, interval '5' second),
>   productId, count(*) as c
> from Orders
> group by session(rowtime, productId, interval '5' second),
>   productId
> {code}
> to find bursts of orders for the same product where consecutive orders are no 
> more than 5 seconds apart.
> The first key column {{rowtime}} defines the session and must be of 
> numeric/date-time type, and must have monotonicity or similar in order for 
> the query to make progress; the other key columns (in this case 
> {{productId}}) can be of any type; the last column is the interval, and must 
> be constant.
> The {{session}} function returns the key value at the start of the window. 
> Unlike the {{tumble}} function, each row belongs to precisely one window. But 
> {{session}} is not a true function, because its value depends on the records 
> flowing in the stream.
> Q2. If {{session}} is used, should we allow order-dependent aggregate 
> functions such as {{first_value}}?
> Q3. Should we allow {{session}} as a windowed aggregate function?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to