Hello Lei,

Thanks for the proposal. I've just made a quick pass over it and I have a
question:

Session windows are defined per key. Does that mean that each incoming
record for a key can dynamically change the gap of the window? For example,
say you have the following records for the same key arriving in order,
where the first value is the timestamp of the record and the second value
is the extracted gap:

(10, 10), (19, 5), ...


When we receive the first record at time 10, the gap is extracted as 10,
and hence the window will expire at 20 if no other record is received.
When we receive the second record at time 19, the gap is modified to 5, and
hence the window will expire at 24 if no other record is received.
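
To make the arithmetic concrete, here is a minimal Python sketch of my
reading of the proposal. It assumes that the most recent record's gap
replaces the previous one, and that each record arrives before the current
expiry. The function name expiry_trace is hypothetical and not part of the
KIP or of Kafka Streams.

```python
def expiry_trace(records):
    """Return the session expiry time after each (timestamp, gap) record.

    Assumption: every record arrives before the current expiry, and the
    latest record's gap replaces the previous gap.
    """
    trace = []
    for timestamp, gap in records:
        # The window stays open until `gap` time units after this record.
        trace.append(timestamp + gap)
    return trace


trace = expiry_trace([(10, 10), (19, 5)])
print(trace)  # [20, 24]
```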


If that's the case, I'm wondering how out-of-order data can be handled.
Consider this stream:

(10, 10), (19, 5), (15, 3) ...

I.e. you receive a late record with timestamp 15, which shortens the gap
to 3. This means that the window SHOULD actually have expired at 18, and
hence the record (19, 5) should already belong to a new session. Today the
Streams session window implementation does not do "window split", so have
you thought about how it could be extended to support this?
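
The split I have in mind can be sketched as follows, again in Python and
again only under my assumed semantics: sessions are recomputed in timestamp
order, and a record starts a new session when its timestamp is strictly
greater than the current expiry. The function sessionize is hypothetical;
it is not how Streams works today, which is exactly the point.

```python
def sessionize(records):
    """Group (timestamp, gap) records into sessions in timestamp order.

    Assumption: the gap of the latest record (by timestamp) sets the
    expiry, and a record strictly after the expiry opens a new session.
    """
    sessions = []
    expiry = None
    for timestamp, gap in sorted(records):
        if expiry is None or timestamp > expiry:
            # The previous session expired before this record.
            sessions.append([])
        sessions[-1].append((timestamp, gap))
        expiry = timestamp + gap
    return sessions


in_order = sessionize([(10, 10), (19, 5)])
with_late = sessionize([(10, 10), (19, 5), (15, 3)])
print(in_order)   # [[(10, 10), (19, 5)]]
print(with_late)  # [[(10, 10), (15, 3)], [(19, 5)]]
```

Without the late record there is one session; with it, the single session
that was already built would have to be split into two.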

Also, since in your proposal each session window's gap value could be
different, we would need to store this value along with each record. How
would we store it, and what would the upgrade path be if this is not a
compatible change to the on-disk storage format, etc.?



Guozhang



On Wed, Aug 22, 2018 at 10:05 AM, Lei Chen <ley...@gmail.com> wrote:

> Hi All,
>
> I created a KIP to add dynamic gap session window support to Kafka Streams
> DSL.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-362%3A+Support+dynamic+gap+session+window
>
> Please take a look,
>
> Thanks,
> Lei
>



-- 
-- Guozhang
