[ 
https://issues.apache.org/jira/browse/FLINK-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ori Popowski updated FLINK-16929:
---------------------------------
    Description: 
We have a Flink job that keys the stream by session ID (sId) and uses a session
window with a 30-minute gap:
{code:java}
inputStream
    .keyBy(keySelector)
    .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
    .allowedLateness(Time.seconds(0L))
{code}
This Flink job reads from a Kinesis stream.

Lately (I suspect since upgrading from 1.5.4 to 1.9.1) we get too many
sessions, separated by gaps of only a few seconds (instead of 30 minutes).
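
To cross-check the raw events against the derived sessions outside Flink, a minimal offline sketch like the one below may help. It is plain Java, not Flink code, and it does not claim to reproduce Flink's internal window merging. The names SessionGapCheck and sessionize and the sample timestamps are hypothetical. It assumes that, for a single key, a new session should start only when an event is more than the gap after the previous event in event time; how an event exactly on the gap boundary is treated is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Offline check: split one key's event timestamps into sessions by gap. */
final class SessionGapCheck {

    /**
     * Returns one {first, last} event-timestamp pair (in ms) per session.
     * A new session starts only when an event is more than gapMillis
     * after the previous event (assumption, see the text above).
     */
    static List<long[]> sessionize(long[] timestampsMillis, long gapMillis) {
        long[] sorted = timestampsMillis.clone();
        Arrays.sort(sorted); // event-time order, regardless of arrival order
        List<long[]> sessions = new ArrayList<>();
        for (long ts : sorted) {
            if (!sessions.isEmpty()) {
                long[] last = sessions.get(sessions.size() - 1);
                if (ts - last[1] <= gapMillis) {
                    last[1] = ts; // still within the gap: extend the current session
                    continue;
                }
            }
            sessions.add(new long[] {ts, ts}); // gap exceeded: open a new session
        }
        return sessions;
    }

    public static void main(String[] args) {
        long gap = 30L * 60L * 1000L; // 30 minutes, as in the job above
        // Hypothetical timestamps for one sId: three events seconds apart, one an hour later.
        long[] events = {0L, 4_000L, 9_000L, 3_609_000L};
        for (long[] s : sessionize(events, gap)) {
            System.out.println(s[0] + " .. " + s[1]);
        }
    }
}
```

Under these assumptions, events for one sId that are only seconds apart fall into the same session. If the job emits more sessions for a key than this check does, the extra splits come from the pipeline and not from the event timestamps themselves.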

We have no idea why it's happening and suspect a Flink bug or a state backend 
bug (we use RocksDB).

I haven't found any indication in the logs except for some read throughput 
warnings which were resolved by a backoff.

Attached are a table of the derived sessions and a table of the raw events.

*Sessions*

!image-2020-04-01-19-50-06-326.png|width=896,height=599!

 

*Events*

 

!image-2020-04-01-19-50-23-954.png|width=280,height=617!


> Session Window produces sessions randomly
> -----------------------------------------
>
>                 Key: FLINK-16929
>                 URL: https://issues.apache.org/jira/browse/FLINK-16929
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.9.1
>            Reporter: Ori Popowski
>            Priority: Major
>         Attachments: image-2020-04-01-19-50-23-954.png, screenshot-1.png
>
>
> We have a Flink job that keys the stream by session ID (sId) and uses a session
> window with a 30-minute gap:
> {code:java}
> inputStream
>     .keyBy(keySelector)
>     .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
>     .allowedLateness(Time.seconds(0L))
> {code}
> This Flink job reads from a Kinesis stream.
> Lately (I suspect since upgrading from 1.5.4 to 1.9.1) we get too many
> sessions, separated by gaps of only a few seconds (instead of 30 minutes).
> We have no idea why it's happening and suspect a Flink bug or a state backend 
> bug (we use RocksDB).
> I haven't found any indication in the logs except for some read throughput 
> warnings which were resolved by a backoff.
> Attached are a table of the derived sessions and a table of the raw events.
> *Sessions*
> !image-2020-04-01-19-50-06-326.png|width=896,height=599!
>  
> *Events*
>  
> !image-2020-04-01-19-50-23-954.png|width=280,height=617!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
