[
https://issues.apache.org/jira/browse/FLINK-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Li closed FLINK-17800.
-------------------------
Resolution: Fixed
Merged the new fix into master via
3516e37ae0aa4ee040b6844f336541315a455ce9
11d45135d85937edd16fb4f8f94ba71f5f794626
1718f50645ddc01d5e2e13cc5627bafe98191fa2
into release-1.11 via
b2e344a46c5d30ad46231d5c6a42bf09d9e8e559
33caa00e8df88565f022d4258148d09c90d9452b
7e1c83ddcf0e5e4417ccf25fd1d0facce9f30e0e
into release-1.10 via
de6f3aa7e5b2e4fcfbed4adeab12d4d519f1e6fb
3f8649e5c7bf731fac3cc5bfd3c5ed466f1dc561
9b45486ccce31ebef3f91dd4e6102efe3c6d51a3
Since all the work is done, closing the JIRA.
> RocksDB optimizeForPointLookup results in missing time windows
> --------------------------------------------------------------
>
> Key: FLINK-17800
> URL: https://issues.apache.org/jira/browse/FLINK-17800
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.10.0, 1.10.1
> Reporter: Yordan Pavlov
> Assignee: Yun Tang
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.11.0, 1.10.2, 1.12.0
>
> Attachments: MissingWindows.scala, MyMissingWindows.scala,
> MyMissingWindows.scala
>
>
> +My Setup:+
> We have been using the _RocksDB_ option of _optimizeForPointLookup_ and
> running version 1.7 for years. Upon upgrading to Flink 1.10 we started
> observing strange behavior of missing time windows in a streaming Flink
> job. For the purpose of testing I experimented with previous Flink versions
> (1.8, 1.9, 1.9.3) and none of them showed the problem.
>
> A sample of the code demonstrating the problem is here:
> {code:java}
> val datastream = env
>   .addSource(KafkaSource.keyedElements(config.kafkaElements,
>     List(config.kafkaBootstrapServer)))
> val result = datastream
>   .keyBy(_ => 1)
>   .timeWindow(Time.milliseconds(1))
>   .print()
> {code}
>
>
> The source consists of 3 streams (being either 3 Kafka partitions or 3 Kafka
> topics); the elements in each of the streams are separately increasing. The
> elements carry increasing event-time timestamps, starting from 1 and
> increasing by 1. The first partition would consist of timestamps 1, 2, 10,
> 15..., the second of 4, 5, 6, 11..., the third of 3, 7, 8, 9...
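>
> With Flink's standard tumbling-window assignment (window start = timestamp - (timestamp mod size)), every distinct timestamp here should land in its own 1 ms window, so any gap in opened windows points to dropped elements rather than window merging. A minimal sketch of that expectation (not the reporter's code; the merged sequence is illustrative):
> {code:scala}
> // Sketch: which 1 ms tumbling window each event-time timestamp falls into,
> // mirroring TimeWindow.getWindowStartWithOffset(ts, 0, size) with offset 0.
> val size = 1L // window size in milliseconds
> val merged = Seq(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 15L)
> val windowStarts = merged.map(ts => ts - (ts % size)).distinct
> // With size = 1 every timestamp is its own window start, so consecutive
> // timestamps must yield consecutive windows; missing starts mean lost elements.
> {code}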
>
> +What I observe:+
> The time windows would open as I expect for the first 127 timestamps. Then
> there would be a huge gap with no opened windows, if the source has many
> elements, then next open window would be having a timestamp in the thousands.
> A gap of hundred of elements would be created with what appear to be 'lost'
> elements. Those elements are not reported as late (if tested with the
> ._sideOutputLateData_ operator). The way we have been using the option is by
> setting in inside the config like so:
> ??etherbi.rocksDB.columnOptions.optimizeForPointLookup=268435456??
> We have been using it for performance reasons as we have huge RocksDB state
> backend.
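>
> For context, applying this option in a Flink job typically goes through a custom options factory rather than a standard config key (the ??etherbi.*?? key above is the reporter's own wrapper). A minimal sketch of that approach, assuming Flink 1.10's RocksDBOptionsFactory API; the checkpoint path is hypothetical:
> {code:scala}
> import java.util
> import org.apache.flink.contrib.streaming.state.{RocksDBOptionsFactory, RocksDBStateBackend}
> import org.rocksdb.{ColumnFamilyOptions, DBOptions}
>
> // Sketch (not the reporter's code): wire optimizeForPointLookup into the
> // RocksDB state backend via an options factory.
> class PointLookupOptionsFactory extends RocksDBOptionsFactory {
>   override def createDBOptions(current: DBOptions,
>       handlesToClose: util.Collection[AutoCloseable]): DBOptions = current
>
>   override def createColumnOptions(current: ColumnFamilyOptions,
>       handlesToClose: util.Collection[AutoCloseable]): ColumnFamilyOptions =
>     current.optimizeForPointLookup(268435456L) // value taken from the config above
> }
>
> val backend = new RocksDBStateBackend("hdfs:///checkpoints") // hypothetical path
> backend.setRocksDBOptions(new PointLookupOptionsFactory)
> {code}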
--
This message was sent by Atlassian Jira
(v8.3.4#803005)