[
https://issues.apache.org/jira/browse/FLINK-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Li closed FLINK-17800.
-------------------------
Resolution: Fixed
Merged the new fix into master via
3516e37ae0aa4ee040b6844f336541315a455ce9
11d45135d85937edd16fb4f8f94ba71f5f794626
1718f50645ddc01d5e2e13cc5627bafe98191fa2
into release-1.11 via
b2e344a46c5d30ad46231d5c6a42bf09d9e8e559
33caa00e8df88565f022d4258148d09c90d9452b
7e1c83ddcf0e5e4417ccf25fd1d0facce9f30e0e
into release-1.10 via
de6f3aa7e5b2e4fcfbed4adeab12d4d519f1e6fb
3f8649e5c7bf731fac3cc5bfd3c5ed466f1dc561
9b45486ccce31ebef3f91dd4e6102efe3c6d51a3
Since all the work is done, closing the JIRA.
> RocksDB optimizeForPointLookup results in missing time windows
> --------------------------------------------------------------
>
> Key: FLINK-17800
> URL: https://issues.apache.org/jira/browse/FLINK-17800
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.10.0, 1.10.1
> Reporter: Yordan Pavlov
> Assignee: Yun Tang
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.11.0, 1.10.2, 1.12.0
>
> Attachments: MissingWindows.scala, MyMissingWindows.scala,
> MyMissingWindows.scala
>
>
> +My Setup:+
> We have been using the _RocksDB_ option of _optimizeForPointLookup_ and
> running version 1.7 for years. Upon upgrading to Flink 1.10 we started
> observing strange behavior of missing time windows in a streaming Flink
> job. For the purpose of testing I experimented with previous Flink versions
> (1.8, 1.9, 1.9.3) and none of them showed the problem.
>
> A sample of the code demonstrating the problem is here:
> {code:java}
> val datastream = env
>   .addSource(KafkaSource.keyedElements(config.kafkaElements,
>     List(config.kafkaBootstrapServer)))
> val result = datastream
>   .keyBy(_ => 1)
>   .timeWindow(Time.milliseconds(1))
>   .print()
> {code}
>
>
> The source consists of 3 streams (being either 3 Kafka partitions or 3 Kafka
> topics); the elements in each of the streams are separately increasing. The
> elements carry increasing event-time timestamps, starting from 1 and
> increasing by 1. The first partition would consist of timestamps 1, 2, 10,
> 15..., the second of 4, 5, 6, 11..., the third of 3, 7, 8, 9...
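>
> With Flink's standard tumbling-window assignment (window start = timestamp - (timestamp mod size)), every distinct timestamp here should land in its own 1 ms window, so any gap in opened windows points to dropped elements rather than window merging. A minimal sketch of that expectation (not the reporter's code; the merged sequence is illustrative):
> {code:scala}
> // Sketch: which 1 ms tumbling window each event-time timestamp falls into,
> // mirroring TimeWindow.getWindowStartWithOffset(ts, 0, size) with offset 0.
> val size = 1L // window size in milliseconds
> val merged = Seq(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 15L)
> val windowStarts = merged.map(ts => ts - (ts % size)).distinct
> // With size = 1 every timestamp is its own window start, so consecutive
> // timestamps must yield consecutive windows; missing starts mean lost elements.
> {code}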
>
> +What I observe:+
> The time windows would open as I expect for the first 127 timestamps. Then
> there would be a huge gap with no opened windows, if the source has many
> elements, then next open window would be having a timestamp in the thousands.
> A gap of hundred of elements would be created with what appear to be 'lost'
> elements. Those elements are not reported as late (if tested with the
> ._sideOutputLateData_ operator). The way we have been using the option is by
> setting in inside the config like so:
> ??etherbi.rocksDB.columnOptions.optimizeForPointLookup=268435456??
> We have been using it for performance reasons as we have huge RocksDB state
> backend.
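>
> For context, applying this option in a Flink job typically goes through a custom options factory rather than a standard config key (the ??etherbi.*?? key above is the reporter's own wrapper). A minimal sketch of that approach, assuming Flink 1.10's RocksDBOptionsFactory API; the checkpoint path is hypothetical:
> {code:scala}
> import java.util
> import org.apache.flink.contrib.streaming.state.{RocksDBOptionsFactory, RocksDBStateBackend}
> import org.rocksdb.{ColumnFamilyOptions, DBOptions}
>
> // Sketch (not the reporter's code): wire optimizeForPointLookup into the
> // RocksDB state backend via an options factory.
> class PointLookupOptionsFactory extends RocksDBOptionsFactory {
>   override def createDBOptions(current: DBOptions,
>       handlesToClose: util.Collection[AutoCloseable]): DBOptions = current
>
>   override def createColumnOptions(current: ColumnFamilyOptions,
>       handlesToClose: util.Collection[AutoCloseable]): ColumnFamilyOptions =
>     current.optimizeForPointLookup(268435456L) // value taken from the config above
> }
>
> val backend = new RocksDBStateBackend("hdfs:///checkpoints") // hypothetical path
> backend.setRocksDBOptions(new PointLookupOptionsFactory)
> {code}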
--
This message was sent by Atlassian Jira
(v8.3.4#803005)