reswqa opened a new pull request, #21415: URL: https://github.com/apache/flink/pull/21415
## What is the purpose of the change *In order to solve the problem that data cannot be read from the disk correctly after failover, we changed the calculation logical of the buffer's readable state in [FLINK-29238](https://issues.apache.org/jira/browse/FLINK-29238). Buffers that are greater than consumingOffset and have been released can be pre-load from file. However, the update of consumingOffset is asynchronous, If it lags behind the actual consumption progress, the buffer will have a chance to be load from the disk again. IMO, we can record the consumed status of buffer by each consumer in the InternalRegion. Only the buffers that have not been consumed and have been released will be considered as readable. In the case of failover, a new consumerId will be generated, so all buffers will be considered as unconsumed and can be correctly read from the disk too.* ## Brief change log - *`HsFileDataIndex` maintains the buffer's consumed state of each consumer.* - *`ReadableRegion` is no longer decided by consuming offset.* ## Verifying this change This change is already covered by existing tests ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
