GitHub user tzulitai opened a pull request:
https://github.com/apache/flink/pull/4967
[FLINK-8001] [kafka] Prevent PeriodicWatermarkEmitter from violating …
## What is the purpose of the change
Prior to this PR, if a Kafka consumer subtask initially marked itself as
idle because it had no partitions to subscribe to, that idle status was
violated as soon as the `PeriodicWatermarkEmitter` fired.
The problem is that the `PeriodicWatermarkEmitter` incorrectly emits a
`Long.MAX_VALUE` watermark even when there are no partitions to subscribe to.
This commit fixes this by additionally ensuring that the aggregated watermark
in the `PeriodicWatermarkEmitter` is an effective one (i.e., is really
aggregated from some partition).
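
For reviewers who want the idea in isolation, here is a minimal, self-contained sketch of the "effective aggregation" check described above. It is not the code in this PR: the class `WatermarkAggregationSketch`, the method `minWatermark`, and the use of a plain `long[]` in place of the fetcher's partition states are all assumptions made for illustration.

```java
import java.util.OptionalLong;

// Hypothetical stand-in for the aggregation step of a periodic watermark emitter.
final class WatermarkAggregationSketch {

    /**
     * Returns the minimum watermark across all partitions, or an empty
     * result when no partition contributed to the aggregation.
     */
    static OptionalLong minWatermark(long[] partitionWatermarks) {
        long minAcrossAll = Long.MAX_VALUE;
        // Stays false when the loop body never runs, i.e. there are no partitions.
        boolean isEffective = false;

        for (long watermark : partitionWatermarks) {
            minAcrossAll = Math.min(minAcrossAll, watermark);
            isEffective = true;
        }

        // Without this guard, an empty input would leak Long.MAX_VALUE downstream.
        return isEffective ? OptionalLong.of(minAcrossAll) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // No partitions: nothing should be emitted.
        System.out.println(minWatermark(new long[0]).isPresent()); // prints false
        // Three partitions: the minimum is the aggregated watermark.
        System.out.println(minWatermark(new long[] {5L, 3L, 9L}).getAsLong()); // prints 3
    }
}
```

A caller would emit a watermark only when the result is present, which leaves an idle subtask's status untouched.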
## Brief change log
This PR contains a single commit, which addresses the issue described above.
## Verifying this change
A new test is added to `AbstractFetcherTest`. It verifies that when there are
no subscribed partitions and the periodic watermark emitter fires, no watermark
is emitted.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? n/a
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tzulitai/flink FLINK-8001
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4967.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4967
----
commit d81d685d57625f9aac44f721f4dc993432ef0399
Author: Tzu-Li (Gordon) Tai <[email protected]>
Date: 2017-11-07T11:35:33Z
[FLINK-8001] [kafka] Prevent PeriodicWatermarkEmitter from violating IDLE
status
Prior to this commit, a bug exists such that if a Kafka consumer subtask
initially marks itself as idle because it didn't have any partitions to
subscribe to, that idleness status will be violated when the
PeriodicWatermarkEmitter is fired.
The problem is that the PeriodicWatermarkEmitter incorrectly yields a
Long.MAX_VALUE watermark even when there are no partitions to subscribe
to. This commit fixes this by additionally ensuring that the aggregated
watermark in the PeriodicWatermarkEmitter is an effective one (i.e., is
really aggregated from some partition).
----