Github user tzulitai commented on the issue:
https://github.com/apache/flink/pull/5634
@juhoautio
Your concerns for this fix is quite correct, and is why this PR was closed
in the first place as there are a lot of ill-defined semantics introduced by
this.
Regarding
Github user juhoautio commented on the issue:
https://github.com/apache/flink/pull/5634
@tzulitai did you ever test your code? I tried it and it allowed watermarks
to proceed but apparently too aggressively, as it caused a lot of data loss.
I'm looking for a quick fix for
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/5634
Just saw a good comment from @EronWright
> I think the ideal would be that idleness would occur only for tail reads,
i.e. due to a timeout from `kafkaConsumer.poll(pollTimeout)`. In
Github user tweise commented on the issue:
https://github.com/apache/flink/pull/5634
This is a good proposal, it should also survive a general connector
refactor that will be necessary to address other code duplication. The Kinesis
ticket is
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/5634
@tweise @tzulitai
I would suggest to solve this the following way, which should be both
simple and cover our cases:
- We extend the current periodic watermark generators for
Github user tweise commented on the issue:
https://github.com/apache/flink/pull/5634
@tzulitai @StephanEwen the current idleness detection in the source context
isn't a replacement for what is required to deal with an inactive partition (or
Kinesis shard). When a connector subtask
Github user tzulitai commented on the issue:
https://github.com/apache/flink/pull/5634
I'm closing this PR now, as it seems the overall agreement is that we want
to implement this differently, or at least not touch the Kafka connector code
any more now.
---
Github user tzulitai commented on the issue:
https://github.com/apache/flink/pull/5634
A few points to mention:
1. @StephanEwen there is already common idleness detection implemented
within `SourceContext`s, see `StreamSourceContexts`. The idleness detection,
however, is
Github user tweise commented on the issue:
https://github.com/apache/flink/pull/5634
There was a related discussion on the mailing list; this and several other
features could be provided by a common connector framework. Such initiative is
a much larger effort though and it is not
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/5634
I would suggest to approach this in a different way.
1. Idleness detection is something that watermark generation benefits
from in general, not just in Kafka
2. Unless there is a
Github user EronWright commented on the issue:
https://github.com/apache/flink/pull/5634
I think the ideal would be that idleness would occur only for tail reads,
i.e. due to a timeout from `kafkaConsumer.poll(pollTimeout)`.In other
words, an intermittent connection issue would
11 matches
Mail list logo