[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-04-13 Thread tzulitai
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/5634 @juhoautio Your concerns about this fix are quite correct, and are why this PR was closed in the first place, as there are a lot of ill-defined semantics introduced by this. Regarding

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-04-13 Thread juhoautio
Github user juhoautio commented on the issue: https://github.com/apache/flink/pull/5634 @tzulitai did you ever test your code? I tried it and it allowed watermarks to proceed but apparently too aggressively, as it caused a lot of data loss. I'm looking for a quick fix for

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-08 Thread StephanEwen
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5634 Just saw a good comment from @EronWright > I think the ideal would be that idleness would occur only for tail reads, i.e. due to a timeout from `kafkaConsumer.poll(pollTimeout)`. In

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-07 Thread tweise
Github user tweise commented on the issue: https://github.com/apache/flink/pull/5634 This is a good proposal, it should also survive a general connector refactor that will be necessary to address other code duplication. The Kinesis ticket is

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-07 Thread StephanEwen
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5634 @tweise @tzulitai I would suggest solving this the following way, which should be both simple and cover our cases: - We extend the current periodic watermark generators for

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-07 Thread tweise
Github user tweise commented on the issue: https://github.com/apache/flink/pull/5634 @tzulitai @StephanEwen the current idleness detection in the source context isn't a replacement for what is required to deal with an inactive partition (or Kinesis shard). When a connector subtask

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-07 Thread tzulitai
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/5634 I'm closing this PR now, as it seems the overall agreement is that we want to implement this differently, or at least not touch the Kafka connector code any more now. ---

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-06 Thread tzulitai
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/5634 A few points to mention: 1. @StephanEwen there is already common idleness detection implemented within `SourceContext`s, see `StreamSourceContexts`. The idleness detection, however, is

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-06 Thread tweise
Github user tweise commented on the issue: https://github.com/apache/flink/pull/5634 There was a related discussion on the mailing list; this and several other features could be provided by a common connector framework. Such an initiative is a much larger effort, though, and it is not

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-06 Thread StephanEwen
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/5634 I would suggest approaching this in a different way. 1. Idleness detection is something that watermark generation benefits from in general, not just in Kafka 2. Unless there is a

[GitHub] flink issue #5634: [FLINK-5479] [kafka] Idleness detection for periodic per-...

2018-03-05 Thread EronWright
Github user EronWright commented on the issue: https://github.com/apache/flink/pull/5634 I think the ideal would be that idleness would occur only for tail reads, i.e. due to a timeout from `kafkaConsumer.poll(pollTimeout)`. In other words, an intermittent connection issue would