[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-16 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197195#comment-17197195
 ] 

Arvid Heise commented on FLINK-19109:
-

Merged in release-1.10: 05ff71c813650f65604d960978f93ba82f09a48d

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3
>
>
> Attempting to generate watermarks chained to the Split Reader / 
> ContinuousFileReaderOperator, as in
> {code:java}
> SingleOutputStreamOperator results = env
>   .readTextFile(...)
>   .map(...)
>   .assignTimestampsAndWatermarks(bounded)
>   .keyBy(...)
>   .process(...);{code}
> leads to the watermarks failing to be produced. Breaking the chain, via 
> {{disableOperatorChaining()}} or a {{rebalance}}, works around the bug. Using 
> punctuated watermarks also avoids the issue.
> Looking at this in the debugger reveals that the timer service is being 
> prematurely quiesced.
> In many respects this is FLINK-7666 brought back to life.
> The problem is not present in 1.9.3.
> There's a minimal reproducible example in 
> [https://github.com/alpinegizmo/flink-question-001/tree/bug].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-14 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195670#comment-17195670
 ] 

Arvid Heise commented on FLINK-19109:
-

Since an operator can also register processing time timers without implementing 
{{ProcessingTimeCallback}}, we need to disable any chaining in 1.10.
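
To illustrate why a check for the interface is not sufficient, here is a minimal, self-contained sketch. The types below ({{ProcessingTimeCallback}}, {{TimerService}}, {{CallbackOperator}}, {{LambdaOperator}}) are simplified stand-ins written for this example, not Flink's actual classes. The only point is that a type check on the operator cannot see a timer registered through a lambda.

```java
import java.util.ArrayList;
import java.util.List;

public class TimerDetectionSketch {

    /** Stand-in for a processing-time callback (hypothetical, not Flink's class). */
    interface ProcessingTimeCallback {
        void onProcessingTime(long timestamp);
    }

    /** Stand-in timer service that only records the registered timestamps. */
    static class TimerService {
        final List<Long> registered = new ArrayList<>();

        void registerTimer(long timestamp, ProcessingTimeCallback callback) {
            registered.add(timestamp);
        }
    }

    /** Operator that implements the callback interface itself. */
    static class CallbackOperator implements ProcessingTimeCallback {
        void open(TimerService timers) {
            timers.registerTimer(100L, this);
        }

        @Override
        public void onProcessingTime(long timestamp) {
            // no-op for the sketch
        }
    }

    /** Operator that registers a timer via a lambda and does not implement the interface. */
    static class LambdaOperator {
        void open(TimerService timers) {
            timers.registerTimer(200L, timestamp -> { });
        }
    }

    /** The naive check: does the operator type implement the callback interface? */
    static boolean looksLikeTimerUser(Object operator) {
        return operator instanceof ProcessingTimeCallback;
    }

    public static void main(String[] args) {
        TimerService timers = new TimerService();
        CallbackOperator withInterface = new CallbackOperator();
        LambdaOperator withLambda = new LambdaOperator();
        withInterface.open(timers);
        withLambda.open(timers);

        System.out.println("CallbackOperator detected: " + looksLikeTimerUser(withInterface));
        System.out.println("LambdaOperator detected:   " + looksLikeTimerUser(withLambda));
        System.out.println("Timers actually registered: " + timers.registered.size());
    }
}
```

Both operators register a timer, but only the first one is found by the type check, so a fix that looks only at the operator type would miss the second.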

The best we could do is to add an {{enableChaining}}/{{forceChaining}} to 
{{DataStream}} to allow expert users to force the old behavior at their own 
risk. [~aljoscha], any opinion on this? (It might be something to offer in general.)

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-11 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194350#comment-17194350
 ] 

Arvid Heise commented on FLINK-19109:
-

Added a PR. However, I'm quite positive that we need to disallow all chaining 
of {{ProcessingTimeCallback}} after {{ContinuousFileReaderOperator}}. For now I only 
disabled them in event time, but it might also be relevant for 
processing/ingestion time.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-10 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193662#comment-17193662
 ] 

Arvid Heise commented on FLINK-19109:
-

Yes, it's possible to restrict the fix to the event time characteristic. I'll 
prepare a PR soonish.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-10 Thread David Anderson (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193629#comment-17193629
 ] 

David Anderson commented on FLINK-19109:


I like [~pnowojski]'s idea of disabling chaining for the 
{{ContinuousFileReaderOperator}} in 1.10. This strikes me as a pragmatic 
solution, though it will impact performance for some users who aren't impacted 
by the bug. Would it make sense to try to optimize this a bit, and only disable 
chaining when the time characteristic is event time?
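
A minimal sketch of that optimization, under stated assumptions: the enums and the method {{chainingForFileReader}} below are hypothetical stand-ins invented for this illustration and are not Flink's API. The sketch only shows the decision itself, which is to break the chain after the reader when the job runs in event time and to keep chaining otherwise.

```java
public class ChainingDecisionSketch {

    /** Stand-in for the job's time characteristic (names chosen for this example). */
    enum TimeCharacteristic { PROCESSING_TIME, INGESTION_TIME, EVENT_TIME }

    /** Stand-in for whether downstream operators may be chained to the file reader. */
    enum Chaining { ALLOWED, DISABLED }

    /** Disables chaining after the file reader only for event-time jobs. */
    static Chaining chainingForFileReader(TimeCharacteristic timeCharacteristic) {
        return timeCharacteristic == TimeCharacteristic.EVENT_TIME
                ? Chaining.DISABLED
                : Chaining.ALLOWED;
    }

    public static void main(String[] args) {
        for (TimeCharacteristic tc : TimeCharacteristic.values()) {
            System.out.println(tc + " -> " + chainingForFileReader(tc));
        }
    }
}
```

With this rule, jobs that do not use event time keep the performance benefit of chaining, at the cost of leaving them unprotected if they turn out to be affected as well.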

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-10 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193526#comment-17193526
 ] 

Piotr Nowojski commented on FLINK-19109:


As I wrote, I don't think we can backport the proper fix. 
[~roman_khachatryan]'s fix looks like it can deadlock. Maybe that could be solved, 
but it's still hacky and risky. If you insist on fixing it in 1.10, I would 
suggest disabling chaining for the {{ContinuousFileReaderOperator}}.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-10 Thread Aljoscha Krettek (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193484#comment-17193484
 ] 

Aljoscha Krettek commented on FLINK-19109:
--

I think we definitely need to find a fix for 1.10.x, as David said. This is 
quite critical, and users (myself included) can waste quite some time diagnosing 
what is happening.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-07 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191818#comment-17191818
 ] 

Roman Khachatryan commented on FLINK-19109:
---

Thanks [~zjwang] for merging the PR.

 

Regarding 1.10, I tried to use yield in this draft PR: 
[https://github.com/apache/flink/pull/13342]

I agree with [~pnowojski] that it can destabilize 1.10: yield() can lead to an 
indefinite wait for a mail, particularly because the timer service can be shut 
down prematurely in 1.10.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-07 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191651#comment-17191651
 ] 

Piotr Nowojski commented on FLINK-19109:


[~alpinegizmo] those two PRs that the fix depends on are quite big changes 
(+2000 lines of code), with some follow-up bug fixes. Backporting them could 
mean destabilising the 1.10.x release branch, especially given that our release 
testing for minor changes is not that good. Furthermore, they change the 
behaviour of the system: for example, the [ContinuousFileReaderOperator 
changes|https://github.com/apache/flink/pull/10435] will affect 
performance (hopefully improving it), while the other one, FLINK-14228, changes 
the semantics of timers during shutdown.

Is there another way to fix the problem, [~roman_khachatryan]?

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-07 Thread Zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191577#comment-17191577
 ] 

Zhijiang commented on FLINK-19109:
--

[~zhuzh], [~roman_khachatryan], I will merge it into 1.11 after Azure passes; I 
think it can be done later today.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-07 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191537#comment-17191537
 ] 

Zhu Zhu commented on FLINK-19109:
-

Thanks for the updates! [~roman_khachatryan]

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-07 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191533#comment-17191533
 ] 

Roman Khachatryan commented on FLINK-19109:
---

Hi Zhu Zhu,

Yes, it will be merged today (I asked [~zjwang] to do that).

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-07 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191487#comment-17191487
 ] 

Zhu Zhu commented on FLINK-19109:
-

Hi [~roman_khachatryan], do you think we can get the fix merged today? 
I'm asking this because it is the deadline to cut 1.11.2 RC1 later today.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-03 Thread David Anderson (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190180#comment-17190180
 ] 

David Anderson commented on FLINK-19109:


I think we should try harder to find a fix for 1.10. 1.10 is still supported, 
and while there are workarounds this is a difficult problem to diagnose. Not 
all users will find these workarounds on their own.

I tracked this down after hearing from a user who wasted a day trying to figure 
out what was going on with an event time streaming app where none of the 
windows were closing until the end of the job. Getting from there to the actual 
problem requires a pretty solid understanding of the internals. 

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-02 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189373#comment-17189373
 ] 

Roman Khachatryan commented on FLINK-19109:
---

I've published a [PR|https://github.com/apache/flink/pull/13305] to fix the 
issue in 1.12 and 1.11.

For 1.10, the fix would be problematic as it would likely require backporting 
[changes to 
ContinuousFileReaderOperator|https://github.com/apache/flink/pull/10435] and 
[OperatorChain / timers|https://github.com/apache/flink/pull/10151].  Given 
that there are several workarounds, I think the fix for 1.11 and 1.12 will 
suffice.

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.12.0, 1.11.2, 1.10.3


[jira] [Commented] (FLINK-19109) Split Reader eats chained periodic watermarks

2020-09-01 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188635#comment-17188635
 ] 

Roman Khachatryan commented on FLINK-19109:
---

I see that with chaining enabled, TimestampsAndWatermarksOperator works as 
expected until ContinuousFileReaderOperator starts reading the first elements. 
After that, it schedules a timer which is executed with a 1-2 second delay. 

This delay is caused by MailboxProcessor not picking up the mail of an already 
fired timer (the timer services are OK).

This seems reasonable, since the priority of an operator's MailboxExecutor is 
defined by its chain index.

(The chain is ContinuousFileReaderOperator -> Map -> 
TimestampsAndWatermarksOperator.)

[~pnowojski], does this make sense to you?

Do you have any idea how to fix this?
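
The priority effect described in this comment can be modelled with a small, self-contained sketch. Everything in it is an assumption made for illustration: {{Mail}}, {{Mailbox}} and {{drain}} are toy stand-ins rather than Flink's classes, the rule that a drain only runs mails whose priority is at least the caller's is assumed, and so are the concrete priority numbers. This thread does not state which end of the chain gets the higher priority in Flink.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class MailboxPrioritySketch {

    /** A unit of work queued for the task thread (toy stand-in). */
    static final class Mail {
        final int priority;
        final String description;

        Mail(int priority, String description) {
            this.priority = priority;
            this.description = description;
        }
    }

    /** FIFO mailbox whose consumers can filter by a minimum priority. */
    static final class Mailbox {
        private final Deque<Mail> queue = new ArrayDeque<>();

        void put(Mail mail) {
            queue.addLast(mail);
        }

        /** Removes and returns the oldest mail with priority >= minPriority, or null. */
        Mail tryTake(int minPriority) {
            Iterator<Mail> it = queue.iterator();
            while (it.hasNext()) {
                Mail mail = it.next();
                if (mail.priority >= minPriority) {
                    it.remove();
                    return mail;
                }
            }
            return null;
        }

        int size() {
            return queue.size();
        }
    }

    /** Runs every mail visible at the given minimum priority; returns what ran. */
    static List<String> drain(Mailbox mailbox, int minPriority) {
        List<String> ran = new ArrayList<>();
        Mail mail;
        while ((mail = mailbox.tryTake(minPriority)) != null) {
            ran.add(mail.description);
        }
        return ran;
    }

    public static void main(String[] args) {
        final int readerPriority = 2;     // assumed value for the file reader
        final int watermarkPriority = 0;  // assumed value for the watermark operator

        Mailbox mailbox = new Mailbox();
        mailbox.put(new Mail(watermarkPriority, "emit periodic watermark"));
        mailbox.put(new Mail(readerPriority, "read next split"));

        // A drain at the reader's priority skips the already-fired timer mail.
        System.out.println("reader-level drain ran: " + drain(mailbox, readerPriority));
        System.out.println("mails still pending:    " + mailbox.size());
        // Only a drain without the priority filter picks it up.
        System.out.println("full drain ran:         " + drain(mailbox, 0));
    }
}
```

In this model the timer has fired and its mail is queued, yet it stays pending for as long as the task thread only drains at the reader's priority, which matches the delayed watermark described here.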

> Split Reader eats chained periodic watermarks
> -
>
> Key: FLINK-19109
> URL: https://issues.apache.org/jira/browse/FLINK-19109
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.10.0, 1.10.1, 1.11.0, 1.10.2, 1.11.1
>Reporter: David Anderson
>Assignee: Roman Khachatryan
>Priority: Critical