[jira] [Closed] (FLINK-28131) FLIP-168: Speculative Execution for Batch Job

2022-09-26 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28131.
---
Release Note: 
Speculative execution (FLIP-168) is introduced in Flink 1.16 to mitigate batch 
job slowness caused by problematic nodes. A problematic node may have hardware 
problems, unexpectedly busy I/O, or high CPU load. These problems may make the 
hosted tasks run much slower than tasks on other nodes, and affect the overall 
execution time of a batch job.

When speculative execution is enabled, Flink keeps detecting slow tasks. Once 
slow tasks are detected, the nodes on which they are located are identified as 
problematic nodes and get blocked via the blocklist mechanism (FLIP-224). The 
scheduler creates new attempts for the slow tasks and deploys them to nodes 
that are not blocked, while the existing attempts keep running. The new 
attempts process the same input data and produce the same data as the original 
attempt. The first attempt to finish is admitted as the only finished attempt 
of the task, and the remaining attempts of the task are canceled.

Most existing sources can work with speculative execution (FLIP-245). A source 
must implement SupportsHandleExecutionAttemptSourceEvent to support 
speculative execution only if it uses SourceEvent. Sinks do not support 
speculative execution yet, so speculative execution will not happen on sinks 
at the moment.

The Web UI and REST API are also improved (FLIP-249) to display multiple 
concurrent attempts of tasks and blocked task managers.
  Resolution: Done
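
The attempt handling described in the release note can be illustrated with a 
small, self-contained Java sketch. It is not Flink's scheduler code: the class 
SpeculativeAttemptsSketch, the method firstFinishedAttempt, and the sleep 
durations are assumptions made for illustration. Unlike Flink, which starts a 
new attempt only after a slow task has been detected, the sketch launches all 
attempts at once. It only shows the rule that the first attempt to finish wins 
and the remaining attempts are canceled, here via the JDK's 
ExecutorService.invokeAny.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of concurrent attempts of one task; not Flink's scheduler code. */
final class SpeculativeAttemptsSketch {

    /**
     * Starts one attempt per entry of attemptDelaysMillis and returns the
     * number of the attempt that finishes first. invokeAny cancels the
     * attempts that have not completed when it returns.
     */
    static int firstFinishedAttempt(long[] attemptDelaysMillis) {
        List<Callable<Integer>> attempts = new ArrayList<>();
        for (int i = 0; i < attemptDelaysMillis.length; i++) {
            final int attemptNumber = i;
            final long delay = attemptDelaysMillis[i];
            attempts.add(() -> {
                Thread.sleep(delay); // stands in for processing the same input
                return attemptNumber;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(attempts.size());
        try {
            return pool.invokeAny(attempts);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupt whatever is still running
        }
    }

    public static void main(String[] args) {
        // Attempt 0 runs on a hypothetical slow node, attempt 1 is the new attempt.
        int winner = firstFinishedAttempt(new long[] {5_000L, 50L});
        System.out.println("finished attempt: " + winner);
    }
}
```

With the delays used in main, the second attempt (number 1) finishes first and 
the slow original attempt is interrupted instead of running to completion.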

> FLIP-168: Speculative Execution for Batch Job
> -
>
> Key: FLINK-28131
> URL: https://issues.apache.org/jira/browse/FLINK-28131
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is helpful to mitigate slow tasks caused by 
> problematic nodes. The basic idea is to start mirror tasks on other nodes 
> when a slow task is detected. The mirror task processes the same input data 
> and produces the same data as the original task. 
> More details can be found in 
> [FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job].
>  
> This is the umbrella ticket to track all the changes of this feature.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-28131) FLIP-168: Speculative Execution for Batch Job

2022-09-26 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reopened FLINK-28131:
-

> FLIP-168: Speculative Execution for Batch Job
> -
>
> Key: FLINK-28131
> URL: https://issues.apache.org/jira/browse/FLINK-28131
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is helpful to mitigate slow tasks caused by 
> problematic nodes. The basic idea is to start mirror tasks on other nodes 
> when a slow task is detected. The mirror task processes the same input data 
> and produces the same data as the original task. 
> More details can be found in 
> [FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job].
>  
> This is the umbrella ticket to track all the changes of this feature.





[jira] [Closed] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-09-02 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28981.
---
Resolution: Done

> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Yunhong Zheng
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write Flink jobs in which some {{source}} subtasks run much slower than 
> others. Three kinds of sources should be verified:
>  -- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  -- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  -- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job and check the web UI, logs, metrics and produced result.
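
For the configuration step, the following Java sketch renders the two entries 
that, to my recollection of the Flink 1.16 documentation, enable speculative 
execution. The option keys are an assumption and should be checked against the 
documentation linked above (later releases may use different key names); the 
class SpeculativeConfigSketch and its methods are purely illustrative helpers 
and not part of Flink.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Renders configuration entries as "key: value" lines; illustrative only. */
final class SpeculativeConfigSketch {

    /** Option keys as recalled from the Flink 1.16 docs (assumption, verify). */
    static Map<String, String> speculativeOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        // Assumed: speculative execution requires the adaptive batch scheduler.
        options.put("jobmanager.scheduler", "AdaptiveBatch");
        // Assumed: switch that turns speculative execution on.
        options.put("jobmanager.adaptive-batch-scheduler.speculative.enabled", "true");
        return options;
    }

    /** Formats the options in the flat "key: value" style of flink-conf.yaml. */
    static String toYaml(Map<String, String> options) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : options.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toYaml(speculativeOptions()));
    }
}
```

The printed lines could be appended to the configuration file of a test 
cluster before the job is submitted.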





[jira] [Commented] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-09-02 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17599499#comment-17599499
 ] 

Zhu Zhu commented on FLINK-28981:
-

Thanks for helping with this release testing! [~337361...@qq.com]

> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Yunhong Zheng
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write Flink jobs in which some {{source}} subtasks run much slower than 
> others. Three kinds of sources should be verified:
>  -- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  -- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  -- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job and check the web UI, logs, metrics and produced result.





[jira] [Commented] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598711#comment-17598711
 ] 

Zhu Zhu commented on FLINK-28980:
-

Thanks for helping with the release testing! [~SleePy]

> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Biao Liu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
> Attachments: flink-root-standalonesession-0-VM_38_195_centos.log, 
> flink-root-taskexecutor-0-VM_199_24_centos.log, 
> flink-root-taskexecutor-0-VM_38_195_centos.log, screenshot1, screenshot2, 
> stdout
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write a Flink job which has a subtask running much slower than others 
> (e.g. let it sleep indefinitely if it runs on a certain host, where the 
> hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or 
> if its (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job and check the web UI, logs, metrics and produced result.
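
The two ways of making a subtask slow that the description suggests could be 
sketched as below. The class SlowSubtaskSketch and its method names are 
hypothetical; in a real test job the subtask index and attempt number would 
come from Flink (for example RuntimeContext#getIndexOfThisSubtask() and 
RuntimeContext#getAttemptNumber() inside a rich function), which this 
stand-alone sketch does not depend on.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Helpers a test job could call to decide whether to stall; illustrative only. */
final class SlowSubtaskSketch {

    /** True when (subtaskIndex + attemptNumber) % 2 == 0, as in the ticket. */
    static boolean isSlowAttempt(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** True when this JVM runs on the host chosen to act as the slow node. */
    static boolean isOnSlowHost(String slowHostName) {
        try {
            return InetAddress.getLocalHost().getHostName().equals(slowHostName);
        } catch (UnknownHostException e) {
            return false; // local host cannot be resolved: do not slow down
        }
    }

    /** Blocks the calling thread until it is interrupted, i.e. canceled. */
    static void sleepUntilCanceled() throws InterruptedException {
        while (true) {
            Thread.sleep(1_000L);
        }
    }

    public static void main(String[] args) {
        for (int subtask = 0; subtask < 2; subtask++) {
            for (int attempt = 0; attempt < 2; attempt++) {
                System.out.println("subtask " + subtask + ", attempt " + attempt
                        + ", slow: " + isSlowAttempt(subtask, attempt));
            }
        }
    }
}
```

With the parity rule, attempt 0 of every even-numbered subtask is slow while 
its attempt 1 is not, so a speculative attempt of such a subtask can finish 
and the job still terminates.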





[jira] [Closed] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-31 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28980.
---
Resolution: Done

> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Biao Liu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
> Attachments: flink-root-standalonesession-0-VM_38_195_centos.log, 
> flink-root-taskexecutor-0-VM_199_24_centos.log, 
> flink-root-taskexecutor-0-VM_38_195_centos.log, screenshot1, screenshot2, 
> stdout
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write a Flink job which has a subtask running much slower than others 
> (e.g. let it sleep indefinitely if it runs on a certain host, where the 
> hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or 
> if its (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job and check the web UI, logs, metrics and produced result.





[jira] [Updated] (FLINK-28907) Flink docs do not compile locally

2022-08-31 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28907:

Fix Version/s: (was: 1.16.0)

> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Closed] (FLINK-28907) Flink docs do not compile locally

2022-08-31 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28907.
---
Resolution: Won't Fix

> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Commented] (FLINK-28907) Flink docs do not compile locally

2022-08-31 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598246#comment-17598246
 ] 

Zhu Zhu commented on FLINK-28907:
-

Got it! Thanks for the explanation.

> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Commented] (FLINK-28907) Flink docs do not compile locally

2022-08-30 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598146#comment-17598146
 ] 

Zhu Zhu commented on FLINK-28907:
-

Thanks for looking into this! [~martijnvisser]

Do you mean that the step below should happen, but when there is a connection 
issue it does not happen and no error about the connection problem is 
reported?
> go: downloading github.com/apache/flink-connector-elasticsearch ...






> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Closed] (FLINK-28940) Release Testing: Verify FLIP-248 Dynamic Partition Prunning

2022-08-29 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28940.
---
Resolution: Done

> Release Testing: Verify FLIP-248 Dynamic Partition Prunning
> ---
>
> Key: FLINK-28940
> URL: https://issues.apache.org/jira/browse/FLINK-28940
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner, Table SQL / Runtime
>Affects Versions: 1.16.0
>Reporter: godfrey he
>Assignee: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> This issue aims to verify FLIP-248: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning
> We can verify it in SQL client after we build the flink-dist package.
> 1. Create a partitioned table and a non-partitioned table (only the Hive 
> connector is supported now, or we need to write a new collector), and then 
> insert some data.
> 2. Show the explain result for a join query in which one side is a 
> partitioned table and the other side is a non-partitioned table with a 
> filter, such as the example in the FLIP doc: select * from store_returns, 
> date_dim where sr_returned_date_sk = d_date_sk and d_year = 2000. The explain 
> result should contain a `DynamicFilteringDataCollector` node. We can also 
> verify the plan for variants of the above query.
> 3. Execute the above plan and verify the execution result (it should be the 
> same as the result of the plan with dynamic filtering disabled via 
> table.optimizer.dynamic-filtering.enabled).
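
The idea being verified can be pictured with a toy, in-memory Java sketch 
built around the example query. It is not the planner's implementation: the 
class DynamicPartitionPruningSketch, its methods, and the sample rows are 
assumptions for illustration. The filtered dimension side (date_dim with 
d_year = 2000) yields the join keys, and only the partitions of the fact table 
(store_returns, assumed here to be partitioned by sr_returned_date_sk) whose 
key is in that set are read.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** In-memory illustration of dynamic partition pruning; not Flink code. */
final class DynamicPartitionPruningSketch {

    /** Rows of date_dim as {d_date_sk, d_year}; made-up sample data. */
    static int[][] sampleDateDim() {
        return new int[][] {{1, 1999}, {2, 2000}, {3, 2000}, {4, 2001}};
    }

    /** Partitions of store_returns keyed by sr_returned_date_sk; made-up data. */
    static Map<Integer, List<String>> samplePartitions() {
        Map<Integer, List<String>> partitions = new TreeMap<>();
        partitions.put(1, Arrays.asList("return-a"));
        partitions.put(2, Arrays.asList("return-b", "return-c"));
        partitions.put(3, Arrays.asList("return-d"));
        partitions.put(4, Arrays.asList("return-e"));
        return partitions;
    }

    /** Collects the d_date_sk values of the rows that pass the d_year filter. */
    static Set<Integer> collectFilteringKeys(int[][] dateDim, int year) {
        Set<Integer> keys = new TreeSet<>();
        for (int[] row : dateDim) {
            if (row[1] == year) {
                keys.add(row[0]);
            }
        }
        return keys;
    }

    /** Keeps only the partitions whose key appears in the collected keys. */
    static Map<Integer, List<String>> prunePartitions(
            Map<Integer, List<String>> partitions, Set<Integer> keys) {
        Map<Integer, List<String>> kept = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> p : partitions.entrySet()) {
            if (keys.contains(p.getKey())) {
                kept.put(p.getKey(), p.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<Integer> keys = collectFilteringKeys(sampleDateDim(), 2000);
        Map<Integer, List<String>> kept = prunePartitions(samplePartitions(), keys);
        System.out.println("filtering keys: " + keys);            // [2, 3]
        System.out.println("partitions read: " + kept.keySet());  // [2, 3]
    }
}
```

Joining the pruned partitions with the filtered dimension rows gives the same 
result as joining all partitions, which is the property step 3 checks by 
comparing against a run with dynamic filtering disabled.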





[jira] [Commented] (FLINK-28940) Release Testing: Verify FLIP-248 Dynamic Partition Prunning

2022-08-29 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597069#comment-17597069
 ] 

Zhu Zhu commented on FLINK-28940:
-

I have tested it and it looks good to me.
I used the SQL client for the test, connecting it to Hive. By running test 
jobs, I can see that DPP is taking effect: the topology is modified to have a 
{{DynamicFilteringDataCollector}} and an {{Order-Enforcer}}. By comparing the 
number of input records of the join operator, I can see that the expected 
number of records is truly filtered out in advance. The job result is also as 
expected.


> Release Testing: Verify FLIP-248 Dynamic Partition Prunning
> ---
>
> Key: FLINK-28940
> URL: https://issues.apache.org/jira/browse/FLINK-28940
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner, Table SQL / Runtime
>Affects Versions: 1.16.0
>Reporter: godfrey he
>Assignee: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> This issue aims to verify FLIP-248: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning
> We can verify it in SQL client after we build the flink-dist package.
> 1. Create a partitioned table and a non-partitioned table (only the Hive 
> connector is supported now, or we need to write a new collector), and then 
> insert some data.
> 2. Show the explain result for a join query in which one side is a 
> partitioned table and the other side is a non-partitioned table with a 
> filter, such as the example in the FLIP doc: select * from store_returns, 
> date_dim where sr_returned_date_sk = d_date_sk and d_year = 2000. The explain 
> result should contain a `DynamicFilteringDataCollector` node. We can also 
> verify the plan for variants of the above query.
> 3. Execute the above plan and verify the execution result (it should be the 
> same as the result of the plan with dynamic filtering disabled via 
> table.optimizer.dynamic-filtering.enabled).





[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-28 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28981:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
[document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].

To do the verification, the process can be:
 - Write Flink jobs in which some {{source}} subtasks run much slower than 
others. Three kinds of sources should be verified:
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and 
tune the configuration as you like
 - Submit the job and check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
[document|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].

To do the verification, the process can be:
 - Write Flink jobs in which some {{source}} subtasks run much slower than 
others. Three kinds of sources should be verified:
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and 
tune the configuration as you like
 - Submit the job and check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Yunhong Zheng
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write Flink jobs in which some {{source}} subtasks run much slower than 
> others. Three kinds of sources should be verified:
>  -- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  -- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  -- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify Flink configuration file to 

[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-28 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
[documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].

To do the verification, the process can be:
 - Write a Flink job which has a subtask running much slower than others (e.g. 
let it sleep indefinitely if it runs on a certain host, where the hostname can 
be retrieved via InetAddress.getLocalHost().getHostName(), or if its 
(subtaskIndex + attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and 
tune the configuration as you like
 - Submit the job and check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
[documentation|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].

To do the verification, the process can be:
 - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Biao Liu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write a Flink job which has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics and produced result.
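The two "slow subtask" conditions in the first step can be sketched in plain Java as below. This is a minimal sketch, not code from the ticket: the class `SlowSubtaskRule` and its method names are hypothetical. In a real job the two integers would presumably come from the function's runtime context (for example `getIndexOfThisSubtask()` and `getAttemptNumber()` on Flink's `RuntimeContext`), and the sketch assumes a speculative attempt is given a different attempt number than the original one, so that the parity check flips for the new attempt.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper, not part of Flink: decides whether a subtask attempt should stall. */
public final class SlowSubtaskRule {

    private SlowSubtaskRule() {}

    /** True if the attempt runs on the host that was chosen to be "slow". */
    public static boolean isOnSlowHost(String currentHost, String slowHost) {
        return currentHost != null && currentHost.equals(slowHost);
    }

    /** True if (subtaskIndex + attemptNumber) is even; the next attempt number flips the result. */
    public static boolean isSlowAttempt(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** Resolves the local hostname the way the ticket suggests. */
    public static String localHostName() throws UnknownHostException {
        return InetAddress.getLocalHost().getHostName();
    }

    /** Blocks the calling thread until it is interrupted, e.g. when the attempt is canceled. */
    public static void sleepUntilInterrupted() throws InterruptedException {
        while (true) {
            Thread.sleep(1_000L);
        }
    }

    public static void main(String[] args) {
        // Print the parity rule for two subtasks and two attempts each.
        for (int subtask = 0; subtask < 2; subtask++) {
            for (int attempt = 0; attempt < 2; attempt++) {
                System.out.println(
                        "subtask=" + subtask
                                + " attempt=" + attempt
                                + " slow=" + isSlowAttempt(subtask, attempt));
            }
        }
    }
}
```

A user function could call `sleepUntilInterrupted()` when either predicate holds. With the parity rule, the first attempt of every even-indexed subtask stalls while its next attempt runs normally, which gives the scheduler something to speculate on without depending on a particular host.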



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-28 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
[documentation|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].

To do the verification, the process can be:
 - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Biao Liu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write a Flink job which has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics and produced result.





[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-28 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28981:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
[document|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/.

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Yunhong Zheng
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> [document|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
>  - Write Flink jobs which have some {{source}} subtasks running much slower 
> than others. 3 kinds of sources should be verified, including
>  -- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  -- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  -- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify Flink configuration file to enable 

[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-28 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28981:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/.

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Yunhong Zheng
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/.
> To do the verification, the process can be:
>  - Write Flink jobs which have some {{source}} subtasks running much slower 
> than others. 3 kinds of sources should be verified, including
>  -- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  -- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  -- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify Flink configuration file to enable speculative execution and tune 
> the configuration as you 

[jira] [Commented] (FLINK-28940) Release Testing: Verify FLIP-248 Dynamic Partition Prunning

2022-08-24 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17584162#comment-17584162
 ] 

Zhu Zhu commented on FLINK-28940:
-

I'm working on it and it may need a bit more time.

> Release Testing: Verify FLIP-248 Dynamic Partition Prunning
> ---
>
> Key: FLINK-28940
> URL: https://issues.apache.org/jira/browse/FLINK-28940
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner, Table SQL / Runtime
>Affects Versions: 1.16.0
>Reporter: godfrey he
>Assignee: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> This issue aims to verify FLIP-248: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning
> We can verify it in the SQL client after we build the flink-dist package.
> 1. Create a partitioned table and a non-partitioned table (only the Hive 
> connector is supported now; otherwise we need to write a new collector), and 
> then insert some data.
> 2. Show the explain result for a join query in which one side is a 
> partitioned table and the other side is a non-partitioned table with a 
> filter, such as the example in the FLIP doc: select * from store_returns, 
> date_dim where sr_returned_date_sk = d_date_sk and d_year = 2000. The explain 
> result should contain a `DynamicFilteringDataCollector` node. We can also 
> verify the plan for variants of the above query.
> 3. Execute the above query and verify the execution result. (The result 
> should be the same as the result obtained with dynamic filtering disabled 
> via table.optimizer.dynamic-filtering.enabled.)
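Steps 2 and 3 above reduce to two checks that are easy to automate once the explain text and the result rows have been captured. The following is a minimal sketch in plain Java. The class and method names are hypothetical, and it assumes the explain output is available as a string and each result row has been rendered to a string; it does not call any Flink API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical verification helper, not part of Flink. */
public final class DynamicFilteringCheckSketch {

    private DynamicFilteringCheckSketch() {}

    /** Step 2: the plan text should mention the DynamicFilteringDataCollector node. */
    public static boolean planUsesDynamicFiltering(String explainOutput) {
        return explainOutput != null
                && explainOutput.contains("DynamicFilteringDataCollector");
    }

    /** Step 3: both runs should return the same rows, duplicates included, in any order. */
    public static boolean sameRowsIgnoringOrder(List<String> left, List<String> right) {
        List<String> a = new ArrayList<>(left);
        List<String> b = new ArrayList<>(right);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Made-up rows standing in for the two query runs.
        List<String> withFiltering = Arrays.asList("row-2", "row-1");
        List<String> withoutFiltering = Arrays.asList("row-1", "row-2");
        System.out.println(sameRowsIgnoringOrder(withFiltering, withoutFiltering));
    }
}
```

Sorting copies of both lists keeps duplicate rows significant, which matters for a join result, while ignoring the row order that a batch job does not guarantee.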





[jira] [Closed] (FLINK-28139) Add documentation for speculative execution

2022-08-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28139.
---
Resolution: Done

Done via 70d9f6c31b289b6ea284a02fdb1d8cfc1a1a5414

> Add documentation for speculative execution
> ---
>
> Key: FLINK-28139
> URL: https://issues.apache.org/jira/browse/FLINK-28139
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>






[jira] [Commented] (FLINK-28213) StreamExecutionEnvironment configure method support override pipeline.jars option

2022-08-17 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580685#comment-17580685
 ] 

Zhu Zhu commented on FLINK-28213:
-

I agree that we should not change the semantics of the {{PublicEvolving}} 
method {{StreamExecutionEnvironment#configure()}}, from setting to merging.

Maybe we can add the user-set {{PipelineOptions.JARS}} to 
{{TableConfig.configuration}} on table environment initialization. Later the 
table module can add new jars into {{PipelineOptions.JARS}} in 
{{TableConfig.configuration}}. Finally, the {{PipelineOptions.JARS}} in 
{{TableConfig.configuration}} can just override that in the {{configuration}} 
in {{StreamExecutionEnvironment}} via 
{{StreamExecutionEnvironment#configure()}}.

This behavior can be explained as the table module enriching/modifying the 
user-set {{PipelineOptions.JARS}} with program options or job configs, which I 
think is acceptable, because it's already happening (e.g. 
ExecutionConfigAccessor#fromProgramOptions(...)).
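To make the proposed flow concrete, here is a minimal sketch in plain Java. It is an illustration only: ordinary maps stand in for the {{StreamExecutionEnvironment}} configuration and {{TableConfig.configuration}}, the class and method names are hypothetical, and the local `configure` method only mimics "set" semantics rather than calling the real {{StreamExecutionEnvironment#configure()}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: plain maps stand in for Flink's configuration objects. */
public final class PipelineJarsFlowSketch {

    /** Key of PipelineOptions.JARS, as named in the ticket title. */
    public static final String JARS_KEY = "pipeline.jars";

    private PipelineJarsFlowSketch() {}

    /** "Set" semantics: every key present in source replaces the value in target. */
    public static void configure(
            Map<String, List<String>> target, Map<String, List<String>> source) {
        for (Map.Entry<String, List<String>> entry : source.entrySet()) {
            target.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
    }

    /** Runs the three proposed steps and returns the jars finally seen by the environment. */
    public static List<String> runFlow(List<String> userJars, List<String> tableJars) {
        Map<String, List<String>> envConf = new HashMap<>();
        envConf.put(JARS_KEY, new ArrayList<>(userJars));

        // Step 1: table environment initialization copies the user-set jars.
        Map<String, List<String>> tableConf = new HashMap<>();
        configure(tableConf, envConf);

        // Step 2: the table module appends new jars to its own copy.
        tableConf.get(JARS_KEY).addAll(tableJars);

        // Step 3: the table config overrides the environment config (set, not merge).
        configure(envConf, tableConf);
        return envConf.get(JARS_KEY);
    }

    public static void main(String[] args) {
        // Jar names are made up for the example.
        System.out.println(runFlow(Arrays.asList("user.jar"), Arrays.asList("udf.jar")));
    }
}
```

Because the table-side copy already contains the user-set jars before anything is appended, the final override loses nothing even though `configure` never merges.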

> StreamExecutionEnvironment configure method support override pipeline.jars 
> option
> -
>
> Key: FLINK-28213
> URL: https://issues.apache.org/jira/browse/FLINK-28213
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Configuration
>Affects Versions: 1.16.0
>Reporter: dalongliu
>Assignee: dalongliu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>






[jira] [Assigned] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-16 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28980:
---

Assignee: Biao Liu

> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Biao Liu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write a Flink job which has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics and produced result.





[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-16 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28981:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 -- [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 -- [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 -- [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 - [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 - [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 - [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write Flink jobs which have some {{source}} subtasks running much slower 
> than others. 3 kinds of sources should be verified, including
>  -- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  -- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  -- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics and produced result.
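For the configuration step, the sketch below renders a few `key: value` lines that could be pasted into the Flink configuration file. Treat all of it as an assumption to be checked against the speculative execution documentation: the option names are the ones I recall from the Flink 1.16 docs (they may differ in other versions), the values are illustrative test settings rather than recommendations, and the class itself is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that renders configuration lines; option names are assumptions. */
public final class SpeculativeConfSketch {

    private SpeculativeConfSketch() {}

    /** Options believed to enable and tune speculative execution in Flink 1.16. */
    public static Map<String, String> speculativeOptions() {
        Map<String, String> conf = new LinkedHashMap<>();
        // Assumed: speculative execution requires the adaptive batch scheduler.
        conf.put("jobmanager.scheduler", "AdaptiveBatch");
        conf.put("jobmanager.adaptive-batch-scheduler.speculative.enabled", "true");
        // Assumed slow task detector options; low values make a test react quickly.
        conf.put("slow-task-detector.check-interval", "1 s");
        conf.put("slow-task-detector.execution-time.baseline-lower-bound", "10 s");
        conf.put("slow-task-detector.execution-time.baseline-multiplier", "1.5");
        conf.put("slow-task-detector.execution-time.baseline-ratio", "0.75");
        return conf;
    }

    /** Renders the options as "key: value" lines in insertion order. */
    public static String render(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            sb.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(speculativeOptions()));
    }
}
```

Lowering the baseline lower bound is only a convenience for release testing, so that a deliberately slow source subtask is detected within seconds rather than minutes.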




[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-16 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Labels: release-testing  (was: test-stability)

> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write a Flink job which has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and 
> tune the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics and produced result.





[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-16 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28981:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
 - [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
 - [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
 - [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write Flink jobs which have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
   - [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
   - [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
   - [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job, then check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write Flink jobs that have some {{source}} subtasks running much slower 
> than others. 3 kinds of sources should be verified, including
>  - [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>  - [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>  - [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.




[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-16 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28981:

Labels: release-testing  (was: test-stability)

> Release Testing: Verify FLIP-245 sources speculative execution
> --
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
> FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write Flink jobs that have some {{source}} subtasks running much slower 
> than others. 3 kinds of sources should be verified, including
>- [Source 
> functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
>- [InputFormat 
> sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
>- [FLIP-27 new 
> sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28130) FLIP-224: Blocklist Mechanism

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28130.
---
Resolution: Done

> FLIP-224: Blocklist Mechanism
> -
>
> Key: FLINK-28130
> URL: https://issues.apache.org/jira/browse/FLINK-28130
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> In order to support speculative execution for batch 
> jobs([FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job]),
>  we need a mechanism to block resources on nodes where the slow tasks are 
> located. We propose to introduce a blocklist mechanism as follows:  Once a 
> node is marked as blocked, future slots should not be allocated from the 
> blocked node, but the slots that are already allocated will not be affected.
> More details see 
> [FLIP-224|https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism]
> This is the umbrella ticket to track all the changes of this feature.
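
The blocking rule described above can be illustrated with a small toy model. This is only a sketch using the JDK, not Flink's actual implementation: the class BlocklistSketch, its methods, and the node IDs are hypothetical names chosen for illustration. It shows the two halves of the rule: a blocked node yields no new slots, while slots allocated before the block stay in place.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the blocklist rule; hypothetical, not Flink's real classes. */
final class BlocklistSketch {
    private final Set<String> blockedNodes = new HashSet<>();
    // One entry per allocated slot, holding the ID of the node that hosts it.
    private final List<String> allocatedSlots = new ArrayList<>();

    /** Marks a node as blocked; slots already allocated on it are left untouched. */
    void blockNode(String nodeId) {
        blockedNodes.add(nodeId);
    }

    /** Allocates a slot on the node unless the node is blocked. */
    boolean tryAllocateSlot(String nodeId) {
        if (blockedNodes.contains(nodeId)) {
            return false; // no future slots are taken from a blocked node
        }
        allocatedSlots.add(nodeId);
        return true;
    }

    /** Counts the slots currently allocated on the given node. */
    long slotsOn(String nodeId) {
        return allocatedSlots.stream().filter(nodeId::equals).count();
    }

    public static void main(String[] args) {
        BlocklistSketch cluster = new BlocklistSketch();
        cluster.tryAllocateSlot("node-1"); // allocated before the node is blocked
        cluster.blockNode("node-1");
        System.out.println("new slot on node-1: " + cluster.tryAllocateSlot("node-1"));
        System.out.println("existing slots on node-1: " + cluster.slotsOn("node-1"));
        System.out.println("new slot on node-2: " + cluster.tryAllocateSlot("node-2"));
    }
}
```

In this model the request on node-1 after blocking is rejected, the earlier slot on node-1 is still counted, and node-2 keeps accepting requests.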





[jira] [Closed] (FLINK-28587) FLIP-249: Flink Web UI Enhancement for Speculative Execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28587.
---
Fix Version/s: 1.16.0
   Resolution: Done

> FLIP-249: Flink Web UI Enhancement for Speculative Execution
> 
>
> Key: FLINK-28587
> URL: https://issues.apache.org/jira/browse/FLINK-28587
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / REST, Runtime / Web Frontend
>Affects Versions: 1.16.0
>Reporter: Gen Luo
>Assignee: Gen Luo
>Priority: Major
> Fix For: 1.16.0
>
>
> As a follow-up step of FLIP-168 and FLIP-224, the Flink Web UI needs to be 
> enhanced to display the related information if the speculative execution 
> mechanism is enabled.





[jira] [Closed] (FLINK-28397) [FLIP-245] Source Supports Speculative Execution For Batch Job

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28397.
---
Resolution: Done

> [FLIP-245] Source Supports Speculative Execution For Batch Job
> --
>
> Key: FLINK-28397
> URL: https://issues.apache.org/jira/browse/FLINK-28397
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Jing Zhang
>Assignee: Jing Zhang
>Priority: Major
> Fix For: 1.16.0
>
>
> This is the umbrella ticket of 
> [FLIP-245|https://cwiki.apache.org/confluence/display/FLINK/FLIP-245%3A+Source+Supports+Speculative+Execution+For+Batch+Job].
>  
>  





[jira] [Created] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution

2022-08-15 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28981:
---

 Summary: Release Testing: Verify FLIP-245 sources speculative 
execution
 Key: FLINK-28981
 URL: https://issues.apache.org/jira/browse/FLINK-28981
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Common, Runtime / Coordination
Reporter: Zhu Zhu
 Fix For: 1.16.0


Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims for verifying FLIP-245, along with FLIP-168, FLIP-224 and 
FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write Flink jobs that have some {{source}} subtasks running much slower than 
others. 3 kinds of sources should be verified, including
   - [Source 
functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
   - [InputFormat 
sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
   - [FLIP-27 new 
sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.
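
For the final step, one possible way to check the produced result is to compare it, ignoring order, with the output of a run that had speculative execution disabled. Extra or missing records would suggest that the output of more than one attempt was kept, or that a finished attempt's output was lost. The sketch below uses only the JDK. The class ResultCheck and its methods are hypothetical names, and how the records are collected from the job is deliberately left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Order-insensitive comparison of two job outputs; hypothetical helper. */
final class ResultCheck {

    /** Counts how many times each record occurs. */
    static Map<String, Integer> countRecords(List<String> records) {
        Map<String, Integer> counts = new HashMap<>();
        for (String record : records) {
            counts.merge(record, 1, Integer::sum);
        }
        return counts;
    }

    /** True if both outputs hold the same records with the same multiplicities. */
    static boolean sameResult(List<String> expected, List<String> actual) {
        return countRecords(expected).equals(countRecords(actual));
    }

    public static void main(String[] args) {
        List<String> baseline = Arrays.asList("a", "b", "c");
        // Same records in a different order: treated as an identical result.
        System.out.println(sameResult(baseline, Arrays.asList("c", "a", "b")));
        // A duplicated record, as two admitted attempts might produce: a mismatch.
        System.out.println(sameResult(baseline, Arrays.asList("a", "b", "c", "c")));
    }
}
```

Comparing multiplicities rather than sets matters here, because a set comparison would not notice a record that was emitted twice.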





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write a Flink job that has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.
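
The two slowness conditions suggested in the first step can be sketched as plain Java predicates. This is a minimal illustration that uses only the JDK: the class SlowSubtaskDecider and its method names are hypothetical. In a real job the subtask index and attempt number would presumably come from the function's RuntimeContext (getIndexOfThisSubtask() and getAttemptNumber()), and the stall itself could be a long Thread.sleep in the user function.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Decides whether a subtask attempt should stall to simulate a slow node. */
final class SlowSubtaskDecider {

    /** Parity rule from the description: stall when the sum is even. */
    static boolean stallByParity(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** Host rule: stall only when running on the designated slow host. */
    static boolean stallByHost(String localHostName, String slowHostName) {
        return localHostName.equals(slowHostName);
    }

    /** Resolves the local hostname as suggested in the description. */
    static String localHostName() throws UnknownHostException {
        return InetAddress.getLocalHost().getHostName();
    }

    public static void main(String[] args) {
        // Print the parity decision for two subtasks and two attempts each.
        for (int subtask = 0; subtask < 2; subtask++) {
            for (int attempt = 0; attempt < 2; attempt++) {
                System.out.println("subtask " + subtask + ", attempt " + attempt
                        + " stalls: " + stallByParity(subtask, attempt));
            }
        }
        // "worker-3" is a made-up hostname used only for this demonstration.
        System.out.println("on slow host: " + stallByHost("worker-3", "worker-3"));
    }
}
```

Under the parity rule, the first attempt of subtask 0 (attempt number 0) stalls while an attempt with number 1 does not, so a speculative attempt can finish first, assuming each new attempt receives the next attempt number.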





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution
This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write a Flink job that has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution
This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes.
This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write a Flink job that has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes.
This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this 
documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes.
> This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this 
> documentation [PR|https://github.com/apache/flink/pull/20507].
> To do the verification, the process can be:
>  - Write a Flink job that has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.


To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. More details about this feature can be found in 
> this documentation [PR|https://github.com/apache/flink/pull/20507].
> This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> To do the verification, the process can be:
>  - Write a Flink job that has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.


To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. More details about this feature can be found in 
> this documentation [PR|https://github.com/apache/flink/pull/20507].
> This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> To do the verification, the process can be:
>  - Write a Flink job that has a subtask running much slower than others 
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be 
> retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job, then check the web UI, logs, metrics, and produced results.





[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28980:

Description: 
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

To do the verification, the process can be:
 - Write a Flink job that has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, whose hostname can be retrieved 
via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune the 
configuration as you like
 - Submit the job, then check the web UI, logs, metrics, and produced results.

  was:
Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
To do the verification, the process can be:
 - Write a Flink job in which one subtask runs much slower than the others (e.g. 
it sleeps indefinitely if it runs on a certain host, whose hostname can be 
retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job and check the web UI, logs, metrics and produced result.


> Release Testing: Verify FLIP-168 speculative execution
> --
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
> tasks caused by slow nodes. More details about this feature can be found in 
> this documentation [PR|https://github.com/apache/flink/pull/20507].
> This feature currently consists of 4 FLIPs:
>  - FLIP-168: Speculative Execution core part
>  - FLIP-224: Blocklist Mechanism
>  - FLIP-245: Source Supports Speculative Execution
>  - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> To do the verification, the process can be:
>  - Write a Flink job in which one subtask runs much slower than the others 
> (e.g. it sleeps indefinitely if it runs on a certain host, whose hostname can 
> be retrieved via InetAddress.getLocalHost().getHostName(), or if its 
> (subtaskIndex + attemptNumber) % 2 == 0)
>  - Modify the Flink configuration file to enable speculative execution and tune 
> the configuration as you like
>  - Submit the job and check the web UI, logs, metrics and produced result.





[jira] [Created] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution

2022-08-15 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28980:
---

 Summary: Release Testing: Verify FLIP-168 speculative execution
 Key: FLINK-28980
 URL: https://issues.apache.org/jira/browse/FLINK-28980
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhu Zhu
 Fix For: 1.16.0


Speculative execution is introduced in Flink 1.16 to deal with temporary slow 
tasks caused by slow nodes. More details about this feature can be found in 
this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
 - FLIP-168: Speculative Execution core part
 - FLIP-224: Blocklist Mechanism
 - FLIP-245: Source Supports Speculative Execution
 - FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
To do the verification, the process can be:
 - Write a Flink job in which one subtask runs much slower than the others (e.g. 
it sleeps indefinitely if it runs on a certain host, whose hostname can be 
retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + 
attemptNumber) % 2 == 0)
 - Modify the Flink configuration file to enable speculative execution and tune 
the configuration as you like
 - Submit the job and check the web UI, logs, metrics and produced result.
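
The slow-subtask condition from the first step could be sketched as below. This is only an illustration, not code from the Flink repository: the class SlowSubtaskCondition and its method names are hypothetical, and only InetAddress.getLocalHost().getHostName() and the parity check are taken from the description above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper (not part of Flink) that decides whether an attempt should stall. */
public final class SlowSubtaskCondition {

    private SlowSubtaskCondition() {}

    /** Returns true when this JVM runs on the host that was chosen to be "slow". */
    public static boolean runsOnHost(String slowHostName) throws UnknownHostException {
        return InetAddress.getLocalHost().getHostName().equals(slowHostName);
    }

    /**
     * Returns true when (subtaskIndex + attemptNumber) is even, so that the
     * attempt following a stalled one (attemptNumber + 1) runs at normal speed.
     */
    public static boolean stallsByParity(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** Blocks the calling thread until it is interrupted, e.g. when the attempt is canceled. */
    public static void sleepUntilInterrupted() throws InterruptedException {
        Thread.sleep(Long.MAX_VALUE);
    }
}
```

Inside a rich function of the test job, the two integers would presumably come from getRuntimeContext().getIndexOfThisSubtask() and getRuntimeContext().getAttemptNumber(), and sleepUntilInterrupted() would be called when either condition holds. With the parity variant every subtask with an even index stalls on its first attempt, so it produces several slow tasks rather than one.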





[jira] [Comment Edited] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-14 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579512#comment-17579512
 ] 

Zhu Zhu edited comment on FLINK-28878 at 8/15/22 3:44 AM:
--

Fixed via 
master: 5f8f387cba774a2c3900ea38e8a3dad017cf1790
release-1.15: ded03b750f46d8636d6744d4e094943d04f787dd


was (Author: zhuzh):
Fixed via 5f8f387cba774a2c3900ea38e8a3dad017cf1790

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.14.5, 1.15.1, 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-14 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28878:

Fix Version/s: 1.15.2

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.14.5, 1.15.1, 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0, 1.15.2
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Closed] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-14 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28878.
---
Resolution: Fixed

Fixed via 5f8f387cba774a2c3900ea38e8a3dad017cf1790

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-14 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28878:

Component/s: Tests
 (was: Runtime / Coordination)

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.15.1, 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-14 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28878:

Affects Version/s: 1.14.5

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.14.5, 1.15.1, 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-14 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28878:

Affects Version/s: 1.15.1

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.15.1, 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Commented] (FLINK-28766) UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException

2022-08-12 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578887#comment-17578887
 ] 

Zhu Zhu commented on FLINK-28766:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39908=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7

> UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException
> -
>
> Key: FLINK-28766
> URL: https://issues.apache.org/jira/browse/FLINK-28766
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-01T01:36:16.0563880Z Aug 01 01:36:16 [ERROR] 
> org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest
>   Time elapsed: 12.579 s  <<< ERROR!
> 2022-08-01T01:36:16.0565407Z Aug 01 01:36:16 java.io.UncheckedIOException: 
> java.nio.file.NoSuchFileException: 
> /tmp/junit1058240190382532303/f0f99754a53d2c4633fed75011da58dd/chk-7/61092e4a-5b9a-4f56-83f7-d9960c53ed3e
> 2022-08-01T01:36:16.0566296Z Aug 01 01:36:16  at 
> java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88)
> 2022-08-01T01:36:16.0566972Z Aug 01 01:36:16  at 
> java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104)
> 2022-08-01T01:36:16.0567600Z Aug 01 01:36:16  at 
> java.util.Iterator.forEachRemaining(Iterator.java:115)
> 2022-08-01T01:36:16.0568290Z Aug 01 01:36:16  at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
> 2022-08-01T01:36:16.0569172Z Aug 01 01:36:16  at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> 2022-08-01T01:36:16.0569877Z Aug 01 01:36:16  at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> 2022-08-01T01:36:16.0570554Z Aug 01 01:36:16  at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> 2022-08-01T01:36:16.0571371Z Aug 01 01:36:16  at 
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> 2022-08-01T01:36:16.0572417Z Aug 01 01:36:16  at 
> java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546)
> 2022-08-01T01:36:16.0573618Z Aug 01 01:36:16  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:289)
> 2022-08-01T01:36:16.0575187Z Aug 01 01:36:16  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:262)
> 2022-08-01T01:36:16.0576540Z Aug 01 01:36:16  at 
> org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:158)
> 2022-08-01T01:36:16.0577684Z Aug 01 01:36:16  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2022-08-01T01:36:16.0578546Z Aug 01 01:36:16  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2022-08-01T01:36:16.0579374Z Aug 01 01:36:16  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2022-08-01T01:36:16.0580298Z Aug 01 01:36:16  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2022-08-01T01:36:16.0581243Z Aug 01 01:36:16  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 2022-08-01T01:36:16.0582029Z Aug 01 01:36:16  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2022-08-01T01:36:16.0582766Z Aug 01 01:36:16  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 2022-08-01T01:36:16.0583488Z Aug 01 01:36:16  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2022-08-01T01:36:16.0584203Z Aug 01 01:36:16  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2022-08-01T01:36:16.0585087Z Aug 01 01:36:16  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 2022-08-01T01:36:16.0585778Z Aug 01 01:36:16  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 2022-08-01T01:36:16.0586482Z Aug 01 01:36:16  at 
> org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 2022-08-01T01:36:16.0587155Z Aug 01 01:36:16  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2022-08-01T01:36:16.0587809Z Aug 01 01:36:16  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 2022-08-01T01:36:16.0588434Z Aug 01 01:36:16  at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 2022-08-01T01:36:16.0589203Z Aug 01 01:36:16  at 

[jira] [Assigned] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-11 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28878:
---

Assignee: Zhu Zhu

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-11 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28878:

Fix Version/s: 1.16.0

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: test-stability
> Fix For: 1.16.0
>
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Commented] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError

2022-08-11 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578780#comment-17578780
 ] 

Zhu Zhu commented on FLINK-28878:
-

Thanks for reporting it! [~hxbks2ks]
The test fails due to unexpected slowness of the test run (possibly caused by a 
slow environment). The slowness resulted in a slot request timeout, which 
triggered a failover. As a result, the job failover count was 2 instead of the 
expected 1.
I will increase the slot request timeout to make the tests more stable.

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with 
> AssertionError
> 
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException
>   Time elapsed: 20.288 s  <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: 
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is 
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43  but: was 
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43  at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43  at 
> org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43  at 
> org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994





[jira] [Updated] (FLINK-28907) Flink docs do not compile locally

2022-08-10 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28907:

Description: 
Flink docs fail to compile locally. The error is as below:

go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
v0.0.0-20220715033920-cbeb08187b3a
hugo: collected modules in 1832 ms
Start building sites …
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
not found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
Built in 6415 ms
Error: Error building site: logged 6 error(s)

  was:
Flink docs suddenly fail to compile in my local environment, without any new 
change or rebase. The error is as below:

go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
v0.0.0-20220715033920-cbeb08187b3a
hugo: collected modules in 1832 ms
Start building sites …
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
not found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
Built in 6415 ms
Error: Error building site: logged 6 error(s)


> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> 

[jira] [Commented] (FLINK-28907) Flink docs do not compile locally

2022-08-10 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577905#comment-17577905
 ] 

Zhu Zhu commented on FLINK-28907:
-

I tried building the docs of the latest Flink master on other dev machines, and 
this problem does not happen there.
So this is possibly not a common problem.

> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs suddenly fail to compile in my local environment, without any new 
> change or rebase. The error is as follows:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28907) Flink docs do not compile locally

2022-08-10 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28907:

Summary: Flink docs do not compile locally  (was: Flink docs do not compile)

> Flink docs do not compile locally
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs suddenly fail to compile in my local environment, without any new 
> change or rebase. The error is as follows:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Updated] (FLINK-28907) Flink docs do not compile

2022-08-10 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28907:

Priority: Major  (was: Blocker)

> Flink docs do not compile
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs suddenly fail to compile in my local environment, without any new 
> change or rebase. The error is as follows:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Updated] (FLINK-28907) Flink docs do not compile

2022-08-10 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28907:

Summary: Flink docs do not compile  (was: Flink docs does not compile)

> Flink docs do not compile
> -
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Priority: Blocker
> Fix For: 1.16.0
>
>
> Flink docs suddenly fail to compile in my local environment, without any new 
> change or rebase. The error is as follows:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
> v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not 
> found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' 
> instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
> not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/datastream/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
> found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
> "docs/connectors/table/elasticsearch": 
> "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' 
> instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)





[jira] [Created] (FLINK-28907) Flink docs does not compile

2022-08-10 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28907:
---

 Summary: Flink docs does not compile
 Key: FLINK-28907
 URL: https://issues.apache.org/jira/browse/FLINK-28907
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.16.0
Reporter: Zhu Zhu
 Fix For: 1.16.0


Flink docs suddenly fail to compile in my local environment, without any new 
change or rebase. The error is as follows:

go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
v0.0.0-20220715033920-cbeb08187b3a
hugo: collected modules in 1832 ms
Start building sites …
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
not found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
Built in 6415 ms
Error: Error building site: logged 6 error(s)





[jira] [Closed] (FLINK-28873) Make "jobmanager.scheduler" visible in documentation

2022-08-09 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28873.
---
Resolution: Fixed

Fixed via 15f87d4a470e9bf29fd18874c26c4506ea57c09f

> Make "jobmanager.scheduler" visible in documentation
> 
>
> Key: FLINK-28873
> URL: https://issues.apache.org/jira/browse/FLINK-28873
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, the option {{jobmanager.scheduler}} is still excluded from 
> documentation. But in fact, this option is already used as a public interface 
> (this option needs to be configured by users when using AdaptiveScheduler and 
> AdaptiveBatchScheduler).
> We should remove the {{ExcludeFromDocumentation}} to make it visible in the 
> documentation.





[jira] [Assigned] (FLINK-28873) Make "jobmanager.scheduler" visible in documentation

2022-08-09 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28873:
---

Assignee: Lijie Wang

> Make "jobmanager.scheduler" visible in documentation
> 
>
> Key: FLINK-28873
> URL: https://issues.apache.org/jira/browse/FLINK-28873
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, the option {{jobmanager.scheduler}} is still excluded from 
> documentation. But in fact, this option is already used as a public interface 
> (this option needs to be configured by users when using AdaptiveScheduler and 
> AdaptiveBatchScheduler).
> We should remove the {{ExcludeFromDocumentation}} to make it visible in the 
> documentation.





[jira] [Updated] (FLINK-28324) JUnit5 Migration] Module: flink-sql-client

2022-08-04 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28324:

Priority: Major  (was: Minor)

> JUnit5 Migration] Module: flink-sql-client
> --
>
> Key: FLINK-28324
> URL: https://issues.apache.org/jira/browse/FLINK-28324
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: zhouli
>Assignee: zhouli
>Priority: Major
>  Labels: pull-request-available
>






[jira] (FLINK-28324) JUnit5 Migration] Module: flink-sql-client

2022-08-04 Thread Zhu Zhu (Jira)


[ https://issues.apache.org/jira/browse/FLINK-28324 ]


Zhu Zhu deleted comment on FLINK-28324:
-

was (Author: flink-jira-bot):
I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 30 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it.


> JUnit5 Migration] Module: flink-sql-client
> --
>
> Key: FLINK-28324
> URL: https://issues.apache.org/jira/browse/FLINK-28324
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: zhouli
>Assignee: zhouli
>Priority: Minor
>  Labels: pull-request-available, stale-assigned
>






[jira] [Updated] (FLINK-28324) JUnit5 Migration] Module: flink-sql-client

2022-08-04 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28324:

Labels: pull-request-available  (was: pull-request-available stale-assigned)

> JUnit5 Migration] Module: flink-sql-client
> --
>
> Key: FLINK-28324
> URL: https://issues.apache.org/jira/browse/FLINK-28324
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Client
>Reporter: zhouli
>Assignee: zhouli
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Assigned] (FLINK-28139) Add documentation for speculative execution

2022-08-04 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28139:
---

Assignee: Zhu Zhu

> Add documentation for speculative execution
> ---
>
> Key: FLINK-28139
> URL: https://issues.apache.org/jira/browse/FLINK-28139
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>






[jira] [Updated] (FLINK-27710) Improve logs to better display Execution

2022-08-04 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-27710:

Fix Version/s: (was: 1.16.0)

> Improve logs to better display Execution
> 
>
> Key: FLINK-27710
> URL: https://issues.apache.org/jira/browse/FLINK-27710
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>
> Currently, an execution is usually represented as "{{job vertex name}} 
> ({{subtaskIndex+1}}/{{vertex parallelism}}) ({{attemptId}})" in logs. With the 
> change of FLINK-17295, this representation becomes redundant, e.g. the subtask 
> index is displayed 2 times.
> Therefore, I'm proposing to change the format to "<job vertex name> 
> (<subtaskIndex>+1/<vertex parallelism>) # (graph: <short ExecutionGraphID>, 
> vertex: <JobVertexID>)" and to avoid directly displaying the 
> {{ExecutionAttemptID}}. This can increase log readability.
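
The proposed format can be illustrated with a minimal sketch. This is not Flink code: the class, method and parameter names are hypothetical, and the value printed after "#" is assumed to be an attempt number, since the ticket text shows only "#" at that position.

```java
/** Hypothetical helper (not part of Flink) that renders an execution for logs. */
final class ExecutionLogName {

    /**
     * Builds "<job vertex name> (<subtaskIndex>+1/<vertex parallelism>) #<n> (graph: ..., vertex: ...)".
     * attemptNumber is an assumption; the ticket leaves the value after "#" unspecified.
     */
    static String format(
            String jobVertexName,
            int subtaskIndex, // zero-based, displayed as subtaskIndex + 1
            int parallelism,
            int attemptNumber,
            String shortGraphId,
            String jobVertexId) {
        return String.format(
                "%s (%d/%d) #%d (graph: %s, vertex: %s)",
                jobVertexName,
                subtaskIndex + 1,
                parallelism,
                attemptNumber,
                shortGraphId,
                jobVertexId);
    }

    public static void main(String[] args) {
        // Prints: Map (1/4) #0 (graph: a1b2c3d4, vertex: bc764cd8)
        System.out.println(format("Map", 0, 4, 0, "a1b2c3d4", "bc764cd8"));
    }
}
```

In this sketch the subtask index appears only once, which is the redundancy the ticket wants to remove.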





[jira] [Commented] (FLINK-28769) Flink History Server show wrong name of batch jobs

2022-08-03 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575001#comment-17575001
 ] 

Zhu Zhu commented on FLINK-28769:
-

It looks to me that the specified name does work, after checking the code of 
{{ExecutionEnvironment#executeAsync(java.lang.String)}}.
I guess your batch job was not given an "output" param, so the job is 
created via {{counts.print()}} instead of {{env.execute(...)}}. This is a known 
limitation of DataSet, and it is unlikely that we will change the interface of 
{{counts.print()}} because {{DataSet}} will be deprecated soon.

> Flink History Server show wrong name of batch jobs
> --
>
> Key: FLINK-28769
> URL: https://issues.apache.org/jira/browse/FLINK-28769
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataSet
>Reporter: Biao Geng
>Priority: Minor
> Attachments: image-2022-08-02-00-41-51-815.png
>
>
> When running {{examples/batch/WordCount.jar}} using Flink 1.15 and 1.16 
> with the history server started, the history server shows the default 
> name (e.g. Flink Java Job at Tue Aug 02.. ) of the batch job instead of the 
> name ("WordCount Example") specified in the Java code.
> But for {{examples/streaming/WordCount.jar}}, the job name in the history server 
> is correct.
> It looks like 
> {{org.apache.flink.api.java.ExecutionEnvironment#executeAsync(java.lang.String)}}
>  does not set the job name the way 
> {{org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#execute(java.lang.String)}}
>  does (e.g. streamGraph.setJobName(jobName); ).
> !image-2022-08-02-00-41-51-815.png!





[jira] [Closed] (FLINK-23174) Log improvement in Task throws Error

2022-08-02 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-23174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-23174.
---
Resolution: Won't Fix

Closed due to inactivity and no response.

> Log improvement in Task throws Error
> 
>
> Key: FLINK-23174
> URL: https://issues.apache.org/jira/browse/FLINK-23174
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Network
>Affects Versions: 1.13.1
>Reporter: Bo Cui
>Assignee: Bo Cui
>Priority: Not a Priority
>  Labels: pull-request-available, stale-assigned
>
> We encountered channels closing due to network jitter, which caused tasks to fail.
> We can only see which remote channel causes the task/job failure, 
> but we cannot get more details, such as which channel closed, the task stack...





[jira] [Closed] (FLINK-28771) Assign speculative execution attempt with correct CREATED timestamp

2022-08-02 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28771.
---
Resolution: Fixed

Fixed via e1a74df4427e99f4b0f3aaa4e8f4f5ff7cbd044e

> Assign speculative execution attempt with correct CREATED timestamp
> ---
>
> Key: FLINK-28771
> URL: https://issues.apache.org/jira/browse/FLINK-28771
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, a newly created speculative execution attempt is assigned a 
> wrong CREATED timestamp in SpeculativeScheduler. We need to fix it.





[jira] [Closed] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests

2022-08-02 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28759.
---
Resolution: Done

Done via 16b0cc1117d4bda11b89440e962646024b2ff6c5

> Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
> ---
>
> Key: FLINK-28759
> URL: https://issues.apache.org/jira/browse/FLINK-28759
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Tests
>Reporter: Zhu Zhu
>Assignee: JUNRUILi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> To verify the correctness of speculative execution, we can enable it in 
> AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch 
> jobs and verify the results.
> Note that we need to disable the blocklist (by setting block duration to 0) 
> in such single machine e2e tests.





[jira] [Closed] (FLINK-28589) Enhance Web UI for Speculative Execution

2022-08-01 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28589.
---
Fix Version/s: 1.16.0
   Resolution: Done

Done via 
4af75011fe32506a70076a2fe9e847cef39587bb
57990c332f3d87e4bcc1824973ac5ed2bcafec85
f6c5dc1b32ad6e6b524e549eff2b7d9d2b7d9970

> Enhance Web UI for Speculative Execution
> 
>
> Key: FLINK-28589
> URL: https://issues.apache.org/jira/browse/FLINK-28589
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Affects Versions: 1.16.0
>Reporter: Gen Luo
>Assignee: Junhan Yang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>






[jira] [Created] (FLINK-28771) Assign speculative execution attempt with correct CREATED timestamp

2022-08-01 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28771:
---

 Summary: Assign speculative execution attempt with correct CREATED 
timestamp
 Key: FLINK-28771
 URL: https://issues.apache.org/jira/browse/FLINK-28771
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.16.0
Reporter: Zhu Zhu
Assignee: Zhu Zhu
 Fix For: 1.16.0


Currently, a newly created speculative execution attempt is assigned a wrong 
CREATED timestamp in SpeculativeScheduler. We need to fix it.





[jira] [Commented] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests

2022-08-01 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573568#comment-17573568
 ] 

Zhu Zhu commented on FLINK-28759:
-

[~JunRuiLi] I have assigned the ticket to you. Go ahead and open a PR for it.

> Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
> ---
>
> Key: FLINK-28759
> URL: https://issues.apache.org/jira/browse/FLINK-28759
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Tests
>Reporter: Zhu Zhu
>Assignee: JUNRUILi
>Priority: Major
> Fix For: 1.16.0
>
>
> To verify the correctness of speculative execution, we can enable it in 
> AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch 
> jobs and verify the results.
> Note that we need to disable the blocklist (by setting block duration to 0) 
> in such single machine e2e tests.





[jira] [Assigned] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests

2022-08-01 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28759:
---

Assignee: JUNRUILi

> Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
> ---
>
> Key: FLINK-28759
> URL: https://issues.apache.org/jira/browse/FLINK-28759
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Tests
>Reporter: Zhu Zhu
>Assignee: JUNRUILi
>Priority: Major
> Fix For: 1.16.0
>
>
> To verify the correctness of speculative execution, we can enable it in 
> AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch 
> jobs and verify the results.
> Note that we need to disable the blocklist (by setting block duration to 0) 
> in such single machine e2e tests.





[jira] [Created] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests

2022-08-01 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28759:
---

 Summary: Enable speculative execution for in 
AdaptiveBatchScheduler TPC-DS e2e tests
 Key: FLINK-28759
 URL: https://issues.apache.org/jira/browse/FLINK-28759
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination, Tests
Reporter: Zhu Zhu
 Fix For: 1.16.0


To verify the correctness of speculative execution, we can enable it in 
AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch 
jobs and verify the results.
Note that we need to disable the blocklist (by setting block duration to 0) in 
such single machine e2e tests.





[jira] [Closed] (FLINK-28696) Notify all newlyAdded/Merged blocked nodes to BlocklistListener

2022-07-26 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28696.
---
Resolution: Fixed

Fixed via bbaeb628f48a4bc4c324bfc4afd06ddf34f546f0

> Notify all newlyAdded/Merged blocked nodes to BlocklistListener
> ---
>
> Key: FLINK-28696
> URL: https://issues.apache.org/jira/browse/FLINK-28696
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> This bug was introduced by FLINK-28660. The newly added logic causes the 
> blocklist listener not to be notified when there are no newly added nodes 
> (only merged nodes).
>  
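
The intended behavior can be sketched as follows. This is a simplified model and not Flink's blocklist implementation: the class and method names are hypothetical, and "merging" is assumed here to mean extending the block end timestamp of a node that is already blocked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Simplified, hypothetical model of a blocklist that notifies listeners. */
final class BlocklistTrackerSketch {
    // Node id -> timestamp until which the node is blocked.
    private final Map<String, Long> blockedUntil = new HashMap<>();
    private final List<Consumer<List<String>>> listeners = new ArrayList<>();

    void registerListener(Consumer<List<String>> listener) {
        listeners.add(listener);
    }

    void addBlockedNodes(Map<String, Long> nodes) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> entry : nodes.entrySet()) {
            Long existing = blockedUntil.get(entry.getKey());
            if (existing == null) {
                // Newly added node.
                blockedUntil.put(entry.getKey(), entry.getValue());
                changed.add(entry.getKey());
            } else if (entry.getValue() > existing) {
                // Merged node (assumption: a later end timestamp wins).
                blockedUntil.put(entry.getKey(), entry.getValue());
                changed.add(entry.getKey());
            }
        }
        // Notify on newly added AND merged nodes, not only on newly added ones.
        if (!changed.isEmpty()) {
            for (Consumer<List<String>> listener : listeners) {
                listener.accept(new ArrayList<>(changed));
            }
        }
    }

    public static void main(String[] args) {
        BlocklistTrackerSketch tracker = new BlocklistTrackerSketch();
        tracker.registerListener(nodes -> System.out.println("notified: " + nodes));
        Map<String, Long> first = new HashMap<>();
        first.put("node-1", 100L);
        tracker.addBlockedNodes(first); // newly added node
        Map<String, Long> second = new HashMap<>();
        second.put("node-1", 200L);
        tracker.addBlockedNodes(second); // merge only, listener is still notified
    }
}
```

The point of the sketch is the last condition: the notification depends on whether anything changed, not on whether a node is new.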





[jira] [Assigned] (FLINK-28696) Notify all newlyAdded/Merged blocked nodes to BlocklistListener

2022-07-26 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28696:
---

Assignee: Lijie Wang

> Notify all newlyAdded/Merged blocked nodes to BlocklistListener
> ---
>
> Key: FLINK-28696
> URL: https://issues.apache.org/jira/browse/FLINK-28696
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> This bug was introduced by FLINK-28660. The newly added logic causes the 
> blocklist listener not to be notified when there are no newly added nodes 
> (only merged nodes).
>  





[jira] [Closed] (FLINK-28610) Enable speculative execution of sources

2022-07-26 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28610.
---
Resolution: Done

Done via 9192446847b6fb29beb8d36d49d6900de1e61685

> Enable speculative execution of sources
> ---
>
> Key: FLINK-28610
> URL: https://issues.apache.org/jira/browse/FLINK-28610
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Currently, speculative execution of sources is disabled. It can be enabled 
> now that the improvements that let InputFormat sources and new sources 
> work correctly with speculative execution are done.





[jira] [Closed] (FLINK-28660) Simplify logs of blocklist

2022-07-25 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28660.
---
Resolution: Fixed

Fixed via 64ad6709f412e80c5e48d24127e7a558bed99e8a

> Simplify logs of blocklist
> --
>
> Key: FLINK-28660
> URL: https://issues.apache.org/jira/browse/FLINK-28660
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28585) Speculative execution for InputFormat sources

2022-07-25 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28585.
---
Resolution: Done

Done via 7611928d0f1a7bb20ec5b0538e0fbe9102a07023

> Speculative execution for InputFormat sources
> -
>
> Key: FLINK-28585
> URL: https://issues.apache.org/jira/browse/FLINK-28585
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> This task enables InputFormat sources for speculative execution.





[jira] [Assigned] (FLINK-28660) Simplify logs of blocklist

2022-07-24 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28660:
---

Assignee: Lijie Wang

> Simplify logs of blocklist
> --
>
> Key: FLINK-28660
> URL: https://issues.apache.org/jira/browse/FLINK-28660
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28640) Let BlocklistDeclarativeSlotPool accept duplicate slot offers

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28640.
---
Resolution: Done

Done via f6a22eaf99d4ba2ef03445bae54a0da7c39c4d1a

> Let BlocklistDeclarativeSlotPool accept duplicate slot offers
> -
>
> Key: FLINK-28640
> URL: https://issues.apache.org/jira/browse/FLINK-28640
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> BlocklistDeclarativeSlotPool should accept a duplicate (already accepted) 
> slot offer, even if it comes from a currently blocked task manager.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28610) Enable speculative execution of sources

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28610:
---

Assignee: Zhu Zhu

> Enable speculative execution of sources
> ---
>
> Key: FLINK-28610
> URL: https://issues.apache.org/jira/browse/FLINK-28610
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution of sources is currently disabled. It can be enabled 
> with the improvements that let InputFormat sources and new sources work 
> correctly with speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28146) Sync blocklist information between JobMaster & ResourceManager

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28146.
---
Fix Version/s: 1.16.0
   Resolution: Done

Done via 7b05a1b4c9a4ae664fb6b7c4bb85fb3ea6281505

> Sync blocklist information between JobMaster & ResourceManager
> --
>
> Key: FLINK-28146
> URL: https://issues.apache.org/jira/browse/FLINK-28146
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The newly added/updated blocked nodes should be synchronized between JM and 
> RM.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28640) Let BlocklistDeclarativeSlotPool accept duplicate slot offers

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28640:
---

Assignee: Lijie Wang

> Let BlocklistDeclarativeSlotPool accept duplicate slot offers
> -
>
> Key: FLINK-28640
> URL: https://issues.apache.org/jira/browse/FLINK-28640
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> BlocklistDeclarativeSlotPool should accept a duplicate (already accepted) 
> slot offer, even if it comes from a currently blocked task manager.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28586) Speculative execution for new sources

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28586.
---
Resolution: Done

Done via
79d93f2512f6826baefb14c8dc9b59d419d7df0a
9af271f3108ce8af6b6972fabf5420b99e55fc71
bedcc3f7b5c0fc184953d3c1a969f03887db2cae
7129c2ee09ce7eb3959ce88383b5d8ea0987fcf5
863222e926df26fde4caa470c58b261174181719

> Speculative execution for new sources
> -
>
> Key: FLINK-28586
> URL: https://issues.apache.org/jira/browse/FLINK-28586
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> This task enables new sources(FLIP-27) for speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28138) Add metrics for speculative execution

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28138.
---
Resolution: Done

Done via
2173b45b570de8ff1507b8e29a884e2449ffea62
19b0a95c30afd9ed65252c49fb00cef882412553

> Add metrics for speculative execution
> -
>
> Key: FLINK-28138
> URL: https://issues.apache.org/jira/browse/FLINK-28138
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> The following two metrics will be added to expose job problems and show the 
> effectiveness of speculative execution:
> 1. numSlowExecutionVertices: Number of slow execution vertices at the 
> moment.
> 2. numEffectiveSpeculativeExecutions: Number of speculative executions 
> that finish before their corresponding original executions finish.
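
As a rough illustration of what the two metrics count, here is a small Python
model. Only the metric names come from the issue description; the Attempt and
Vertex classes, their fields, and the function names are hypothetical and are
not Flink classes. The sketch assumes each vertex has one original attempt.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attempt:
    # Hypothetical model of one execution attempt of a task.
    is_original: bool
    finish_time: Optional[float] = None  # None while the attempt has not finished


@dataclass
class Vertex:
    # Hypothetical model of an execution vertex and its attempts.
    attempts: List[Attempt] = field(default_factory=list)
    is_slow: bool = False


def num_slow_execution_vertices(vertices):
    # Counts the vertices that are currently flagged as slow.
    return sum(1 for v in vertices if v.is_slow)


def num_effective_speculative_executions(vertices):
    # Counts speculative attempts that finished before their original attempt did
    # (including the case where the original never finished).
    count = 0
    for v in vertices:
        original = next((a for a in v.attempts if a.is_original), None)
        for a in v.attempts:
            if a.is_original or a.finish_time is None:
                continue
            original_unfinished = original is None or original.finish_time is None
            if original_unfinished or a.finish_time < original.finish_time:
                count += 1
    return count
```

In this model a speculative attempt counts as effective only when it actually
finishes ahead of the original, so a speculative attempt that is still running
or was overtaken by the original contributes nothing.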



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28145) Let ResourceManager support blocklist mechanism

2022-07-22 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28145.
---
Fix Version/s: 1.16.0
   Resolution: Done

Done via
6f7455b078ba89302b8c05e9d39d3a1ca114700c
9815caad271a561640ffe0df7193c04270d53a25
2e5cac1f31aa571276df20e24889994672692a89

> Let ResourceManager support blocklist mechanism
> ---
>
> Key: FLINK-28145
> URL: https://issues.apache.org/jira/browse/FLINK-28145
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Let ResourceManager support blocklist mechanism:
> 1. SlotManager should filter out blocked resources when allocating registered 
> resources.
> 2. ResourceManagerDriver should avoid allocating task managers from blocked 
> nodes.
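
The two rules above can be sketched as plain filtering logic. This is a
simplified Python model under stated assumptions, not the SlotManager or
ResourceManagerDriver code: TaskManagerInfo, allocatable_task_managers and
pick_node_for_new_worker are hypothetical names introduced only for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskManagerInfo:
    # Hypothetical description of a registered task manager.
    tm_id: str
    node_id: str
    free_slots: int


def allocatable_task_managers(registered, blocked_nodes):
    # Rule 1: when allocating from registered resources, skip task managers
    # that are located on blocked nodes (and those without free slots).
    return [
        tm for tm in registered
        if tm.node_id not in blocked_nodes and tm.free_slots > 0
    ]


def pick_node_for_new_worker(candidate_nodes, blocked_nodes):
    # Rule 2: when requesting a new task manager, avoid blocked nodes.
    # Returns None if every candidate node is blocked.
    for node in candidate_nodes:
        if node not in blocked_nodes:
            return node
    return None
```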



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28585) Speculative execution for InputFormat sources

2022-07-21 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28585:
---

Assignee: Zhu Zhu

> Speculative execution for InputFormat sources
> -
>
> Key: FLINK-28585
> URL: https://issues.apache.org/jira/browse/FLINK-28585
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> This task enables InputFormat sources for speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28588) Enhance REST API for Speculative Execution

2022-07-21 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28588:
---

Assignee: Gen Luo

> Enhance REST API for Speculative Execution
> --
>
> Key: FLINK-28588
> URL: https://issues.apache.org/jira/browse/FLINK-28588
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST
>Affects Versions: 1.16.0
>Reporter: Gen Luo
>Assignee: Gen Luo
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-28547) Add IT cases for SpeculativeScheduler

2022-07-20 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu updated FLINK-28547:

Component/s: Tests

> Add IT cases for SpeculativeScheduler
> -
>
> Key: FLINK-28547
> URL: https://issues.apache.org/jira/browse/FLINK-28547
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Tests
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Add IT cases for SpeculativeScheduler.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28547) Add IT cases for SpeculativeScheduler

2022-07-20 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28547.
---
Resolution: Done

Done via fd763672b858e74b24760e5c98ff9af22caa8a14

> Add IT cases for SpeculativeScheduler
> -
>
> Key: FLINK-28547
> URL: https://issues.apache.org/jira/browse/FLINK-28547
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Add IT cases for SpeculativeScheduler.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28612) Cancel pending slot allocation after canceling executions

2022-07-20 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28612.
---
Resolution: Fixed

Fixed via 3278995372d1ea27b6fd86806e9a860a644694c7

> Cancel pending slot allocation after canceling executions
> -
>
> Key: FLINK-28612
> URL: https://issues.apache.org/jira/browse/FLINK-28612
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Canceling pending slot allocation before canceling executions will result in 
> execution failures and pollute the logs. It can also cause an execution to 
> become FAILED even if the execution vertex has FINISHED, which breaks the 
> assumption of SpeculativeScheduler#isExecutionVertexPossibleToFinish().



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28612) Cancel pending slot allocation after canceling executions

2022-07-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28612:
---

 Summary: Cancel pending slot allocation after canceling executions
 Key: FLINK-28612
 URL: https://issues.apache.org/jira/browse/FLINK-28612
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhu Zhu
 Fix For: 1.16.0


Canceling pending slot allocation before canceling executions will result in 
execution failures and pollute the logs. It can also cause an execution to 
become FAILED even if the execution vertex has FINISHED, which breaks the 
assumption of SpeculativeScheduler#isExecutionVertexPossibleToFinish().



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28612) Cancel pending slot allocation after canceling executions

2022-07-19 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28612:
---

Assignee: Zhu Zhu

> Cancel pending slot allocation after canceling executions
> -
>
> Key: FLINK-28612
> URL: https://issues.apache.org/jira/browse/FLINK-28612
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Canceling pending slot allocation before canceling executions will result in 
> execution failures and pollute the logs. It can also cause an execution to 
> become FAILED even if the execution vertex has FINISHED, which breaks the 
> assumption of SpeculativeScheduler#isExecutionVertexPossibleToFinish().



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28610) Enable speculative execution of sources

2022-07-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28610:
---

 Summary: Enable speculative execution of sources
 Key: FLINK-28610
 URL: https://issues.apache.org/jira/browse/FLINK-28610
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhu Zhu
 Fix For: 1.16.0


Speculative execution of sources is currently disabled. It can be enabled with 
the improvements that let InputFormat sources and new sources work correctly 
with speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28586) Speculative execution for new sources

2022-07-18 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28586:
---

Assignee: Zhu Zhu

> Speculative execution for new sources
> -
>
> Key: FLINK-28586
> URL: https://issues.apache.org/jira/browse/FLINK-28586
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> This task enables new sources(FLIP-27) for speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28586) Speculative execution for new sources

2022-07-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28586:
---

 Summary: Speculative execution for new sources
 Key: FLINK-28586
 URL: https://issues.apache.org/jira/browse/FLINK-28586
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Common, Runtime / Coordination
Reporter: Zhu Zhu
 Fix For: 1.16.0


This task enables new sources(FLIP-27) for speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28585) Speculative execution for InputFormat sources

2022-07-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28585:
---

 Summary: Speculative execution for InputFormat sources
 Key: FLINK-28585
 URL: https://issues.apache.org/jira/browse/FLINK-28585
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Zhu Zhu
 Fix For: 1.16.0


This task enables InputFormat sources for speculative execution.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28547) Add IT cases for SpeculativeScheduler

2022-07-14 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28547:
---

Assignee: Lijie Wang

> Add IT cases for SpeculativeScheduler
> -
>
> Key: FLINK-28547
> URL: https://issues.apache.org/jira/browse/FLINK-28547
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
> Fix For: 1.16.0
>
>
> Add IT cases for SpeculativeScheduler.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28137) Introduce SpeculativeScheduler

2022-07-13 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28137.
---
Resolution: Done

Done via
81c739ae462412e531216bb46bc567fce2355dd8
265612c2cf93a589d87d7fc8ca168bc19d838885

> Introduce SpeculativeScheduler
> --
>
> Key: FLINK-28137
> URL: https://issues.apache.org/jira/browse/FLINK-28137
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> A SpeculativeScheduler will be used if speculative execution is enabled. It 
> extends AdaptiveBatchScheduler so that speculative execution can work along 
> with the feature that adaptively tunes parallelism for batch jobs.
> The major differences of SpeculativeScheduler are:
>  * SpeculativeScheduler needs to be able to directly deploy an Execution, 
> while AdaptiveBatchScheduler can only perform ExecutionVertex level 
> deployment.
>  * SpeculativeScheduler does not restart the ExecutionVertex if an execution 
> fails while any other current execution is still making progress.
>  * SpeculativeScheduler listens for slow tasks. Once there are slow tasks, it 
> will block the slow nodes and deploy speculative executions of the slow tasks 
> on other nodes.
>  * Once any execution finishes, SpeculativeScheduler will cancel all the 
> remaining executions of the same execution vertex.
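
To make the failure and completion rules above concrete, here is a minimal
Python model of a single execution vertex with several concurrent attempts.
It is a sketch under stated assumptions, not the SpeculativeScheduler code:
VertexModel, State, and the restarts counter are hypothetical, and slow-task
detection and node blocking are left out.

```python
from enum import Enum


class State(Enum):
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


class VertexModel:
    """Hypothetical model of one execution vertex with concurrent attempts."""

    def __init__(self):
        self.attempts = {}  # attempt id -> State
        self.restarts = 0   # stand-in for "the vertex was restarted"

    def deploy(self, attempt_id):
        # Deploys a single attempt (execution-level deployment).
        self.attempts[attempt_id] = State.RUNNING

    def on_finished(self, attempt_id):
        # The first attempt to finish wins; all other running attempts
        # of the same vertex are canceled.
        self.attempts[attempt_id] = State.FINISHED
        for other, state in list(self.attempts.items()):
            if other != attempt_id and state is State.RUNNING:
                self.attempts[other] = State.CANCELED

    def on_failed(self, attempt_id):
        # A failed attempt triggers a restart only if no other attempt
        # of the vertex is still making progress.
        self.attempts[attempt_id] = State.FAILED
        if not any(s is State.RUNNING for s in self.attempts.values()):
            self.restarts += 1
```

With two attempts deployed, a failure of one leaves `restarts` at 0 while the
other keeps running; with a single attempt, the same failure increments it.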



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28138) Add metrics for speculative execution

2022-07-13 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28138:
---

Assignee: Zhu Zhu

> Add metrics for speculative execution
> -
>
> Key: FLINK-28138
> URL: https://issues.apache.org/jira/browse/FLINK-28138
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> The following two metrics will be added to expose job problems and show the 
> effectiveness of speculative execution:
> 1. numSlowExecutionVertices: Number of slow execution vertices at the 
> moment.
> 2. numEffectiveSpeculativeExecutions: Number of speculative executions 
> that finish before their corresponding original executions finish.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28402) Create FailureHandlingResultSnapshot with the truly failed execution

2022-07-11 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28402.
---
Resolution: Done

Done via edd10a1d8fd09f8a5e296ecdf0201945d77d5ff2

> Create FailureHandlingResultSnapshot with the truly failed execution
> 
>
> Key: FLINK-28402
> URL: https://issues.apache.org/jira/browse/FLINK-28402
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Previously, FailureHandlingResultSnapshot was always created by treating the 
> only current attempt of an execution vertex as the failed execution. This is 
> no longer correct in speculative execution cases, in which an execution vertex 
> can have multiple current executions, and any of them may fail.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28144) Let JobMaster support blocklist mechanism

2022-07-11 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28144.
---
Fix Version/s: 1.16.0
   Resolution: Done

Done via:
f2f83e1956eccecaa2371b21bddaf7778bb4f819
04f2f0c2660b312449419a3acb58a46a38d84f64
72ea8b5999bf36125aa5f1a38df4ec52c7a95702
387b2a473d0c0a8d58d1ca0401894dffc0527b31

> Let JobMaster support blocklist mechanism
> -
>
> Key: FLINK-28144
> URL: https://issues.apache.org/jira/browse/FLINK-28144
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Lijie Wang
>Assignee: Lijie Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> SlotPool should avoid allocating slots that are located on blocked nodes. To 
> do that, our core idea is to keep the SlotPool in such a state: there is no 
> slot in SlotPool that is free (no task assigned) and located on blocked nodes. 
> Details are as follows:
> 1. When receiving slot offers from task managers located on blocked nodes, 
> all offers should be rejected.
> 2. When a node is newly blocked, we should release all free (no task 
> assigned) slots on it. We need to find all task managers on blocked nodes and 
> release all free slots on them by 
> SlotPoolService#releaseFreeSlotsOnTaskManager.
> 3. When a slot state changes from reserved (task assigned) to free (no task 
> assigned), the SlotPool checks whether the corresponding task manager is 
> blocked. If yes, it releases the slot.
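
The three rules above, and the invariant they maintain, can be sketched in a
few lines of Python. This is a simplified model under stated assumptions, not
Flink's SlotPool: SlotPoolModel and its method names are hypothetical, and
slots are tracked per node rather than per task manager.

```python
class SlotPoolModel:
    """Hypothetical slot pool that never holds a free slot on a blocked node."""

    def __init__(self):
        self.blocked_nodes = set()
        self.slots = {}  # slot id -> (node id, is_free)

    def offer_slot(self, slot_id, node_id):
        # Rule 1: reject offers that come from blocked nodes.
        if node_id in self.blocked_nodes:
            return False
        self.slots[slot_id] = (node_id, True)
        return True

    def reserve(self, slot_id):
        # Assigns a task to the slot, so it is no longer free.
        node_id, _ = self.slots[slot_id]
        self.slots[slot_id] = (node_id, False)

    def block_node(self, node_id):
        # Rule 2: when a node is newly blocked, release its free slots.
        self.blocked_nodes.add(node_id)
        to_release = [
            s for s, (n, free) in self.slots.items() if n == node_id and free
        ]
        for slot_id in to_release:
            del self.slots[slot_id]

    def free_slot(self, slot_id):
        # Rule 3: a slot that becomes free on a blocked node is released
        # instead of being returned to the pool.
        node_id, _ = self.slots[slot_id]
        if node_id in self.blocked_nodes:
            del self.slots[slot_id]
        else:
            self.slots[slot_id] = (node_id, True)

    def invariant_holds(self):
        # True if no slot is both free and located on a blocked node.
        return not any(
            free and n in self.blocked_nodes for n, free in self.slots.values()
        )
```

Reserved slots on a blocked node are deliberately kept, so running tasks are
not disturbed; they leave the pool only once they become free.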



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-28402) Create FailureHandlingResultSnapshot with the truly failed execution

2022-07-08 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-28402:
---

Assignee: Zhu Zhu

> Create FailureHandlingResultSnapshot with the truly failed execution
> 
>
> Key: FLINK-28402
> URL: https://issues.apache.org/jira/browse/FLINK-28402
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> Previously, FailureHandlingResultSnapshot was always created by treating the 
> only current attempt of an execution vertex as the failed execution. This is 
> no longer correct in speculative execution cases, in which an execution vertex 
> can have multiple current executions, and any of them may fail.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-28392) RemoveCachedShuffleDescriptorTest#testRemoveOffloadedCacheForPointwiseEdgeAfterFailover causes fatal error on CI

2022-07-06 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu closed FLINK-28392.
---
Resolution: Fixed

Fixed via e5c4e3f519f364b5951e7cac331eb8af48f0ed84

> RemoveCachedShuffleDescriptorTest#testRemoveOffloadedCacheForPointwiseEdgeAfterFailover
>  causes fatal error on CI
> 
>
> Key: FLINK-28392
> URL: https://issues.apache.org/jira/browse/FLINK-28392
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.16.0
>Reporter: Martijn Visser
>Assignee: Zhu Zhu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.16.0
>
>
> {code:java}
> Jul 05 03:30:03 [ERROR] Error occurred in starting fork, check output in log
> Jul 05 03:30:03 [ERROR] Process Exit Code: 239
> Jul 05 03:30:03 [ERROR] Crashed tests:
> Jul 05 03:30:03 [ERROR] 
> org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategyTest
> Jul 05 03:30:03 [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException The forked VM terminated without properly saying goodbye. 
> VM crash or System.exit called?
> Jul 05 03:30:03 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-runtime && 
> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -XX:+UseG1GC -Xms256m -Xmx768m 
> -jar 
> /__w/1/s/flink-runtime/target/surefire/surefirebooter4932865857415988980.jar 
> /__w/1/s/flink-runtime/target/surefire 2022-07-05T03-23-25_404-jvmRun1 
> surefire8916732512419442726tmp surefire_2130262314165063415tmp
> Jul 05 03:30:03 [ERROR] Error occurred in starting fork, check output in log
> Jul 05 03:30:03 [ERROR] Process Exit Code: 239
> Jul 05 03:30:03 [ERROR] Crashed tests:
> Jul 05 03:30:03 [ERROR] 
> org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategyTest
> Jul 05 03:30:03 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
> Jul 05 03:30:03 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:405)
> Jul 05 03:30:03 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:321)
> Jul 05 03:30:03 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
> Jul 05 03:30:03 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
> Jul 05 03:30:03 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37602=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=7b25afdf-cc6c-566f-5459-359dc2585798=8147



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

