[jira] [Closed] (FLINK-28131) FLIP-168: Speculative Execution for Batch Job
[ https://issues.apache.org/jira/browse/FLINK-28131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28131.
---
Release Note: Speculative execution (FLIP-168) is introduced in Flink 1.16 to mitigate batch job slowness caused by problematic nodes. A problematic node may have hardware problems, unexpectedly busy I/O, or high CPU load. These problems may make the hosted tasks run much slower than tasks on other nodes and affect the overall execution time of a batch job.

When speculative execution is enabled, Flink keeps detecting slow tasks. Once slow tasks are detected, the nodes on which they are located are identified as problematic nodes and blocked via the blocklist mechanism (FLIP-224). The scheduler creates new attempts for the slow tasks and deploys them to nodes that are not blocked, while the existing attempts keep running. The new attempts process the same input data and produce the same data as the original attempt. The first attempt to finish is admitted as the only finished attempt of the task, and the remaining attempts of the task are canceled.

Most existing sources can work with speculative execution (FLIP-245). A source must implement SupportsHandleExecutionAttemptSourceEvent to support speculative execution only if it uses SourceEvent. Sinks do not support speculative execution yet, so speculative execution will not happen on sinks at the moment.

The Web UI and REST API are also improved (FLIP-249) to display multiple concurrent attempts of tasks and blocked task managers.
Resolution: Done

> FLIP-168: Speculative Execution for Batch Job
> ---
>
> Key: FLINK-28131
> URL: https://issues.apache.org/jira/browse/FLINK-28131
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is helpful to mitigate slow tasks caused by
> problematic nodes. The basic idea is to start mirror tasks on other nodes
> when a slow task is detected. The mirror task processes the same input data
> and produces the same data as the original task.
> More details can be found in
> [FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job].
>
> This is the umbrella ticket to track all the changes of this feature.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
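The "first finished attempt wins" rule described in the FLINK-28131 release note can be illustrated with a toy sketch in plain Java. This is not Flink scheduler code: the class FirstAttemptWins, the AttemptResult type, the delays and the summing workload are all hypothetical and only model two attempts that process the same input, where the first one to finish is kept and the other is canceled.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy model of "first finished attempt wins"; hypothetical, not Flink scheduler code. */
final class FirstAttemptWins {

    /** Result of one execution attempt: which attempt produced it, and the produced data. */
    static final class AttemptResult {
        final int attemptNumber;
        final int data;

        AttemptResult(int attemptNumber, int data) {
            this.attemptNumber = attemptNumber;
            this.data = data;
        }
    }

    /** Builds an attempt that sums 1..10 after an artificial delay (stand-in for node speed). */
    static Callable<AttemptResult> attempt(int attemptNumber, long delayMillis) {
        return () -> {
            Thread.sleep(delayMillis);
            int sum = 0;
            for (int i = 1; i <= 10; i++) {
                sum += i;
            }
            return new AttemptResult(attemptNumber, sum);
        };
    }

    /** Runs the original (slow) and the speculative (fast) attempt; returns the first to finish. */
    static AttemptResult runSpeculatively() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // invokeAny returns the result of a task that completed successfully
            // and cancels the tasks that have not completed yet.
            return pool.invokeAny(Arrays.asList(attempt(0, 30_000), attempt(1, 10)));
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        AttemptResult winner = runSpeculatively();
        System.out.println("attempt " + winner.attemptNumber + " produced " + winner.data);
    }
}
```

Both attempts compute the same value, so the result does not depend on which attempt wins; only the completion time does.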
[jira] [Reopened] (FLINK-28131) FLIP-168: Speculative Execution for Batch Job
[ https://issues.apache.org/jira/browse/FLINK-28131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reopened FLINK-28131:
---

> FLIP-168: Speculative Execution for Batch Job
> ---
>
> Key: FLINK-28131
> URL: https://issues.apache.org/jira/browse/FLINK-28131
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
>
> Speculative execution is helpful to mitigate slow tasks caused by
> problematic nodes. The basic idea is to start mirror tasks on other nodes
> when a slow task is detected. The mirror task processes the same input data
> and produces the same data as the original task.
> More details can be found in
> [FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job].
>
> This is the umbrella ticket to track all the changes of this feature.
[jira] [Closed] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28981.
---
Resolution: Done

> Release Testing: Verify FLIP-245 sources speculative execution
> ---
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Common, Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Yunhong Zheng
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporarily slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and
> FLIP-249.
> More details about this feature and how to use it can be found in this
> [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write Flink jobs which have some {{source}} subtasks running much slower
> than others. 3 kinds of sources should be verified, including
> -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
> -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
> -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
> - Modify the Flink configuration file to enable speculative execution and tune
> the configuration as you like
> - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Commented] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17599499#comment-17599499 ]

Zhu Zhu commented on FLINK-28981:
---
Thanks for helping with this release testing! [~337361...@qq.com]

> Release Testing: Verify FLIP-245 sources speculative execution
> ---
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Common, Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Yunhong Zheng
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporarily slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and
> FLIP-249.
> More details about this feature and how to use it can be found in this
> [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write Flink jobs which have some {{source}} subtasks running much slower
> than others. 3 kinds of sources should be verified, including
> -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
> -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
> -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
> - Modify the Flink configuration file to enable speculative execution and tune
> the configuration as you like
> - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Commented] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598711#comment-17598711 ]

Zhu Zhu commented on FLINK-28980:
---
Thanks for helping with the release testing! [~SleePy]

> Release Testing: Verify FLIP-168 speculative execution
> ---
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Biao Liu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
> Attachments: flink-root-standalonesession-0-VM_38_195_centos.log,
> flink-root-taskexecutor-0-VM_199_24_centos.log,
> flink-root-taskexecutor-0-VM_38_195_centos.log, screenshot1, screenshot2,
> stdout
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporarily slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write a Flink job which has a subtask running much slower than others
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be
> retrieved via InetAddress.getLocalHost().getHostName(), or if
> (subtaskIndex + attemptNumber) % 2 == 0)
> - Modify the Flink configuration file to enable speculative execution and tune
> the configuration as you like
> - Submit the job. Check the web UI, logs, metrics and produced result.
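The slow-subtask condition suggested in the FLINK-28980 description above can be sketched in plain Java. The class and method names (SlowSubtaskRule, shouldBeSlow) are hypothetical; in a real test job the subtask index and attempt number would come from the task's runtime context, which is outside this sketch.

```java
/** Hypothetical helper for a test job; decides which execution attempts should stall. */
final class SlowSubtaskRule {

    /** Returns true when (subtaskIndex + attemptNumber) is even, i.e. the attempt should be slow. */
    static boolean shouldBeSlow(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    public static void main(String[] args) {
        // Print the decision for the first two attempts of the first two subtasks.
        for (int subtask = 0; subtask < 2; subtask++) {
            for (int attempt = 0; attempt < 2; attempt++) {
                System.out.println("subtask " + subtask + ", attempt " + attempt
                        + ": slow=" + shouldBeSlow(subtask, attempt));
            }
        }
    }
}
```

With this rule, attempt 0 of every even-indexed subtask is slow while its attempt 1 is not, so a speculative attempt of such a subtask can finish even if the original attempt sleeps indefinitely.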
[jira] [Closed] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28980.
---
Resolution: Done

> Release Testing: Verify FLIP-168 speculative execution
> ---
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Biao Liu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
> Attachments: flink-root-standalonesession-0-VM_38_195_centos.log,
> flink-root-taskexecutor-0-VM_199_24_centos.log,
> flink-root-taskexecutor-0-VM_38_195_centos.log, screenshot1, screenshot2,
> stdout
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporarily slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write a Flink job which has a subtask running much slower than others
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be
> retrieved via InetAddress.getLocalHost().getHostName(), or if
> (subtaskIndex + attemptNumber) % 2 == 0)
> - Modify the Flink configuration file to enable speculative execution and tune
> the configuration as you like
> - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Updated] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-28907:
Fix Version/s: (was: 1.16.0)

> Flink docs do not compile locally
> ---
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 1.16.0
> Reporter: Zhu Zhu
> Priority: Major
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)
[jira] [Closed] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28907.
---
Resolution: Won't Fix

> Flink docs do not compile locally
> ---
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 1.16.0
> Reporter: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)
[jira] [Commented] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598246#comment-17598246 ]

Zhu Zhu commented on FLINK-28907:
---
Got it! Thanks for the explanation.

> Flink docs do not compile locally
> ---
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 1.16.0
> Reporter: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)
[jira] [Commented] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17598146#comment-17598146 ]

Zhu Zhu commented on FLINK-28907:
---
Thanks for looking into this! [~martijnvisser]
Do you mean that the process below should happen, but when there is a connection issue it does not happen and no error about the connection problem is reported?
> go: downloading github.com/apache/flink-connector-elasticsearch ...

> Flink docs do not compile locally
> ---
>
> Key: FLINK-28907
> URL: https://issues.apache.org/jira/browse/FLINK-28907
> Project: Flink
> Issue Type: Bug
> Components: Documentation
> Affects Versions: 1.16.0
> Reporter: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
>
> Flink docs fail to compile locally. The error is as below:
> go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a
> hugo: collected modules in 1832 ms
> Start building sites …
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
> ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
> WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
> Built in 6415 ms
> Error: Error building site: logged 6 error(s)
[jira] [Closed] (FLINK-28940) Release Testing: Verify FLIP-248 Dynamic Partition Pruning
[ https://issues.apache.org/jira/browse/FLINK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28940.
---
Resolution: Done

> Release Testing: Verify FLIP-248 Dynamic Partition Pruning
> ---
>
> Key: FLINK-28940
> URL: https://issues.apache.org/jira/browse/FLINK-28940
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / Planner, Table SQL / Runtime
> Affects Versions: 1.16.0
> Reporter: godfrey he
> Assignee: Zhu Zhu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> This issue aims to verify FLIP-248:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning
> We can verify it in the SQL client after we build the flink-dist package.
> 1. Create a partitioned table and a non-partitioned table (only the Hive
> connector is supported now; otherwise we need to write a new collector), and
> then insert some data.
> 2. Show the explain result for a join query in which one side is a
> partitioned table and the other side is a non-partitioned table with a filter,
> such as the example in the FLIP doc: select * from store_returns, date_dim where
> sr_returned_date_sk = d_date_sk and d_year = 2000. The explain result should
> contain a `DynamicFilteringDataCollector` node. We can also verify the plan
> for variants of the above query.
> 3. Execute the above plan and verify the execution result. (The result
> should be the same as that of the plan with dynamic filtering disabled
> via table.optimizer.dynamic-filtering.enabled.)
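The idea behind the example query in the FLINK-28940 description above (filter date_dim on d_year = 2000, then read only the matching store_returns partitions) can be sketched in plain Java. This is a conceptual illustration only, not Flink planner or runtime code; the class DynamicPartitionPruningSketch, its methods and the sample data are hypothetical, and store_returns is assumed to be partitioned by sr_returned_date_sk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Conceptual sketch of dynamic partition pruning; hypothetical, not Flink code. */
final class DynamicPartitionPruningSketch {

    /** Step 1: scan the filtered dimension side (d_date_sk -> d_year) and collect join keys. */
    static Set<Integer> collectDateKeys(Map<Integer, Integer> dateDim, int year) {
        Set<Integer> keys = new HashSet<>();
        for (Map.Entry<Integer, Integer> row : dateDim.entrySet()) {
            if (row.getValue() == year) {
                keys.add(row.getKey());
            }
        }
        return keys;
    }

    /** Step 2: read only the fact-table partitions whose partition key was collected. */
    static List<String> scanWithPruning(Map<Integer, List<String>> storeReturns, Set<Integer> keys) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> partition : storeReturns.entrySet()) {
            if (keys.contains(partition.getKey())) {
                rows.addAll(partition.getValue());
            }
        }
        return rows;
    }

    /** Builds small sample tables and returns the store_returns rows that survive pruning. */
    static List<String> demo() {
        Map<Integer, Integer> dateDim = new TreeMap<>();
        dateDim.put(1, 1999);
        dateDim.put(2, 2000);
        dateDim.put(3, 2000);
        dateDim.put(4, 2001);

        // Partition key sr_returned_date_sk -> rows stored in that partition.
        Map<Integer, List<String>> storeReturns = new TreeMap<>();
        storeReturns.put(1, Arrays.asList("r1"));
        storeReturns.put(2, Arrays.asList("r2", "r3"));
        storeReturns.put(3, Arrays.asList("r4"));
        storeReturns.put(4, Arrays.asList("r5"));

        return scanWithPruning(storeReturns, collectDateKeys(dateDim, 2000));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only partitions 2 and 3 are read in this sketch, which mirrors the check in step 3: the join result must not change, while fewer input records reach the join.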
[jira] [Commented] (FLINK-28940) Release Testing: Verify FLIP-248 Dynamic Partition Pruning
[ https://issues.apache.org/jira/browse/FLINK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597069#comment-17597069 ]

Zhu Zhu commented on FLINK-28940:
---
I have tested it and it looks good to me.
I used the SQL client for the test, connecting it to Hive. By running test jobs, I can see that DPP takes effect: the topology is modified to have a {{DynamicFilteringDataCollector}} and an {{Order-Enforcer}}. By comparing the number of input records of the join operator, I can see that the expected number of records is truly filtered out in advance. The job result is also as expected.

> Release Testing: Verify FLIP-248 Dynamic Partition Pruning
> ---
>
> Key: FLINK-28940
> URL: https://issues.apache.org/jira/browse/FLINK-28940
> Project: Flink
> Issue Type: Sub-task
> Components: Table SQL / Planner, Table SQL / Runtime
> Affects Versions: 1.16.0
> Reporter: godfrey he
> Assignee: Zhu Zhu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> This issue aims to verify FLIP-248:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning
> We can verify it in the SQL client after we build the flink-dist package.
> 1. Create a partitioned table and a non-partitioned table (only the Hive
> connector is supported now; otherwise we need to write a new collector), and
> then insert some data.
> 2. Show the explain result for a join query in which one side is a
> partitioned table and the other side is a non-partitioned table with a filter,
> such as the example in the FLIP doc: select * from store_returns, date_dim where
> sr_returned_date_sk = d_date_sk and d_year = 2000. The explain result should
> contain a `DynamicFilteringDataCollector` node. We can also verify the plan
> for variants of the above query.
> 3. Execute the above plan and verify the execution result. (The result
> should be the same as that of the plan with dynamic filtering disabled
> via table.optimizer.dynamic-filtering.enabled.)
[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-28981:
Description:
Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
- FLIP-168: Speculative Execution core part
- FLIP-224: Blocklist Mechanism
- FLIP-245: Source Supports Speculative Execution
- FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].

To do the verification, the process can be:
- Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including
-- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
-- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
-- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
- Modify the Flink configuration file to enable speculative execution and tune the configuration as you like
- Submit the job. Check the web UI, logs, metrics and produced result.

was:
Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
- FLIP-168: Speculative Execution core part
- FLIP-224: Blocklist Mechanism
- FLIP-245: Source Supports Speculative Execution
- FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this [document|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].

To do the verification, the process can be:
- Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including
-- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
-- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
-- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
- Modify the Flink configuration file to enable speculative execution and tune the configuration as you like
- Submit the job. Check the web UI, logs, metrics and produced result.

> Release Testing: Verify FLIP-245 sources speculative execution
> ---
>
> Key: FLINK-28981
> URL: https://issues.apache.org/jira/browse/FLINK-28981
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Common, Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Yunhong Zheng
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporarily slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and
> FLIP-249.
> More details about this feature and how to use it can be found in this
> [document|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write Flink jobs which have some {{source}} subtasks running much slower
> than others. 3 kinds of sources should be verified, including
> -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
> -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
> -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
> - Modify Flink configuration file to
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-28980:
Description:
Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
- FLIP-168: Speculative Execution core part
- FLIP-224: Blocklist Mechanism
- FLIP-245: Source Supports Speculative Execution
- FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].

To do the verification, the process can be:
- Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if (subtaskIndex + attemptNumber) % 2 == 0)
- Modify the Flink configuration file to enable speculative execution and tune the configuration as you like
- Submit the job. Check the web UI, logs, metrics and produced result.

was:
Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
- FLIP-168: Speculative Execution core part
- FLIP-224: Blocklist Mechanism
- FLIP-245: Source Supports Speculative Execution
- FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this [documentation|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/].

To do the verification, the process can be:
- Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if (subtaskIndex + attemptNumber) % 2 == 0)
- Modify the Flink configuration file to enable speculative execution and tune the configuration as you like
- Submit the job. Check the web UI, logs, metrics and produced result.

> Release Testing: Verify FLIP-168 speculative execution
> ---
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Biao Liu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporarily slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write a Flink job which has a subtask running much slower than others
> (e.g. sleep indefinitely if it runs on a certain host, whose hostname can be
> retrieved via InetAddress.getLocalHost().getHostName(), or if
> (subtaskIndex + attemptNumber) % 2 == 0)
> - Modify the Flink configuration file to enable speculative execution and tune
> the configuration as you like
> - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28980: Description: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this [documentation|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumber) % 2 == 0) - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g.
sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumber) % 2 == 0) - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Biao Liu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. > More details about this feature and how to use it can be found in this > [documentation|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/]. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumber) % 2 == 0) > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28981: Description: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this [document|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/]. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes.
This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-245 sources speculative execution > -- > > Key: FLINK-28981 > URL: https://issues.apache.org/jira/browse/FLINK-28981 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Yunhong Zheng >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes.
This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and > FLIP-249. > More details about this feature and how to use it can be found in this > [document|https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/]. > To do the verification, the process can be: > - Write Flink jobs which have some {{source}} subtasks running much slower > than others. 3 kinds of sources should be verified, including > -- [Source > functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] > -- [InputFormat > sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] > -- [FLIP-27 new > sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] > - Modify the Flink configuration file to enable
[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28981: Description: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-245 sources speculative execution > -- > > Key: FLINK-28981 > URL: https://issues.apache.org/jira/browse/FLINK-28981 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Yunhong Zheng >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and > FLIP-249. > More details about this feature and how to use it can be found in this > https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/speculative_execution/.
> To do the verification, the process can be: > - Write Flink jobs which have some {{source}} subtasks running much slower > than others. 3 kinds of sources should be verified, including > -- [Source > functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] > -- [InputFormat > sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] > -- [FLIP-27 new > sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you
[jira] [Commented] (FLINK-28940) Release Testing: Verify FLIP-248 Dynamic Partition Prunning
[ https://issues.apache.org/jira/browse/FLINK-28940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17584162#comment-17584162 ] Zhu Zhu commented on FLINK-28940: - I'm working on it and it may need a bit more time. > Release Testing: Verify FLIP-248 Dynamic Partition Prunning > --- > > Key: FLINK-28940 > URL: https://issues.apache.org/jira/browse/FLINK-28940 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner, Table SQL / Runtime >Affects Versions: 1.16.0 >Reporter: godfrey he >Assignee: Zhu Zhu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > This issue aims to verify FLIP-248: > https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning > We can verify it in the SQL client after we build the flink-dist package. > 1. create a partition table and a non-partition table (only the hive connector is > supported now, or we need to write a new collector), and then insert some data > 2. show the explain result for a join query in which one side is a > partition table and the other side is a non-partition table with a filter, such as > the example in the FLIP doc: select * from store_returns, date_dim where > sr_returned_date_sk = d_date_sk and d_year = 2000. The explain result should > contain a `DynamicFilteringDataCollector` node. We can also verify the plan for > the various variants of the above query. > 3. execute the above plan and verify the execution result. (the execution > result should be the same as the result of a plan that disables dynamic filtering > via table.optimizer.dynamic-filtering.enabled)
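For readers unfamiliar with the idea being verified in FLINK-28940, the effect of dynamic partition pruning on the example query can be sketched in plain Java. This is only a conceptual illustration under assumptions, not Flink's implementation: the class `PartitionPruningSketch`, its methods, and the in-memory data layout are all hypothetical. The filtered side (date_dim with d_year = 2000) yields a set of join keys, and only fact-table partitions whose key is in that set need to be read.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Conceptual sketch only; every name here is hypothetical and not a Flink API. */
final class PartitionPruningSketch {

    private PartitionPruningSketch() {}

    /**
     * Collects the join keys of the dimension side that survive the filter.
     * Each row is {d_date_sk, d_year}.
     */
    static Set<Integer> collectKeys(List<int[]> dimRows, int year) {
        Set<Integer> keys = new TreeSet<>();
        for (int[] row : dimRows) {
            if (row[1] == year) {
                keys.add(row[0]);
            }
        }
        return keys;
    }

    /**
     * Keeps only the fact-table partitions (keyed by sr_returned_date_sk)
     * whose partition key appears in the collected key set.
     */
    static Map<Integer, List<String>> prune(
            Map<Integer, List<String>> partitions, Set<Integer> keys) {
        Map<Integer, List<String>> kept = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<String>> entry : partitions.entrySet()) {
            if (keys.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

The join result over the pruned partitions should equal the result over all partitions, which is what step 3 of the ticket checks by comparing runs with dynamic filtering enabled and disabled.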
[jira] [Closed] (FLINK-28139) Add documentation for speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28139. --- Resolution: Done Done via 70d9f6c31b289b6ea284a02fdb1d8cfc1a1a5414 > Add documentation for speculative execution > --- > > Key: FLINK-28139 > URL: https://issues.apache.org/jira/browse/FLINK-28139 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > >
[jira] [Commented] (FLINK-28213) StreamExecutionEnvironment configure method support override pipeline.jars option
[ https://issues.apache.org/jira/browse/FLINK-28213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580685#comment-17580685 ] Zhu Zhu commented on FLINK-28213: - I agree that we should not change the semantics of the {{PublicEvolving}} method {{StreamExecutionEnvironment#configure()}} from setting to merging. Maybe we can add the user-set {{PipelineOptions.JARS}} to {{TableConfig.configuration}} on table environment initialization. Later the table module can add new jars into {{PipelineOptions.JARS}} in {{TableConfig.configuration}}. And finally the {{PipelineOptions.JARS}} in {{TableConfig.configuration}} can just override that in the {{configuration}} in {{StreamExecutionEnvironment}} via {{StreamExecutionEnvironment#configure()}}. This behavior can be explained as the table module enriching/modifying the user-set {{PipelineOptions.JARS}} with program options or job configs, which I think is acceptable, because it's already happening (e.g. ExecutionConfigAccessor#fromProgramOptions(...)). > StreamExecutionEnvironment configure method support override pipeline.jars > option > - > > Key: FLINK-28213 > URL: https://issues.apache.org/jira/browse/FLINK-28213 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Affects Versions: 1.16.0 >Reporter: dalongliu >Assignee: dalongliu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > >
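The proposal in the comment above can be sketched with plain Java collections. This is a simplified model under assumptions, not Flink code: the class `JarsOptionSketch` and its methods are hypothetical, and a `Map<String, List<String>>` stands in for a configuration object. It shows how "setting" semantics in the configure step can still yield the union of jars, provided the table-side configuration was seeded with the user-set value first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model; names are hypothetical and do not correspond to Flink classes. */
final class JarsOptionSketch {

    static final String JARS_KEY = "pipeline.jars";

    private JarsOptionSketch() {}

    /** "Setting" semantics: each incoming value replaces the existing one, no merging. */
    static void configure(
            Map<String, List<String>> envConfig, Map<String, List<String>> incoming) {
        for (Map.Entry<String, List<String>> entry : incoming.entrySet()) {
            envConfig.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
    }

    /** Seeds the table-side configuration with the jars the user already set. */
    static Map<String, List<String>> initTableConfig(Map<String, List<String>> envConfig) {
        Map<String, List<String>> tableConfig = new HashMap<>();
        List<String> userJars = envConfig.getOrDefault(JARS_KEY, Collections.emptyList());
        tableConfig.put(JARS_KEY, new ArrayList<>(userJars));
        return tableConfig;
    }

    /** The table module appends a jar to its own copy of the option. */
    static void addJar(Map<String, List<String>> tableConfig, String jar) {
        tableConfig.computeIfAbsent(JARS_KEY, key -> new ArrayList<>()).add(jar);
    }
}
```

Because the table-side list already contains the user's jars, overriding the environment's value with it loses nothing, and the configure step itself never has to merge.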
[jira] [Assigned] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28980: --- Assignee: Biao Liu > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Biao Liu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumber) % 2 == 0) > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Check the web UI, logs, metrics and produced result.
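As an illustration of the first verification step, the two slow-subtask conditions mentioned in the ticket could be sketched in plain Java as follows. This is a minimal sketch under assumptions: the class `SlowSubtaskPredicate` and its method names are hypothetical and not part of Flink. Inside a real job the two indices would presumably be read from the function's runtime context (for example via `getIndexOfThisSubtask()` and `getAttemptNumber()`), which is not shown here so that the snippet stays self-contained.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical test helper; not part of Flink. */
final class SlowSubtaskPredicate {

    private SlowSubtaskPredicate() {}

    /** Parity rule from the ticket: stall when (subtaskIndex + attemptNumber) is even. */
    static boolean stallsByAttempt(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** Host rule from the ticket: stall when running on the designated slow host. */
    static boolean stallsOnHost(String slowHost) {
        try {
            return InetAddress.getLocalHost().getHostName().equals(slowHost);
        } catch (UnknownHostException e) {
            // If the hostname cannot be resolved, treat this host as not slow.
            return false;
        }
    }

    /** Sleeps until the thread is interrupted, e.g. when the attempt is cancelled. */
    static void sleepUntilInterrupted() throws InterruptedException {
        while (true) {
            Thread.sleep(1_000L);
        }
    }
}
```

With the parity rule, an attempt whose number is one higher than a stalled attempt of the same subtask does not stall. Assuming a speculative attempt receives the next attempt number, it can finish while the original attempt keeps sleeping, which is the behavior the test is meant to expose.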
[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28981: Description: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including -- [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] -- [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] -- [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including - [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] - [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] - [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-245 sources speculative execution > -- > > Key: FLINK-28981 > URL: https://issues.apache.org/jira/browse/FLINK-28981 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Runtime / Coordination >Reporter: Zhu Zhu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and > FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write Flink jobs which have some {{source}} subtasks running much slower > than others.
3 kinds of sources should be verified, including > -- [Source > functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] > -- [InputFormat > sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] > -- [FLIP-27 new > sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28980: Labels: release-testing (was: test-stability) > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumber) % 2 == 0) > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28981: Description: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including - [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] - [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] - [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporarily slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249.
More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write Flink jobs which have some {{source}} subtasks running much slower than others. 3 kinds of sources should be verified, including - [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] - [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] - [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] - Modify the Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Check the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-245 sources speculative execution > -- > > Key: FLINK-28981 > URL: https://issues.apache.org/jira/browse/FLINK-28981 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Runtime / Coordination >Reporter: Zhu Zhu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and > FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write Flink jobs which have some {{source}} subtasks running much slower > than others.
3 kinds of sources should be verified, including > - [Source > functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] > - [InputFormat > sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] > - [FLIP-27 new > sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Updated] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28981: Labels: release-testing (was: test-stability) > Release Testing: Verify FLIP-245 sources speculative execution > -- > > Key: FLINK-28981 > URL: https://issues.apache.org/jira/browse/FLINK-28981 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Runtime / Coordination >Reporter: Zhu Zhu >Priority: Blocker > Labels: release-testing > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporarily slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and > FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write Flink jobs which have some {{source}} subtasks running much slower > than others. 3 kinds of sources should be verified, including > - [Source > functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java] > - [InputFormat > sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java] > - [FLIP-27 new > sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java] > - Modify the Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Check the web UI, logs, metrics and produced result.
[jira] [Closed] (FLINK-28130) FLIP-224: Blocklist Mechanism
[ https://issues.apache.org/jira/browse/FLINK-28130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28130.
---
    Resolution: Done

> FLIP-224: Blocklist Mechanism
> ---
>
> Key: FLINK-28130
> URL: https://issues.apache.org/jira/browse/FLINK-28130
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Affects Versions: 1.16.0
> Reporter: Lijie Wang
> Assignee: Lijie Wang
> Priority: Major
> Fix For: 1.16.0
>
> In order to support speculative execution for batch jobs ([FLIP-168|https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job]), we need a mechanism to block resources on nodes where the slow tasks are located. We propose to introduce a blocklist mechanism as follows: once a node is marked as blocked, future slots should not be allocated from the blocked node, but the slots that are already allocated will not be affected.
> For more details, see [FLIP-224|https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism].
> This is the umbrella ticket to track all the changes of this feature.
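The blocklist rule stated in FLINK-28130 (no new slots from a blocked node, existing slots untouched) can be illustrated with a small toy model. This is only a sketch: `BlocklistSketch`, `blockNode`, `allocateSlot` and the node names are hypothetical and are not Flink's actual classes or APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Toy model of the blocklist rule; hypothetical, not Flink's real implementation. */
public class BlocklistSketch {
    private final Set<String> blockedNodes = new HashSet<>();
    private final List<String> allocatedSlots = new ArrayList<>();

    /** Marks a node as blocked. Slots already allocated on it are left alone. */
    public void blockNode(String nodeId) {
        blockedNodes.add(nodeId);
    }

    /** Allocates a slot on the first candidate node that is not blocked, if any. */
    public Optional<String> allocateSlot(List<String> candidateNodes) {
        for (String node : candidateNodes) {
            if (!blockedNodes.contains(node)) {
                allocatedSlots.add(node);
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }

    /** Returns a copy of the nodes on which slots are currently allocated. */
    public List<String> allocatedSlots() {
        return new ArrayList<>(allocatedSlots);
    }

    public static void main(String[] args) {
        BlocklistSketch pool = new BlocklistSketch();
        List<String> nodes = Arrays.asList("node-a", "node-b");

        pool.allocateSlot(nodes);   // first slot lands on node-a
        pool.blockNode("node-a");   // node-a is blocked; its existing slot stays
        pool.allocateSlot(nodes);   // next slot skips node-a and lands on node-b

        System.out.println(pool.allocatedSlots()); // prints [node-a, node-b]
    }
}
```

The slot on `node-a` remains in the allocated list after the node is blocked, which mirrors the "already allocated slots will not be affected" part of the proposal.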
[jira] [Closed] (FLINK-28587) FLIP-249: Flink Web UI Enhancement for Speculative Execution
[ https://issues.apache.org/jira/browse/FLINK-28587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28587.
---
    Fix Version/s: 1.16.0
    Resolution: Done

> FLIP-249: Flink Web UI Enhancement for Speculative Execution
> ---
>
> Key: FLINK-28587
> URL: https://issues.apache.org/jira/browse/FLINK-28587
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / REST, Runtime / Web Frontend
> Affects Versions: 1.16.0
> Reporter: Gen Luo
> Assignee: Gen Luo
> Priority: Major
> Fix For: 1.16.0
>
> As a follow-up step of FLIP-168 and FLIP-224, the Flink Web UI needs to be enhanced to display the related information if the speculative execution mechanism is enabled.
[jira] [Closed] (FLINK-28397) [FLIP-245] Source Supports Speculative Execution For Batch Job
[ https://issues.apache.org/jira/browse/FLINK-28397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28397.
---
    Resolution: Done

> [FLIP-245] Source Supports Speculative Execution For Batch Job
> ---
>
> Key: FLINK-28397
> URL: https://issues.apache.org/jira/browse/FLINK-28397
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / Coordination
> Affects Versions: 1.16.0
> Reporter: Jing Zhang
> Assignee: Jing Zhang
> Priority: Major
> Fix For: 1.16.0
>
> This is the umbrella ticket of [FLIP-245|https://cwiki.apache.org/confluence/display/FLINK/FLIP-245%3A+Source+Supports+Speculative+Execution+For+Batch+Job].
[jira] [Created] (FLINK-28981) Release Testing: Verify FLIP-245 sources speculative execution
Zhu Zhu created FLINK-28981:
---

Summary: Release Testing: Verify FLIP-245 sources speculative execution
Key: FLINK-28981
URL: https://issues.apache.org/jira/browse/FLINK-28981
Project: Flink
Issue Type: Sub-task
Components: Connectors / Common, Runtime / Coordination
Reporter: Zhu Zhu
Fix For: 1.16.0

Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
- FLIP-168: Speculative Execution core part
- FLIP-224: Blocklist Mechanism
- FLIP-245: Source Supports Speculative Execution
- FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-245, along with FLIP-168, FLIP-224 and FLIP-249.

More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507].

To do the verification, the process can be:
- Write Flink jobs that have some {{source}} subtasks running much slower than others. Three kinds of sources should be verified:
  - [Source functions|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SourceFunction.java]
  - [InputFormat sources|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/InputFormat.java]
  - [FLIP-27 new sources|https://github.com/apache/flink/blob/master//flink-core/src/main/java/org/apache/flink/api/connector/source/Source.java]
- Modify the Flink configuration file to enable speculative execution and tune the configuration as you like.
- Submit the job and check the web UI, logs, metrics, and produced result.
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28980: Description: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporary slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims for verifying FLIP-168, along with FLIP-224 and FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumer) % 2 == 0) > - Modify Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Checking the web UI, logs, metrics and produced result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28980: Description: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporary slow > tasks caused by slow nodes. This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumer) % 2 == 0) > - Modify Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Checking the web UI, logs, metrics and produced result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28980: Description: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. More details about this feature and how to use it can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. More details about this feature can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporary slow > tasks caused by slow nodes. > This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. > More details about this feature and how to use it can be found in this > documentation [PR|https://github.com/apache/flink/pull/20507]. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumer) % 2 == 0) > - Modify Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Checking the web UI, logs, metrics and produced result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28980: Description: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. More details about this feature can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. was: Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. More details about this feature can be found in this documentation [PR|https://github.com/apache/flink/pull/20507]. This feature currently consists of 4 FLIPs: - FLIP-168: Speculative Execution core part - FLIP-224: Blocklist Mechanism - FLIP-245: Source Supports Speculative Execution - FLIP-249: Flink Web UI Enhancement for Speculative Execution This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. To do the verification, the process can be: - Write a Flink job which has a subtask running much slower than others (e.g. 
sleep indefinitely if it runs on a certain host, the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumer) % 2 == 0) - Modify Flink configuration file to enable speculative execution and tune the configuration as you like - Submit the job. Checking the web UI, logs, metrics and produced result. > Release Testing: Verify FLIP-168 speculative execution > -- > > Key: FLINK-28980 > URL: https://issues.apache.org/jira/browse/FLINK-28980 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Speculative execution is introduced in Flink 1.16 to deal with temporary slow > tasks caused by slow nodes. More details about this feature can be found in > this documentation [PR|https://github.com/apache/flink/pull/20507]. > This feature currently consists of 4 FLIPs: > - FLIP-168: Speculative Execution core part > - FLIP-224: Blocklist Mechanism > - FLIP-245: Source Supports Speculative Execution > - FLIP-249: Flink Web UI Enhancement for Speculative Execution > This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249. > To do the verification, the process can be: > - Write a Flink job which has a subtask running much slower than others > (e.g. sleep indefinitely if it runs on a certain host, the hostname can be > retrieved via InetAddress.getLocalHost().getHostName(), or if its > (subtaskIndex + attemptNumer) % 2 == 0) > - Modify Flink configuration file to enable speculative execution and tune > the configuration as you like > - Submit the job. Checking the web UI, logs, metrics and produced result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28980) Release Testing: Verify FLIP-168 speculative execution
Zhu Zhu created FLINK-28980:
---

Summary: Release Testing: Verify FLIP-168 speculative execution
Key: FLINK-28980
URL: https://issues.apache.org/jira/browse/FLINK-28980
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Reporter: Zhu Zhu
Fix For: 1.16.0

Speculative execution is introduced in Flink 1.16 to deal with temporary slow tasks caused by slow nodes. More details about this feature can be found in this documentation [PR|https://github.com/apache/flink/pull/20507].

This feature currently consists of 4 FLIPs:
- FLIP-168: Speculative Execution core part
- FLIP-224: Blocklist Mechanism
- FLIP-245: Source Supports Speculative Execution
- FLIP-249: Flink Web UI Enhancement for Speculative Execution

This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.

To do the verification, the process can be:
- Write a Flink job that has a subtask running much slower than others (e.g. one that sleeps indefinitely if it runs on a certain host, where the hostname can be retrieved via InetAddress.getLocalHost().getHostName(), or if its (subtaskIndex + attemptNumber) % 2 == 0).
- Modify the Flink configuration file to enable speculative execution and tune the configuration as you like.
- Submit the job and check the web UI, logs, metrics, and produced result.
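The two slowness conditions suggested in FLINK-28980 can be sketched in plain Java as follows. This is a minimal illustration, not code from the ticket: the class and method names (`SlowSubtaskSketch`, `stallsByAttempt`, `stallsOnHost`, `maybeStall`) are hypothetical. In a real Flink job, the subtask index and attempt number would presumably be read from the task's `RuntimeContext` inside a user function, which this standalone sketch does not do.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper showing the two "slow subtask" conditions from the ticket. */
public class SlowSubtaskSketch {

    /** Condition 2 from the ticket: stall when (subtaskIndex + attemptNumber) is even. */
    public static boolean stallsByAttempt(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** Condition 1 from the ticket: stall when running on one designated host. */
    public static boolean stallsOnHost(String slowHost) {
        try {
            return InetAddress.getLocalHost().getHostName().equals(slowHost);
        } catch (UnknownHostException e) {
            // If the local hostname cannot be resolved, treat the host as not slow.
            return false;
        }
    }

    /** Sleeps for a practically unbounded time when the subtask is meant to be slow. */
    public static void maybeStall(boolean stall) throws InterruptedException {
        if (stall) {
            Thread.sleep(Long.MAX_VALUE);
        }
    }

    public static void main(String[] args) {
        // Show which (subtask, attempt) pairs would stall under the parity rule.
        for (int subtask = 0; subtask < 2; subtask++) {
            for (int attempt = 0; attempt < 2; attempt++) {
                System.out.printf(
                        "subtask %d, attempt %d stalls: %b%n",
                        subtask, attempt, stallsByAttempt(subtask, attempt));
            }
        }
    }
}
```

With the parity rule, attempt 0 of subtask 0 stalls while attempt 1 of the same subtask does not, so a later attempt with the next attempt number would be able to finish.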
[jira] [Comment Edited] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579512#comment-17579512 ]

Zhu Zhu edited comment on FLINK-28878 at 8/15/22 3:44 AM:
---

Fixed via
master: 5f8f387cba774a2c3900ea38e8a3dad017cf1790
release-1.15: ded03b750f46d8636d6744d4e094943d04f787dd

was (Author: zhuzh):
Fixed via 5f8f387cba774a2c3900ea38e8a3dad017cf1790

> PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
> ---
>
> Key: FLINK-28878
> URL: https://issues.apache.org/jira/browse/FLINK-28878
> Project: Flink
> Issue Type: Bug
> Components: Tests
> Affects Versions: 1.14.5, 1.15.1, 1.16.0
> Reporter: Huang Xingbo
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available, test-stability
> Fix For: 1.16.0
>
> {code:java}
> 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException Time elapsed: 20.288 s <<< FAILURE!
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError:
> 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43
> 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is
> 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was
> 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at org.junit.Assert.assertThat(Assert.java:964)
> 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at org.junit.Assert.assertThat(Assert.java:930)
> 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28878: Fix Version/s: 1.15.2 > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.14.5, 1.15.1, 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.16.0, 1.15.2 > > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Closed] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28878. --- Resolution: Fixed Fixed via 5f8f387cba774a2c3900ea38e8a3dad017cf1790 > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28878: Component/s: Tests (was: Runtime / Coordination) > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.15.1, 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28878: Affects Version/s: 1.14.5 > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.14.5, 1.15.1, 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28878: Affects Version/s: 1.15.1 > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.1, 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Commented] (FLINK-28766) UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException
[ https://issues.apache.org/jira/browse/FLINK-28766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578887#comment-17578887 ] Zhu Zhu commented on FLINK-28766: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39908=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7 > UnalignedCheckpointStressITCase.runStressTest failed with NoSuchFileException > - > > Key: FLINK-28766 > URL: https://issues.apache.org/jira/browse/FLINK-28766 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.16.0 >Reporter: Huang Xingbo >Priority: Critical > Labels: test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-01T01:36:16.0563880Z Aug 01 01:36:16 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest > Time elapsed: 12.579 s <<< ERROR! > 2022-08-01T01:36:16.0565407Z Aug 01 01:36:16 java.io.UncheckedIOException: > java.nio.file.NoSuchFileException: > /tmp/junit1058240190382532303/f0f99754a53d2c4633fed75011da58dd/chk-7/61092e4a-5b9a-4f56-83f7-d9960c53ed3e > 2022-08-01T01:36:16.0566296Z Aug 01 01:36:16 at > java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88) > 2022-08-01T01:36:16.0566972Z Aug 01 01:36:16 at > java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104) > 2022-08-01T01:36:16.0567600Z Aug 01 01:36:16 at > java.util.Iterator.forEachRemaining(Iterator.java:115) > 2022-08-01T01:36:16.0568290Z Aug 01 01:36:16 at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > 2022-08-01T01:36:16.0569172Z Aug 01 01:36:16 at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > 2022-08-01T01:36:16.0569877Z Aug 01 01:36:16 at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > 2022-08-01T01:36:16.0570554Z Aug 01 01:36:16 at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > 
2022-08-01T01:36:16.0571371Z Aug 01 01:36:16 at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > 2022-08-01T01:36:16.0572417Z Aug 01 01:36:16 at > java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546) > 2022-08-01T01:36:16.0573618Z Aug 01 01:36:16 at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:289) > 2022-08-01T01:36:16.0575187Z Aug 01 01:36:16 at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:262) > 2022-08-01T01:36:16.0576540Z Aug 01 01:36:16 at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:158) > 2022-08-01T01:36:16.0577684Z Aug 01 01:36:16 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2022-08-01T01:36:16.0578546Z Aug 01 01:36:16 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2022-08-01T01:36:16.0579374Z Aug 01 01:36:16 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2022-08-01T01:36:16.0580298Z Aug 01 01:36:16 at > java.lang.reflect.Method.invoke(Method.java:498) > 2022-08-01T01:36:16.0581243Z Aug 01 01:36:16 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 2022-08-01T01:36:16.0582029Z Aug 01 01:36:16 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2022-08-01T01:36:16.0582766Z Aug 01 01:36:16 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 2022-08-01T01:36:16.0583488Z Aug 01 01:36:16 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2022-08-01T01:36:16.0584203Z Aug 01 01:36:16 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2022-08-01T01:36:16.0585087Z Aug 01 01:36:16 
at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2022-08-01T01:36:16.0585778Z Aug 01 01:36:16 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-08-01T01:36:16.0586482Z Aug 01 01:36:16 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 2022-08-01T01:36:16.0587155Z Aug 01 01:36:16 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > 2022-08-01T01:36:16.0587809Z Aug 01 01:36:16 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > 2022-08-01T01:36:16.0588434Z Aug 01 01:36:16 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > 2022-08-01T01:36:16.0589203Z Aug 01 01:36:16 at
[jira] [Assigned] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28878: --- Assignee: Zhu Zhu > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: test-stability > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Updated] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28878: Fix Version/s: 1.16.0 > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Huang Xingbo >Assignee: Zhu Zhu >Priority: Major > Labels: test-stability > Fix For: 1.16.0 > > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! > 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
[jira] [Commented] (FLINK-28878) PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with AssertionError
[ https://issues.apache.org/jira/browse/FLINK-28878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578780#comment-17578780 ] Zhu Zhu commented on FLINK-28878: - Thanks for reporting it! [~hxbks2ks] The test failed because it ran unexpectedly slowly (possibly due to a slow environment). The slowness caused a slot request timeout, which triggered an additional failover. As a result, the job failover count was 2 instead of the expected 1. I will increase the slot request timeout to make the test more stable. > PipelinedRegionSchedulingITCase.testRecoverFromPartitionException failed with > AssertionError > > > Key: FLINK-28878 > URL: https://issues.apache.org/jira/browse/FLINK-28878 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Huang Xingbo >Priority: Major > Labels: test-stability > > {code:java} > 2022-08-08T20:38:43.3934646Z Aug 08 20:38:43 [ERROR] > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException > Time elapsed: 20.288 s <<< FAILURE! 
> 2022-08-08T20:38:43.3935309Z Aug 08 20:38:43 java.lang.AssertionError: > 2022-08-08T20:38:43.3937070Z Aug 08 20:38:43 > 2022-08-08T20:38:43.3938015Z Aug 08 20:38:43 Expected: is > 2022-08-08T20:38:43.3940277Z Aug 08 20:38:43 but: was > 2022-08-08T20:38:43.3940927Z Aug 08 20:38:43 at > org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > 2022-08-08T20:38:43.3941571Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:964) > 2022-08-08T20:38:43.3942120Z Aug 08 20:38:43 at > org.junit.Assert.assertThat(Assert.java:930) > 2022-08-08T20:38:43.3943202Z Aug 08 20:38:43 at > org.apache.flink.test.scheduling.PipelinedRegionSchedulingITCase.testRecoverFromPartitionException(PipelinedRegionSchedulingITCase.java:98) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39652=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7=9994
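For reference, the slot request timeout discussed in the FLINK-28878 comment above corresponds, as far as I know, to the slot.request.timeout option (in milliseconds; the default is 5 minutes). A minimal configuration sketch of raising it is shown below. The value is illustrative only and is an assumption, not the one used in the fix commit, and the test itself presumably sets the option programmatically in its test configuration rather than through flink-conf.yaml.

```yaml
# flink-conf.yaml fragment (illustrative value; an assumption, not taken from the fix commit)
# Timeout in milliseconds for requesting a slot; the default is 300000 (5 minutes).
slot.request.timeout: 600000
```

A larger timeout makes it less likely that a slow environment causes a slot request to time out and trigger a failover the test does not expect.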
[jira] [Updated] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28907: Description: Flink docs fail to compile locally. The error is as below: go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a hugo: collected modules in 1832 ms Start building sites … ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead. ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead. Built in 6415 ms Error: Error building site: logged 6 error(s) was: Flink docs suddenly fail to compile in my local environment, without any new change or rebase. 
The error is as below: go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a hugo: collected modules in 1832 ms Start building sites … ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead. ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead. Built in 6415 ms Error: Error building site: logged 6 error(s) > Flink docs do not compile locally > - > > Key: FLINK-28907 > URL: https://issues.apache.org/jira/browse/FLINK-28907 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Flink docs fail to compile locally. 
The error is as below: > go: github.com/apache/flink-connector-elasticsearch/docs upgrade => > v0.0.0-20220715033920-cbeb08187b3a > hugo: collected modules in 1832 ms > Start building sites … > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' > instead. > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page > not found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not > found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": >
[jira] [Commented] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577905#comment-17577905 ] Zhu Zhu commented on FLINK-28907: - I tried building the docs of the latest Flink master on other dev machines, and the problem does not occur there, so it is possibly not a common problem. > Flink docs do not compile locally > - > > Key: FLINK-28907 > URL: https://issues.apache.org/jira/browse/FLINK-28907 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Flink docs suddenly fail to compile in my local environment, without any new > change or rebase. The error is as below: > go: github.com/apache/flink-connector-elasticsearch/docs upgrade => > v0.0.0-20220715033920-cbeb08187b3a > hugo: collected modules in 1832 ms > Start building sites … > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' > instead. 
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page > not found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not > found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' > instead. > Built in 6415 ms > Error: Error building site: logged 6 error(s)
[jira] [Updated] (FLINK-28907) Flink docs do not compile locally
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28907: Summary: Flink docs do not compile locally (was: Flink docs do not compile) > Flink docs do not compile locally > - > > Key: FLINK-28907 > URL: https://issues.apache.org/jira/browse/FLINK-28907 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Flink docs suddenly fail to compile in my local environment, without any new > change or rebase. The error is as below: > go: github.com/apache/flink-connector-elasticsearch/docs upgrade => > v0.0.0-20220715033920-cbeb08187b3a > hugo: collected modules in 1832 ms > Start building sites … > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' > instead. 
> ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page > not found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not > found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' > instead. > Built in 6415 ms > Error: Error building site: logged 6 error(s)
[jira] [Updated] (FLINK-28907) Flink docs do not compile
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28907: Priority: Major (was: Blocker) > Flink docs do not compile > - > > Key: FLINK-28907 > URL: https://issues.apache.org/jira/browse/FLINK-28907 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Flink docs suddenly fail to compile in my local environment, without any new > change or rebase. The error is as below: > go: github.com/apache/flink-connector-elasticsearch/docs upgrade => > v0.0.0-20220715033920-cbeb08187b3a > hugo: collected modules in 1832 ms > Start building sites … > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' > instead. > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page > not found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not > found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' > instead. 
> Built in 6415 ms > Error: Error building site: logged 6 error(s)
[jira] [Updated] (FLINK-28907) Flink docs do not compile
[ https://issues.apache.org/jira/browse/FLINK-28907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28907: Summary: Flink docs do not compile (was: Flink docs does not compile) > Flink docs do not compile > - > > Key: FLINK-28907 > URL: https://issues.apache.org/jira/browse/FLINK-28907 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Priority: Blocker > Fix For: 1.16.0 > > > Flink docs suddenly fail to compile in my local environment, without any new > change or rebase. The error is as below: > go: github.com/apache/flink-connector-elasticsearch/docs upgrade => > v0.0.0-20220715033920-cbeb08187b3a > hugo: collected modules in 1832 ms > Start building sites … > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not > found > ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' > instead. > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page > not found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/datastream/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not > found > ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref > "docs/connectors/table/elasticsearch": > "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found > WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. 
Use 'details' > instead. > Built in 6415 ms > Error: Error building site: logged 6 error(s) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28907) Flink docs does not compile
Zhu Zhu created FLINK-28907: --- Summary: Flink docs does not compile Key: FLINK-28907 URL: https://issues.apache.org/jira/browse/FLINK-28907 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.16.0 Reporter: Zhu Zhu Fix For: 1.16.0 Flink docs suddenly fail to compile in my local environment, without any new change or rebase. The error is as below: go: github.com/apache/flink-connector-elasticsearch/docs upgrade => v0.0.0-20220715033920-cbeb08187b3a hugo: collected modules in 1832 ms Start building sites … ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not found ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead. ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page not found ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/datastream/elasticsearch": "/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not found ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref "docs/connectors/table/elasticsearch": "/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead. Built in 6415 ms Error: Error building site: logged 6 error(s) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28873) Make "jobmanager.scheduler" visible in documentation
[ https://issues.apache.org/jira/browse/FLINK-28873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28873. --- Resolution: Fixed Fixed via 15f87d4a470e9bf29fd18874c26c4506ea57c09f > Make "jobmanager.scheduler" visible in documentation > > > Key: FLINK-28873 > URL: https://issues.apache.org/jira/browse/FLINK-28873 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Currently, the option {{jobmanager.scheduler}} is still excluded from > documentation. But in fact, this option is already used as a public interface > (this option needs to be configured by users when using AdaptiveScheduler and > AdaptiveBatchScheduler). > We should remove the {{ExcludeFromDocumentation}} to make it visible in the > documentation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28873) Make "jobmanager.scheduler" visible in documentation
[ https://issues.apache.org/jira/browse/FLINK-28873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28873: --- Assignee: Lijie Wang > Make "jobmanager.scheduler" visible in documentation > > > Key: FLINK-28873 > URL: https://issues.apache.org/jira/browse/FLINK-28873 > Project: Flink > Issue Type: Improvement > Components: Runtime / Configuration >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Currently, the option {{jobmanager.scheduler}} is still excluded from > documentation. But in fact, this option is already used as a public interface > (this option needs to be configured by users when using AdaptiveScheduler and > AdaptiveBatchScheduler). > We should remove the {{ExcludeFromDocumentation}} to make it visible in the > documentation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28324) [JUnit5 Migration] Module: flink-sql-client
[ https://issues.apache.org/jira/browse/FLINK-28324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28324: Priority: Major (was: Minor) > [JUnit5 Migration] Module: flink-sql-client > -- > > Key: FLINK-28324 > URL: https://issues.apache.org/jira/browse/FLINK-28324 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Reporter: zhouli >Assignee: zhouli >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] (FLINK-28324) [JUnit5 Migration] Module: flink-sql-client
[ https://issues.apache.org/jira/browse/FLINK-28324 ] Zhu Zhu deleted comment on FLINK-28324: - was (Author: flink-jira-bot): I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue is assigned but has not received an update in 30 days, so it has been labeled "stale-assigned". If you are still working on the issue, please remove the label and add a comment updating the community on your progress. If this issue is waiting on feedback, please consider this a reminder to the committer/reviewer. Flink is a very active project, and so we appreciate your patience. If you are no longer working on the issue, please unassign yourself so someone else may work on it. > [JUnit5 Migration] Module: flink-sql-client > -- > > Key: FLINK-28324 > URL: https://issues.apache.org/jira/browse/FLINK-28324 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Reporter: zhouli >Assignee: zhouli >Priority: Minor > Labels: pull-request-available, stale-assigned > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28324) [JUnit5 Migration] Module: flink-sql-client
[ https://issues.apache.org/jira/browse/FLINK-28324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28324: Labels: pull-request-available (was: pull-request-available stale-assigned) > [JUnit5 Migration] Module: flink-sql-client > -- > > Key: FLINK-28324 > URL: https://issues.apache.org/jira/browse/FLINK-28324 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Client >Reporter: zhouli >Assignee: zhouli >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28139) Add documentation for speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28139: --- Assignee: Zhu Zhu > Add documentation for speculative execution > --- > > Key: FLINK-28139 > URL: https://issues.apache.org/jira/browse/FLINK-28139 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-27710) Improve logs to better display Execution
[ https://issues.apache.org/jira/browse/FLINK-27710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-27710: Fix Version/s: (was: 1.16.0) > Improve logs to better display Execution > > > Key: FLINK-27710 > URL: https://issues.apache.org/jira/browse/FLINK-27710 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination, Runtime / Task >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available, stale-assigned > > Currently, an execution is usually represented as "{{job vertex name}} > ({{subtaskIndex+1}}/{{vertex parallelism}}) ({{attemptId}})" in > logs. With the change of FLINK-17295, this representation becomes redundant, e.g. > the subtask index is displayed twice. > Therefore, I'm proposing to change the format to "<job vertex name> > (<subtaskIndex>+1/<vertex parallelism>) > # (graph: <short ExecutionGraphID>, vertex: > <JobVertexID>)" and to avoid directly displaying the > {{ExecutionAttemptID}}. This can improve log readability. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-28769) Flink History Server show wrong name of batch jobs
[ https://issues.apache.org/jira/browse/FLINK-28769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575001#comment-17575001 ] Zhu Zhu commented on FLINK-28769: - After checking the code of {{ExecutionEnvironment#executeAsync(java.lang.String)}}, it looks to me that the specified name does take effect. I guess your batch job was run without an "output" param, so the job was created via {{counts.print()}} instead of {{env.execute(...)}}. This is a known limitation of DataSet, and it is unlikely that we will change the interface of {{counts.print()}} because {{DataSet}} will be deprecated soon. > Flink History Server show wrong name of batch jobs > -- > > Key: FLINK-28769 > URL: https://issues.apache.org/jira/browse/FLINK-28769 > Project: Flink > Issue Type: Bug > Components: API / DataSet >Reporter: Biao Geng >Priority: Minor > Attachments: image-2022-08-02-00-41-51-815.png > > > When running {{examples/batch/WordCount.jar}} using Flink 1.15 and 1.16 > together with the history server started, the history server shows the default > name (e.g. Flink Java Job at Tue Aug 02.. ) of the batch job instead of the > name ("WordCount Example") specified in the Java code. > But for {{examples/streaming/WordCount.jar}}, the job name in the history server > is correct. > It looks like > {{org.apache.flink.api.java.ExecutionEnvironment#executeAsync(java.lang.String)}} > does not set the job name the way > {{org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#execute(java.lang.String)}} > does (e.g. streamGraph.setJobName(jobName); ). > !image-2022-08-02-00-41-51-815.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-23174) Log improvement in Task throws Error
[ https://issues.apache.org/jira/browse/FLINK-23174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-23174. --- Resolution: Won't Fix Closed due to inactivity and no response. > Log improvement in Task throws Error > > > Key: FLINK-23174 > URL: https://issues.apache.org/jira/browse/FLINK-23174 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination, Runtime / Network >Affects Versions: 1.13.1 >Reporter: Bo Cui >Assignee: Bo Cui >Priority: Not a Priority > Labels: pull-request-available, stale-assigned > > We encountered channels closing due to network jitter, which caused tasks to fail. > We can only see which remote channel caused the task/job failure, > but we cannot see further details, such as which channel closed, the task stack... -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28771) Assign speculative execution attempt with correct CREATED timestamp
[ https://issues.apache.org/jira/browse/FLINK-28771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28771. --- Resolution: Fixed Fixed via e1a74df4427e99f4b0f3aaa4e8f4f5ff7cbd044e > Assign speculative execution attempt with correct CREATED timestamp > --- > > Key: FLINK-28771 > URL: https://issues.apache.org/jira/browse/FLINK-28771 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Currently, a newly created speculative execution attempt is assigned a > wrong CREATED timestamp in SpeculativeScheduler. We need to fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
[ https://issues.apache.org/jira/browse/FLINK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28759. --- Resolution: Done Done via 16b0cc1117d4bda11b89440e962646024b2ff6c5 > Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests > --- > > Key: FLINK-28759 > URL: https://issues.apache.org/jira/browse/FLINK-28759 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Tests >Reporter: Zhu Zhu >Assignee: JUNRUILi >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > To verify the correctness of speculative execution, we can enable it in > AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch > jobs and verify the results. > Note that we need to disable the blocklist (by setting block duration to 0) > in such single machine e2e tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
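The settings described in FLINK-28759 might look roughly like the sketch below, which only renders them as flink-conf.yaml style lines. The option keys are assumptions taken from memory of the Flink 1.16 documentation (later releases renamed them) and should be checked against the version in use; the class and method names are hypothetical. The block duration of 0 matters on a single machine presumably because blocking the only node would leave nowhere to deploy speculative attempts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch, not the actual e2e test code.
// All option keys below are assumptions based on the Flink 1.16 docs.
class SpeculativeE2eConfigSketch {

    // Settings for running speculative execution on a single machine.
    static Map<String, String> settings() {
        Map<String, String> conf = new LinkedHashMap<>();
        // Speculative execution builds on the adaptive batch scheduler.
        conf.put("jobmanager.scheduler", "AdaptiveBatch");
        conf.put("jobmanager.adaptive-batch-scheduler.speculative.enabled", "true");
        // A block duration of 0 disables the blocklist, as the ticket requires.
        conf.put("jobmanager.adaptive-batch-scheduler.speculative.block-slow-node-duration", "0 ms");
        return conf;
    }

    // Renders the settings as "key: value" lines, one per option.
    static String render(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(settings()));
    }
}
```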
[jira] [Closed] (FLINK-28589) Enhance Web UI for Speculative Execution
[ https://issues.apache.org/jira/browse/FLINK-28589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28589. --- Fix Version/s: 1.16.0 Resolution: Done Done via 4af75011fe32506a70076a2fe9e847cef39587bb 57990c332f3d87e4bcc1824973ac5ed2bcafec85 f6c5dc1b32ad6e6b524e549eff2b7d9d2b7d9970 > Enhance Web UI for Speculative Execution > > > Key: FLINK-28589 > URL: https://issues.apache.org/jira/browse/FLINK-28589 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Web Frontend >Affects Versions: 1.16.0 >Reporter: Gen Luo >Assignee: Junhan Yang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28771) Assign speculative execution attempt with correct CREATED timestamp
Zhu Zhu created FLINK-28771: --- Summary: Assign speculative execution attempt with correct CREATED timestamp Key: FLINK-28771 URL: https://issues.apache.org/jira/browse/FLINK-28771 Project: Flink Issue Type: Bug Components: Runtime / Coordination Affects Versions: 1.16.0 Reporter: Zhu Zhu Assignee: Zhu Zhu Fix For: 1.16.0 Currently, a newly created speculative execution attempt is assigned a wrong CREATED timestamp in SpeculativeScheduler. We need to fix it. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
[ https://issues.apache.org/jira/browse/FLINK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17573568#comment-17573568 ] Zhu Zhu commented on FLINK-28759: - [~JunRuiLi] I have assigned you the ticket. Go ahead and open a PR for it. > Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests > --- > > Key: FLINK-28759 > URL: https://issues.apache.org/jira/browse/FLINK-28759 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Tests >Reporter: Zhu Zhu >Assignee: JUNRUILi >Priority: Major > Fix For: 1.16.0 > > > To verify the correctness of speculative execution, we can enable it in > AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch > jobs and verify the results. > Note that we need to disable the blocklist (by setting block duration to 0) > in such single machine e2e tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
[ https://issues.apache.org/jira/browse/FLINK-28759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28759: --- Assignee: JUNRUILi > Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests > --- > > Key: FLINK-28759 > URL: https://issues.apache.org/jira/browse/FLINK-28759 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Tests >Reporter: Zhu Zhu >Assignee: JUNRUILi >Priority: Major > Fix For: 1.16.0 > > > To verify the correctness of speculative execution, we can enable it in > AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch > jobs and verify the results. > Note that we need to disable the blocklist (by setting block duration to 0) > in such single machine e2e tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28759) Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests
Zhu Zhu created FLINK-28759: --- Summary: Enable speculative execution for in AdaptiveBatchScheduler TPC-DS e2e tests Key: FLINK-28759 URL: https://issues.apache.org/jira/browse/FLINK-28759 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination, Tests Reporter: Zhu Zhu Fix For: 1.16.0 To verify the correctness of speculative execution, we can enable it in AdaptiveBatchScheduler TPC-DS e2e tests, which run a lot of different batch jobs and verify the results. Note that we need to disable the blocklist (by setting block duration to 0) in such single machine e2e tests. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28696) Notify all newlyAdded/Merged blocked nodes to BlocklistListener
[ https://issues.apache.org/jira/browse/FLINK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28696. --- Resolution: Fixed Fixed via bbaeb628f48a4bc4c324bfc4afd06ddf34f546f0 > Notify all newlyAdded/Merged blocked nodes to BlocklistListener > --- > > Key: FLINK-28696 > URL: https://issues.apache.org/jira/browse/FLINK-28696 > Project: Flink > Issue Type: Sub-task >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > This bug was introduced by FLINK-28660. As a result of our newly added logic, the > blocklist listener is not notified when there are no newly added nodes > (only merged nodes). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
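The behavior requested in FLINK-28696 can be illustrated with a small, self-contained sketch. It is not Flink's implementation: the `Listener` interface, the class name, and the use of plain strings for nodes are hypothetical stand-ins. The point is only that the notification should fire when either the newly added or the merged set is non-empty, rather than only when new nodes were added.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of the notification rule, not the actual Flink classes.
class BlocklistNotifySketch {

    // Stand-in for a blocklist listener; the name is an assumption.
    interface Listener {
        void notifyNewBlockedNodes(Collection<String> nodes);
    }

    // Notifies listeners about every node that was newly added or merged.
    // Listeners are skipped only when nothing changed at all.
    static void notifyListeners(
            Collection<String> newlyAdded,
            Collection<String> merged,
            List<Listener> listeners) {
        List<String> changed = new ArrayList<>(newlyAdded);
        changed.addAll(merged);
        if (changed.isEmpty()) {
            return;
        }
        for (Listener listener : listeners) {
            listener.notifyNewBlockedNodes(changed);
        }
    }
}
```

A check on `newlyAdded.isEmpty()` alone would reproduce the reported bug: an update that only merges existing nodes would never reach the listeners.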
[jira] [Assigned] (FLINK-28696) Notify all newlyAdded/Merged blocked nodes to BlocklistListener
[ https://issues.apache.org/jira/browse/FLINK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28696: --- Assignee: Lijie Wang > Notify all newlyAdded/Merged blocked nodes to BlocklistListener > --- > > Key: FLINK-28696 > URL: https://issues.apache.org/jira/browse/FLINK-28696 > Project: Flink > Issue Type: Sub-task >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > This bug was introduced by FLINK-28660. As a result of our newly added logic, the > blocklist listener is not notified when there are no newly added nodes > (only merged nodes). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28610) Enable speculative execution of sources
[ https://issues.apache.org/jira/browse/FLINK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28610. --- Resolution: Done Done via 9192446847b6fb29beb8d36d49d6900de1e61685 > Enable speculative execution of sources > --- > > Key: FLINK-28610 > URL: https://issues.apache.org/jira/browse/FLINK-28610 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Currently speculative execution of sources is disabled. It can be enabled > with the improvement done to support InputFormat sources and new sources to > work correctly with speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28660) Simplify logs of blocklist
[ https://issues.apache.org/jira/browse/FLINK-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28660. --- Resolution: Fixed Fixed via 64ad6709f412e80c5e48d24127e7a558bed99e8a > Simplify logs of blocklist > -- > > Key: FLINK-28660 > URL: https://issues.apache.org/jira/browse/FLINK-28660 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28585) Speculative execution for InputFormat sources
[ https://issues.apache.org/jira/browse/FLINK-28585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28585. --- Resolution: Done Done via 7611928d0f1a7bb20ec5b0538e0fbe9102a07023 > Speculative execution for InputFormat sources > - > > Key: FLINK-28585 > URL: https://issues.apache.org/jira/browse/FLINK-28585 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > This task enables InputFormat sources for speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28660) Simplify logs of blocklist
[ https://issues.apache.org/jira/browse/FLINK-28660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28660: --- Assignee: Lijie Wang > Simplify logs of blocklist > -- > > Key: FLINK-28660 > URL: https://issues.apache.org/jira/browse/FLINK-28660 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28640) Let BlocklistDeclarativeSlotPool accept duplicate slot offers
[ https://issues.apache.org/jira/browse/FLINK-28640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28640. --- Resolution: Done Done via f6a22eaf99d4ba2ef03445bae54a0da7c39c4d1a > Let BlocklistDeclarativeSlotPool accept duplicate slot offers > - > > Key: FLINK-28640 > URL: https://issues.apache.org/jira/browse/FLINK-28640 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > BlocklistSlotPool should accept a duplicate (already accepted) slot offer, > even if it is from a currently blocked task manager -- This message was sent by Atlassian Jira (v8.20.10#820010)
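The rule stated in FLINK-28640 can be sketched as follows. This is a simplified model under stated assumptions, not the real BlocklistDeclarativeSlotPool: slots and task managers are plain string IDs, and the class and method names are hypothetical. A slot that was already accepted is acknowledged again even if its task manager has since been blocked, while new slots from a blocked task manager are rejected.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a slot pool with a blocklist.
class SlotOfferSketch {
    private final Set<String> acceptedSlots = new HashSet<>();
    private final Set<String> blockedTaskManagers = new HashSet<>();

    // Marks a task manager as blocked.
    void block(String taskManagerId) {
        blockedTaskManagers.add(taskManagerId);
    }

    // Returns true if the offer is accepted.
    boolean offerSlot(String slotId, String taskManagerId) {
        // Duplicate offer: the slot is already ours, so accept it again
        // regardless of whether the task manager is currently blocked.
        if (acceptedSlots.contains(slotId)) {
            return true;
        }
        // New slots from blocked task managers are rejected.
        if (blockedTaskManagers.contains(taskManagerId)) {
            return false;
        }
        acceptedSlots.add(slotId);
        return true;
    }
}
```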
[jira] [Assigned] (FLINK-28610) Enable speculative execution of sources
[ https://issues.apache.org/jira/browse/FLINK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28610: --- Assignee: Zhu Zhu > Enable speculative execution of sources > --- > > Key: FLINK-28610 > URL: https://issues.apache.org/jira/browse/FLINK-28610 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Fix For: 1.16.0 > > > Currently speculative execution of sources is disabled. It can be enabled > with the improvement done to support InputFormat sources and new sources to > work correctly with speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28146) Sync blocklist information between JobMaster & ResourceManager
[ https://issues.apache.org/jira/browse/FLINK-28146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28146. --- Fix Version/s: 1.16.0 Resolution: Done Done via 7b05a1b4c9a4ae664fb6b7c4bb85fb3ea6281505 > Sync blocklist information between JobMaster & ResourceManager > -- > > Key: FLINK-28146 > URL: https://issues.apache.org/jira/browse/FLINK-28146 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > The newly added/updated blocked nodes should be synchronized between JM and > RM. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28640) Let BlocklistDeclarativeSlotPool accept duplicate slot offers
[ https://issues.apache.org/jira/browse/FLINK-28640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28640: --- Assignee: Lijie Wang > Let BlocklistDeclarativeSlotPool accept duplicate slot offers > - > > Key: FLINK-28640 > URL: https://issues.apache.org/jira/browse/FLINK-28640 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > BlocklistSlotPool should accept a duplicate (already accepted) slot offer, > even if it is from a currently blocked task manager -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28586) Speculative execution for new sources
[ https://issues.apache.org/jira/browse/FLINK-28586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28586. --- Resolution: Done Done via 79d93f2512f6826baefb14c8dc9b59d419d7df0a 9af271f3108ce8af6b6972fabf5420b99e55fc71 bedcc3f7b5c0fc184953d3c1a969f03887db2cae 7129c2ee09ce7eb3959ce88383b5d8ea0987fcf5 863222e926df26fde4caa470c58b261174181719 > Speculative execution for new sources > - > > Key: FLINK-28586 > URL: https://issues.apache.org/jira/browse/FLINK-28586 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > This task enables new sources(FLIP-27) for speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28138) Add metrics for speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28138. --- Resolution: Done Done via 2173b45b570de8ff1507b8e29a884e2449ffea62 19b0a95c30afd9ed65252c49fb00cef882412553 > Add metrics for speculative execution > - > > Key: FLINK-28138 > URL: https://issues.apache.org/jira/browse/FLINK-28138 > Project: Flink > Issue Type: Sub-task > Components: Documentation >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > The following two metrics will be added to expose job problems and show the > effectiveness of speculative execution: > # numSlowExecutionVertices: Number of slow execution vertices at the > moment. > # numEffectiveSpeculativeExecutions: Number of speculative executions > which finish before their corresponding original executions finish. -- This message was sent by Atlassian Jira (v8.20.10#820010)
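To make the two metric definitions from FLINK-28138 concrete, here is a minimal bookkeeping sketch. It assumes a simplified model in which the scheduler reports when a vertex is detected as slow and when a vertex finishes, along with whether the finishing attempt was the original one. The class and method names are hypothetical and do not reflect how Flink registers these metrics.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical bookkeeping for the two metrics; not Flink's metric code.
class SpeculationMetricsSketch {
    private final Set<String> slowVertices = new HashSet<>();
    private int numEffectiveSpeculativeExecutions;

    // Called when the slow task detector flags a vertex.
    void markSlow(String vertexId) {
        slowVertices.add(vertexId);
    }

    // Called when the first attempt of a vertex finishes.
    void onVertexFinished(String vertexId, boolean finishedAttemptIsOriginal) {
        // A finished vertex is no longer slow "at the moment".
        slowVertices.remove(vertexId);
        // A speculative attempt that beat the original counts as effective.
        if (!finishedAttemptIsOriginal) {
            numEffectiveSpeculativeExecutions++;
        }
    }

    int getNumSlowExecutionVertices() {
        return slowVertices.size();
    }

    int getNumEffectiveSpeculativeExecutions() {
        return numEffectiveSpeculativeExecutions;
    }
}
```

The first value behaves like a gauge that can go up and down, while the second only ever increases over the lifetime of a job.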
[jira] [Closed] (FLINK-28145) Let ResourceManager support blocklist mechanism
[ https://issues.apache.org/jira/browse/FLINK-28145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28145. --- Fix Version/s: 1.16.0 Resolution: Done Done via 6f7455b078ba89302b8c05e9d39d3a1ca114700c 9815caad271a561640ffe0df7193c04270d53a25 2e5cac1f31aa571276df20e24889994672692a89 > Let ResourceManager support blocklist mechanism > --- > > Key: FLINK-28145 > URL: https://issues.apache.org/jira/browse/FLINK-28145 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Affects Versions: 1.16.0 >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Let ResourceManager support blocklist mechanism: > 1. SlotManager should filter out blocked resources when allocating registered > resources. > 2. ResourceManagerDriver should avoid allocating task managers from blocked > nodes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
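The first point of FLINK-28145, filtering out blocked resources during allocation, can be sketched as below. This is an assumption-laden illustration rather than SlotManager code: task managers and nodes are plain string IDs, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of blocklist-aware resource filtering.
class BlockedResourceFilterSketch {

    // Returns the task managers whose hosting node is not blocked.
    // The map goes from task manager ID to the ID of the node hosting it.
    static List<String> allocatable(
            Map<String, String> taskManagerToNode, Set<String> blockedNodes) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : taskManagerToNode.entrySet()) {
            if (!blockedNodes.contains(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The second point (not requesting new task managers on blocked nodes) depends on the resource provider and is not covered by this sketch.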
[jira] [Assigned] (FLINK-28585) Speculative execution for InputFormat sources
[ https://issues.apache.org/jira/browse/FLINK-28585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28585: --- Assignee: Zhu Zhu > Speculative execution for InputFormat sources > - > > Key: FLINK-28585 > URL: https://issues.apache.org/jira/browse/FLINK-28585 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > This task enables InputFormat sources for speculative execution. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28588) Enhance REST API for Speculative Execution
[ https://issues.apache.org/jira/browse/FLINK-28588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu reassigned FLINK-28588: --- Assignee: Gen Luo > Enhance REST API for Speculative Execution > -- > > Key: FLINK-28588 > URL: https://issues.apache.org/jira/browse/FLINK-28588 > Project: Flink > Issue Type: Sub-task > Components: Runtime / REST >Affects Versions: 1.16.0 >Reporter: Gen Luo >Assignee: Gen Luo >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-28547) Add IT cases for SpeculativeScheduler
[ https://issues.apache.org/jira/browse/FLINK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu updated FLINK-28547: Component/s: Tests > Add IT cases for SpeculativeScheduler > - > > Key: FLINK-28547 > URL: https://issues.apache.org/jira/browse/FLINK-28547 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination, Tests >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Add IT cases for SpeculativeScheduler. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28547) Add IT cases for SpeculativeScheduler
[ https://issues.apache.org/jira/browse/FLINK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28547. --- Resolution: Done Done via fd763672b858e74b24760e5c98ff9af22caa8a14 > Add IT cases for SpeculativeScheduler > - > > Key: FLINK-28547 > URL: https://issues.apache.org/jira/browse/FLINK-28547 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Lijie Wang >Assignee: Lijie Wang >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Add IT cases for SpeculativeScheduler. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-28612) Cancel pending slot allocation after canceling executions
[ https://issues.apache.org/jira/browse/FLINK-28612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhu Zhu closed FLINK-28612. --- Resolution: Fixed Fixed via 3278995372d1ea27b6fd86806e9a860a644694c7 > Cancel pending slot allocation after canceling executions > - > > Key: FLINK-28612 > URL: https://issues.apache.org/jira/browse/FLINK-28612 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Coordination >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0 > > > Canceling pending slot allocation before canceling executions will result in > execution failures and pollute the logs. It can also cause an execution > to become FAILED even though the execution vertex has FINISHED, which breaks the > assumption of SpeculativeScheduler#isExecutionVertexPossibleToFinish(). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-28612) Cancel pending slot allocation after canceling executions
Zhu Zhu created FLINK-28612: --- Summary: Cancel pending slot allocation after canceling executions Key: FLINK-28612 URL: https://issues.apache.org/jira/browse/FLINK-28612 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Zhu Zhu Fix For: 1.16.0 Canceling pending slot allocation before canceling executions will result in execution failures and pollute the logs. It can also cause an execution to become FAILED even though the execution vertex has FINISHED, which breaks the assumption of SpeculativeScheduler#isExecutionVertexPossibleToFinish(). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-28612) Cancel pending slot allocation after canceling executions
[ https://issues.apache.org/jira/browse/FLINK-28612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reassigned FLINK-28612:
------------------------------
Assignee: Zhu Zhu

> Cancel pending slot allocation after canceling executions
> ---------------------------------------------------------
>
> Key: FLINK-28612
> URL: https://issues.apache.org/jira/browse/FLINK-28612
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
> Canceling pending slot allocation before canceling executions will result in
> execution failures and pollute the logs. It will also cause an execution to
> become FAILED even if the execution vertex has FINISHED, which breaks the
> assumption of SpeculativeScheduler#isExecutionVertexPossibleToFinish().
[jira] [Created] (FLINK-28610) Enable speculative execution of sources
Zhu Zhu created FLINK-28610:
----------------------------

Summary: Enable speculative execution of sources
Key: FLINK-28610
URL: https://issues.apache.org/jira/browse/FLINK-28610
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Reporter: Zhu Zhu
Fix For: 1.16.0

Currently, speculative execution of sources is disabled. It can be enabled
with the improvements that allow InputFormat sources and new sources to work
correctly with speculative execution.
[jira] [Assigned] (FLINK-28586) Speculative execution for new sources
[ https://issues.apache.org/jira/browse/FLINK-28586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reassigned FLINK-28586:
------------------------------
Assignee: Zhu Zhu

> Speculative execution for new sources
> -------------------------------------
>
> Key: FLINK-28586
> URL: https://issues.apache.org/jira/browse/FLINK-28586
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Common, Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
> This task enables new sources (FLIP-27) for speculative execution.
[jira] [Created] (FLINK-28586) Speculative execution for new sources
Zhu Zhu created FLINK-28586:
----------------------------

Summary: Speculative execution for new sources
Key: FLINK-28586
URL: https://issues.apache.org/jira/browse/FLINK-28586
Project: Flink
Issue Type: Sub-task
Components: Connectors / Common, Runtime / Coordination
Reporter: Zhu Zhu
Fix For: 1.16.0

This task enables new sources (FLIP-27) for speculative execution.
[jira] [Created] (FLINK-28585) Speculative execution for InputFormat sources
Zhu Zhu created FLINK-28585:
----------------------------

Summary: Speculative execution for InputFormat sources
Key: FLINK-28585
URL: https://issues.apache.org/jira/browse/FLINK-28585
Project: Flink
Issue Type: Sub-task
Components: Runtime / Coordination
Reporter: Zhu Zhu
Fix For: 1.16.0

This task enables InputFormat sources for speculative execution.
[jira] [Assigned] (FLINK-28547) Add IT cases for SpeculativeScheduler
[ https://issues.apache.org/jira/browse/FLINK-28547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reassigned FLINK-28547:
------------------------------
Assignee: Lijie Wang

> Add IT cases for SpeculativeScheduler
> -------------------------------------
>
> Key: FLINK-28547
> URL: https://issues.apache.org/jira/browse/FLINK-28547
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Lijie Wang
> Assignee: Lijie Wang
> Priority: Major
> Fix For: 1.16.0
>
> Add IT cases for SpeculativeScheduler.
[jira] [Closed] (FLINK-28137) Introduce SpeculativeScheduler
[ https://issues.apache.org/jira/browse/FLINK-28137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28137.
---------------------------
Resolution: Done

Done via
81c739ae462412e531216bb46bc567fce2355dd8
265612c2cf93a589d87d7fc8ca168bc19d838885

> Introduce SpeculativeScheduler
> ------------------------------
>
> Key: FLINK-28137
> URL: https://issues.apache.org/jira/browse/FLINK-28137
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.16.0
>
> A SpeculativeScheduler will be used if speculative execution is enabled. It
> extends AdaptiveBatchScheduler so that speculative execution can work along
> with the feature that adaptively tunes parallelism for batch jobs.
> The major differences of SpeculativeScheduler are:
> * SpeculativeScheduler needs to be able to directly deploy an Execution,
> while AdaptiveBatchScheduler can only perform ExecutionVertex-level
> deployment.
> * SpeculativeScheduler does not restart the ExecutionVertex if an execution
> fails while any other current execution is still making progress.
> * SpeculativeScheduler listens for slow tasks. Once there are slow tasks, it
> will block the slow nodes and deploy speculative executions of the slow tasks
> on other nodes.
> * Once any execution finishes, SpeculativeScheduler will cancel all the
> remaining executions of the same execution vertex.
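The attempt-handling rules in this description (no vertex restart while another attempt is still running, and cancellation of the remaining attempts once one finishes) can be sketched as a small state model. This is an illustration only, not the SpeculativeScheduler implementation; `VertexAttempts` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model (not Flink code) of one execution vertex with concurrent attempts. */
final class VertexAttempts {
    enum State { RUNNING, FINISHED, CANCELED, FAILED }

    private final List<State> attempts = new ArrayList<>();

    /** Starts a new attempt (original or speculative) and returns its index. */
    int startAttempt() {
        attempts.add(State.RUNNING);
        return attempts.size() - 1;
    }

    /** The first attempt to finish wins; every other running attempt is canceled. */
    void onFinished(int attempt) {
        if (attempts.get(attempt) != State.RUNNING) {
            return; // ignore reports from attempts that are no longer running
        }
        attempts.set(attempt, State.FINISHED);
        for (int i = 0; i < attempts.size(); i++) {
            if (attempts.get(i) == State.RUNNING) {
                attempts.set(i, State.CANCELED);
            }
        }
    }

    /**
     * Marks the attempt as failed. Returns true only if the vertex needs a
     * restart, that is, no other attempt is running or has finished.
     */
    boolean onFailed(int attempt) {
        if (attempts.get(attempt) != State.RUNNING) {
            return false;
        }
        attempts.set(attempt, State.FAILED);
        return !attempts.contains(State.RUNNING) && !attempts.contains(State.FINISHED);
    }

    State stateOf(int attempt) {
        return attempts.get(attempt);
    }
}
```

With two attempts started, a failure of the first does not request a restart while the second is still running, and a finish of either attempt cancels whatever else is still running.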
[jira] [Assigned] (FLINK-28138) Add metrics for speculative execution
[ https://issues.apache.org/jira/browse/FLINK-28138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reassigned FLINK-28138:
------------------------------
Assignee: Zhu Zhu

> Add metrics for speculative execution
> -------------------------------------
>
> Key: FLINK-28138
> URL: https://issues.apache.org/jira/browse/FLINK-28138
> Project: Flink
> Issue Type: Sub-task
> Components: Documentation
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Fix For: 1.16.0
>
> The following two metrics will be added to expose job problems and show the
> effectiveness of speculative execution:
> # *numSlowExecutionVertices*: Number of slow execution vertices at the
> moment.
> # *numEffectiveSpeculativeExecutions*: Number of speculative executions
> which finish before their corresponding original executions finish.
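One way to picture how the two metrics could be maintained is the bookkeeping sketch below. It is an assumption-laden illustration rather than Flink's metric code: the class `SpeculativeExecutionMetrics` and its callback methods are hypothetical, and only the two metric names come from the issue.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical bookkeeping (not Flink code) for the two proposed metrics. */
final class SpeculativeExecutionMetrics {
    private final Set<String> slowVertices = new HashSet<>();
    private long numEffectiveSpeculativeExecutions;

    /** Replaces the current view of slow vertices with the latest detection result. */
    void onSlowVerticesDetected(Set<String> currentlySlow) {
        slowVertices.clear();
        slowVertices.addAll(currentlySlow);
    }

    /**
     * Called when a vertex finishes. The flag is true when a speculative
     * attempt finished before the original attempt did.
     */
    void onVertexFinished(String vertexId, boolean finishedBySpeculativeAttempt) {
        slowVertices.remove(vertexId);
        if (finishedBySpeculativeAttempt) {
            numEffectiveSpeculativeExecutions++;
        }
    }

    /** Gauge: number of slow execution vertices at the moment. */
    int getNumSlowExecutionVertices() {
        return slowVertices.size();
    }

    /** Counter: speculative executions that beat their original execution. */
    long getNumEffectiveSpeculativeExecutions() {
        return numEffectiveSpeculativeExecutions;
    }
}
```

The first value behaves like a gauge that can go up and down, while the second only ever increases, which matches the "at the moment" wording of the first metric and the cumulative wording of the second.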
[jira] [Closed] (FLINK-28402) Create FailureHandlingResultSnapshot with the truly failed execution
[ https://issues.apache.org/jira/browse/FLINK-28402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28402.
---------------------------
Resolution: Done

Done via edd10a1d8fd09f8a5e296ecdf0201945d77d5ff2

> Create FailureHandlingResultSnapshot with the truly failed execution
> --------------------------------------------------------------------
>
> Key: FLINK-28402
> URL: https://issues.apache.org/jira/browse/FLINK-28402
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.16.0
>
> Previously, FailureHandlingResultSnapshot was always created to treat the
> only current attempt of an execution vertex as the failed execution. This is
> no longer correct in speculative execution cases, in which an execution
> vertex can have multiple current executions, and any of them may fail.
[jira] [Closed] (FLINK-28144) Let JobMaster support blocklist mechanism
[ https://issues.apache.org/jira/browse/FLINK-28144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28144.
---------------------------
Fix Version/s: 1.16.0
Resolution: Done

Done via:
f2f83e1956eccecaa2371b21bddaf7778bb4f819
04f2f0c2660b312449419a3acb58a46a38d84f64
72ea8b5999bf36125aa5f1a38df4ec52c7a95702
387b2a473d0c0a8d58d1ca0401894dffc0527b31

> Let JobMaster support blocklist mechanism
> -----------------------------------------
>
> Key: FLINK-28144
> URL: https://issues.apache.org/jira/browse/FLINK-28144
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.16.0
> Reporter: Lijie Wang
> Assignee: Lijie Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.16.0
>
> SlotPool should avoid allocating slots that are located on blocked nodes. To
> do that, our core idea is to keep the SlotPool in such a state: there is no
> slot in SlotPool that is free (no task assigned) and located on blocked nodes.
> Details are as follows:
> 1. When receiving slot offers from task managers located on blocked nodes,
> all offers should be rejected.
> 2. When a node is newly blocked, we should release all free (no task
> assigned) slots on it. We need to find all task managers on blocked nodes and
> release all free slots on them by SlotPoolService#releaseFreeSlotsOnTaskManager.
> 3. When a slot state changes from reserved (task assigned) to free (no task
> assigned), the SlotPool will check whether the corresponding task manager is
> blocked. If yes, it releases the slot.
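The three rules above all serve one invariant: the pool never holds a free slot on a blocked node. A minimal sketch of that invariant follows. It is not the actual SlotPool code; `BlocklistAwareSlotPool` and its methods are hypothetical, and for brevity it blocks individual task managers instead of mapping nodes to task managers.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model (not Flink code) of a slot pool that honors a blocklist. */
final class BlocklistAwareSlotPool {
    private final Set<String> blockedTaskManagers = new HashSet<>();
    // Both maps go from slot id to the id of the owning task manager.
    private final Map<String, String> freeSlots = new HashMap<>();
    private final Map<String, String> reservedSlots = new HashMap<>();

    /** Rule 1: offers from blocked task managers are rejected. */
    boolean offerSlot(String slotId, String taskManagerId) {
        if (blockedTaskManagers.contains(taskManagerId)) {
            return false;
        }
        freeSlots.put(slotId, taskManagerId);
        return true;
    }

    /** Rule 2: newly blocking a task manager releases all of its free slots. */
    void blockTaskManager(String taskManagerId) {
        blockedTaskManagers.add(taskManagerId);
        freeSlots.values().removeIf(taskManagerId::equals);
    }

    /** Assigns a task to a free slot; returns false if the slot is not free. */
    boolean reserve(String slotId) {
        String taskManagerId = freeSlots.remove(slotId);
        if (taskManagerId == null) {
            return false;
        }
        reservedSlots.put(slotId, taskManagerId);
        return true;
    }

    /** Rule 3: a slot that becomes free on a blocked task manager is released. */
    void freeReservedSlot(String slotId) {
        String taskManagerId = reservedSlots.remove(slotId);
        if (taskManagerId != null && !blockedTaskManagers.contains(taskManagerId)) {
            freeSlots.put(slotId, taskManagerId);
        }
    }

    int freeSlotCount() {
        return freeSlots.size();
    }
}
```

Reserved slots on a blocked task manager are left alone in this sketch, so a running task keeps its slot and the slot is dropped only when it would otherwise become free.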
[jira] [Assigned] (FLINK-28402) Create FailureHandlingResultSnapshot with the truly failed execution
[ https://issues.apache.org/jira/browse/FLINK-28402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu reassigned FLINK-28402:
------------------------------
Assignee: Zhu Zhu

> Create FailureHandlingResultSnapshot with the truly failed execution
> --------------------------------------------------------------------
>
> Key: FLINK-28402
> URL: https://issues.apache.org/jira/browse/FLINK-28402
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Zhu Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.16.0
>
> Previously, FailureHandlingResultSnapshot was always created to treat the
> only current attempt of an execution vertex as the failed execution. This is
> no longer correct in speculative execution cases, in which an execution
> vertex can have multiple current executions, and any of them may fail.
[jira] [Closed] (FLINK-28392) RemoveCachedShuffleDescriptorTest#testRemoveOffloadedCacheForPointwiseEdgeAfterFailover causes fatal error on CI
[ https://issues.apache.org/jira/browse/FLINK-28392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-28392.
---------------------------
Resolution: Fixed

Fixed via e5c4e3f519f364b5951e7cac331eb8af48f0ed84

> RemoveCachedShuffleDescriptorTest#testRemoveOffloadedCacheForPointwiseEdgeAfterFailover causes fatal error on CI
> ----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-28392
> URL: https://issues.apache.org/jira/browse/FLINK-28392
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.16.0
> Reporter: Martijn Visser
> Assignee: Zhu Zhu
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.16.0
>
> {code:java}
> Jul 05 03:30:03 [ERROR] Error occurred in starting fork, check output in log
> Jul 05 03:30:03 [ERROR] Process Exit Code: 239
> Jul 05 03:30:03 [ERROR] Crashed tests:
> Jul 05 03:30:03 [ERROR] org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategyTest
> Jul 05 03:30:03 [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
> Jul 05 03:30:03 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-runtime && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -XX:+UseG1GC -Xms256m -Xmx768m -jar /__w/1/s/flink-runtime/target/surefire/surefirebooter4932865857415988980.jar /__w/1/s/flink-runtime/target/surefire 2022-07-05T03-23-25_404-jvmRun1 surefire8916732512419442726tmp surefire_2130262314165063415tmp
> Jul 05 03:30:03 [ERROR] Error occurred in starting fork, check output in log
> Jul 05 03:30:03 [ERROR] Process Exit Code: 239
> Jul 05 03:30:03 [ERROR] Crashed tests:
> Jul 05 03:30:03 [ERROR] org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategyTest
> Jul 05 03:30:03 [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
> Jul 05 03:30:03 [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:405)
> Jul 05 03:30:03 [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:321)
> Jul 05 03:30:03 [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
> Jul 05 03:30:03 [ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
> Jul 05 03:30:03 [ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37602=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=7b25afdf-cc6c-566f-5459-359dc2585798=8147