[
https://issues.apache.org/jira/browse/FLINK-28980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598422#comment-17598422
]
Biao Liu commented on FLINK-28980:
----------------------------------
I've tested the scenario, and it looks good to me.
I started two TMs on different machines, each with two slots. I used a
hostname check based on "InetAddress.getLocalHost().getHostName()" to make one
subtask much slower than the others (the operator has three subtasks). I set
"slow-task-detector.execution-time.baseline-ratio" to 0.5. The speculative
task was launched as expected. I checked the web UI, metrics, logs and produced
result, and everything works fine. Screenshots and log files are in the
attachments.
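
For anyone repeating this test, here is a rough sketch of the configuration involved, written as a small Java program that prints flink-conf.yaml style lines. Only the baseline-ratio key and its value 0.5 come from the comment above. The two keys that switch the feature on are quoted from memory of the 1.16 documentation and should be checked against the speculative execution page linked in the ticket. The class name is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Flink: collects the options used in the test. */
final class SpeculativeConfigSketch {

    private SpeculativeConfigSketch() {}

    /** Options in the order they would appear in the configuration file. */
    static Map<String, String> options() {
        Map<String, String> opts = new LinkedHashMap<>();
        // Assumption: keys as remembered from the Flink 1.16 docs; verify before use.
        opts.put("jobmanager.scheduler", "AdaptiveBatch");
        opts.put("jobmanager.adaptive-batch-scheduler.speculative.enabled", "true");
        // From the comment above: the ratio used in this test run.
        opts.put("slow-task-detector.execution-time.baseline-ratio", "0.5");
        return opts;
    }

    /** Renders the options as "key: value" lines. */
    static String render(Map<String, String> opts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : opts.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(options()));
    }
}
```

The remaining slow-task-detector options were left at their defaults in this sketch, and the job is assumed to run in batch mode.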
> Release Testing: Verify FLIP-168 speculative execution
> ------------------------------------------------------
>
> Key: FLINK-28980
> URL: https://issues.apache.org/jira/browse/FLINK-28980
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zhu Zhu
> Assignee: Biao Liu
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.16.0
>
>
> Speculative execution is introduced in Flink 1.16 to deal with temporary slow
> tasks caused by slow nodes. This feature currently consists of 4 FLIPs:
> - FLIP-168: Speculative Execution core part
> - FLIP-224: Blocklist Mechanism
> - FLIP-245: Source Supports Speculative Execution
> - FLIP-249: Flink Web UI Enhancement for Speculative Execution
> This ticket aims to verify FLIP-168, along with FLIP-224 and FLIP-249.
> More details about this feature and how to use it can be found in this
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/speculative_execution/].
> To do the verification, the process can be:
> - Write a Flink job which has a subtask running much slower than others
> (e.g. sleep indefinitely if it runs on a certain host, the hostname can be
> retrieved via InetAddress.getLocalHost().getHostName(), or if its
> (subtaskIndex + attemptNumber) % 2 == 0)
> - Modify Flink configuration file to enable speculative execution and tune
> the configuration as you like
> - Submit the job, then check the web UI, logs, metrics and produced result.
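
The two "slow subtask" conditions suggested in the first step can be sketched as plain Java predicates. This is only an illustration under assumptions: the class and method names are hypothetical, and "slow-host" is a placeholder. In a real job the subtask index and attempt number would presumably be read from the operator's runtime context (for example getIndexOfThisSubtask() and getAttemptNumber() in a rich function), and the task would sleep when a predicate returns true.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper, not part of Flink: decides whether an attempt should be slow. */
final class SlowSubtaskPredicate {

    private SlowSubtaskPredicate() {}

    /** True if the current host is the one chosen to be slow (case-insensitive). */
    static boolean isSlowHost(String currentHost, String slowHost) {
        return currentHost != null && currentHost.equalsIgnoreCase(slowHost);
    }

    /** True when (subtaskIndex + attemptNumber) is even, as suggested in the ticket. */
    static boolean isSlowAttempt(int subtaskIndex, int attemptNumber) {
        return (subtaskIndex + attemptNumber) % 2 == 0;
    }

    /** Resolves the local host name the way the ticket describes. */
    static String localHostName() throws UnknownHostException {
        return InetAddress.getLocalHost().getHostName();
    }

    public static void main(String[] args) throws Exception {
        // "slow-host" is a placeholder; pass the real host name as the first argument.
        String slowHost = args.length > 0 ? args[0] : "slow-host";
        String host = localHostName();
        System.out.println("host=" + host + " slow=" + isSlowHost(host, slowHost));
        for (int attempt = 0; attempt < 2; attempt++) {
            System.out.println("subtask 0, attempt " + attempt
                    + " slow=" + isSlowAttempt(0, attempt));
        }
    }
}
```

With the attempt-based rule, an even-indexed subtask is slow on attempt 0 and fast on attempt 1, so a speculative attempt can overtake the original, assuming it is assigned the next attempt number. The hostname rule instead relies on the speculative attempt being scheduled on a different machine.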
--
This message was sent by Atlassian Jira
(v8.20.10#820010)