[
https://issues.apache.org/jira/browse/FLINK-32779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuxin Tan updated FLINK-32779:
------------------------------
Description:
This ticket verifies https://issues.apache.org/jira/browse/FLINK-31634.
The verification has two parts.
Part 1. Run without remote storage.
This part verifies that the new mode uses the Memory tier and Disk tier
dynamically when shuffling.
Set the shuffle mode to the new hybrid shuffle mode
(execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_SELECTIVE) and run a
simple job, for example TPC-DS q1.sql. When enough resources are available,
the upstream and downstream tasks should run at the same time.
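As a rough illustration of the Part 1 setting, the sketch below renders the
option as a flink-conf.yaml style "key: value" line. The helper name
render_flink_conf is hypothetical and not part of Flink; only the option key
and value are taken from this ticket.

```python
def render_flink_conf(options):
    """Render a dict of options as 'key: value' lines (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in options.items())


# Option key and value as given in Part 1 of this ticket.
part1_options = {
    "execution.batch-shuffle-mode": "ALL_EXCHANGES_HYBRID_SELECTIVE",
}

print(render_flink_conf(part1_options))
```

The printed line can be added to the configuration of the test cluster before
submitting the job.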
Part 2. Run with remote storage.
This part verifies that the new mode uses the Memory tier, Disk tier, and
Remote tier dynamically when shuffling.
2.1 Set the shuffle mode to the new hybrid shuffle mode
(execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_SELECTIVE).
2.2 Set the remote storage path with the option
taskmanager.network.hybrid-shuffle.remote.path, for example
oss://flink-runtime/runtime/shuffle. The path
oss://flink-runtime/runtime/shuffle must already exist in OSS.
2.3 Change TieredStorageConfiguration#DEFAULT_MIN_RESERVE_DISK_SPACE_FRACTION
to 1 (presumably so that the Disk tier has no usable space and data goes to
the Remote tier), rebuild the package, then run a simple job, for example
TPC-DS q1.sql.
Check that the shuffle data is written to the remote storage under the path
oss://flink-runtime/runtime/shuffle.
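Steps 2.1 and 2.2 can be sketched in the same way. The helper names
render_flink_conf and has_remote_scheme are hypothetical and not part of
Flink; the scheme check is only a local sanity check and does not confirm
that the path exists in OSS. Step 2.3 is a source change and is not covered
here.

```python
from urllib.parse import urlparse


def render_flink_conf(options):
    """Render a dict of options as 'key: value' lines (hypothetical helper)."""
    return "\n".join(f"{key}: {value}" for key, value in options.items())


def has_remote_scheme(path):
    """Return True if the path has a URI scheme other than a local file."""
    return urlparse(path).scheme not in ("", "file")


# Option keys and values as given in steps 2.1 and 2.2 of this ticket.
part2_options = {
    "execution.batch-shuffle-mode": "ALL_EXCHANGES_HYBRID_SELECTIVE",
    "taskmanager.network.hybrid-shuffle.remote.path":
        "oss://flink-runtime/runtime/shuffle",
}

remote_path = part2_options["taskmanager.network.hybrid-shuffle.remote.path"]
if not has_remote_scheme(remote_path):
    raise ValueError(f"remote path looks local: {remote_path}")

print(render_flink_conf(part2_options))
```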
> Release Testing: Verify FLIP-301: Hybrid Shuffle supports Remote Storage
> ------------------------------------------------------------------------
>
> Key: FLINK-32779
> URL: https://issues.apache.org/jira/browse/FLINK-32779
> Project: Flink
> Issue Type: Sub-task
> Components: Tests
> Affects Versions: 1.18.0
> Reporter: Qingsheng Ren
> Priority: Major
> Fix For: 1.18.0
>