[
https://issues.apache.org/jira/browse/FLINK-35695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferenc Csaky updated FLINK-35695:
---------------------------------
Description:
Follow-up testing task for FLINK-32315.
In Flink 1.20, we introduced support for uploading local files in Kubernetes
deployments. To verify this feature, check the relevant
[PR|https://github.com/apache/flink/pull/24303], which includes the docs and
examples for more information.
To test this feature, you need an available Kubernetes cluster to deploy to and
some DFS where Flink can upload the local JAR. For a sandbox setup, I recommend
installing {{minikube}}. The flink-k8s-operator [quickstart
guide|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/try-flink-kubernetes-operator/quick-start/#prerequisites]
explains that pretty well ({{helm}} is not needed here). For the DFS, I have a
gist to set up MinIO on a K8s pod
[here|https://gist.github.com/ferenc-csaky/fd7fee71d89cd389cac2da4a4471ab65].
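As a rough outline, the sandbox setup and the client-side S3 wiring could look
like the sketch below. This is not part of the linked guides; the manifest file
name, endpoint, and credentials are placeholders, and the gist above is the
authoritative MinIO setup.
{code:bash}
# Sandbox sketch; file names, endpoint, and credentials are placeholders.
minikube start --cpus=4 --memory=8g
kubectl apply -f minio-dev.yaml   # MinIO pod + service, as described in the linked gist

# Enable Flink's S3 filesystem plugin in the client distribution.
mkdir -p $FLINK_HOME/plugins/s3-fs-hadoop
cp $FLINK_HOME/opt/flink-s3-fs-hadoop-1.20.0.jar $FLINK_HOME/plugins/s3-fs-hadoop/

# Point the S3 filesystem at MinIO (conf/flink-conf.yaml on older layouts).
cat >> $FLINK_HOME/conf/config.yaml <<'EOF'
s3.endpoint: http://minio.default.svc.cluster.local:9000
s3.path.style.access: true
s3.access-key: minioadmin
s3.secret-key: minioadmin
EOF
{code}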
The following two main use cases should be handled correctly (a verification
sketch follows the examples):
# Deploy a job with a local job JAR, but without further dependencies:
{code:bash}
$ ./bin/flink run-application \
--target kubernetes-application \
-Dkubernetes.cluster-id=my-first-application-cluster \
-Dkubernetes.container.image=flink:1.20 \
-Dkubernetes.artifacts.local-upload-enabled=true \
-Dkubernetes.artifacts.local-upload-target=s3://my-bucket/ \
local:///path/to/TopSpeedWindowing.jar
{code}
# Deploy a job with a local job JAR and further dependencies (e.g., a UDF
included in a separate JAR):
{code:bash}
$ ./bin/flink run-application \
--target kubernetes-application \
-Dkubernetes.cluster-id=my-first-application-cluster \
-Dkubernetes.container.image=flink:1.20 \
-Dkubernetes.artifacts.local-upload-enabled=true \
-Dkubernetes.artifacts.local-upload-target=s3://my-bucket/ \
-Duser.artifacts.artifact-list=local:///tmp/my-flink-udf1.jar\;s3://my-bucket/my-flink-udf2.jar \
local:///tmp/my-flink-job.jar
{code}
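In both cases, the client should upload the local JAR(s) to the configured
target before submitting, and the job should come up normally. A quick way to
verify the result (the MinIO client alias and credentials are assumptions):
{code:bash}
# Pods of the application cluster should be running.
kubectl get pods -l app=my-first-application-cluster
# JobManager logs should mention the artifact upload (exact message may vary).
kubectl logs -l app=my-first-application-cluster,component=jobmanager | grep -i artifact
# The uploaded JAR(s) should appear under the upload target
# (assumes the MinIO port is reachable locally, e.g. via kubectl port-forward).
mc alias set sandbox http://localhost:9000 minioadmin minioadmin
mc ls sandbox/my-bucket/
{code}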
was:
Follow-up testing task for https://issues.apache.org/jira/browse/FLINK-35533
In Flink 1.20, we proposed integrating Flink's Hybrid Shuffle with Apache
Celeborn through a pluggable remote tier interface. To verify this feature,
follow these two main steps.
1. Implement the Celeborn tier.
* Implement a new tier factory and tier for Celeborn, covering the
TierFactory/TierMasterAgent/TierProducerAgent/TierConsumerAgent APIs.
* The implementations should support granular data management at the Segment
level on both the client and server sides.
2. Use the implemented tier to shuffle data.
* Compile Flink and Celeborn.
* Deploy the Celeborn service.
** Deploy a new Celeborn service with the newly compiled packages. You can
reference the docs ([https://celeborn.apache.org/docs/latest/]) to deploy the
cluster.
* Add the compiled Flink plugin JAR (celeborn-client-flink-xxx.jar) to the
Flink classpath.
* Configure the options to enable the feature.
** Set taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class to
the new Celeborn tier factory class. Besides this option, the following options
should also be added.
{code:yaml}
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_FULL
celeborn.master.endpoints: <the celeborn endpoint address>
celeborn.client.shuffle.partition.type: MAP
{code}
* Run some test examples (e.g., WordCount) to verify the feature; see the run
sketch below.
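A minimal run sketch, assuming a standalone Flink distribution; the plugin JAR
name and paths are assumptions, and the options above are expected in the Flink
configuration file:
{code:bash}
# Make the Celeborn tier implementation visible to Flink (JAR name is an assumption).
cp celeborn-client-flink-*.jar $FLINK_HOME/lib/
# With the options above in $FLINK_HOME/conf/config.yaml (flink-conf.yaml on
# older layouts), run a bundled example in batch mode to exercise the shuffle.
$FLINK_HOME/bin/flink run -Dexecution.runtime-mode=BATCH \
    $FLINK_HOME/examples/streaming/WordCount.jar
# Check the Celeborn master/worker logs to confirm shuffle data went through the remote tier.
{code}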
> Release Testing: Verify FLINK-32315: Support local file upload in K8s mode
> --------------------------------------------------------------------------
>
> Key: FLINK-35695
> URL: https://issues.apache.org/jira/browse/FLINK-35695
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Network
> Reporter: Ferenc Csaky
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.20.0
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)