[ 
https://issues.apache.org/jira/browse/FLINK-35695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferenc Csaky updated FLINK-35695:
---------------------------------
    Description: 
Follow-up test for FLINK-32315.

In Flink 1.20, we introduced support for uploading local files in Kubernetes 
deployments. For more information about this feature, check the relevant 
[PR|https://github.com/apache/flink/pull/24303], which includes the docs and 
examples.

Testing this feature requires an available Kubernetes cluster to deploy to, 
and a DFS where Flink can upload the local JAR. For a sandbox setup, I 
recommend installing {{minikube}}. The flink-k8s-operator [quickstart 
guide|https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/try-flink-kubernetes-operator/quick-start/#prerequisites] 
explains that pretty well ({{helm}} is not needed here). For the DFS, I have a 
gist to set up MinIO on a K8s pod 
[here|https://gist.github.com/ferenc-csaky/fd7fee71d89cd389cac2da4a4471ab65].
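Note that the upload happens on the client side, so the machine running the CLI 
also needs an S3 filesystem implementation and credentials for the upload 
target. A minimal sketch, assuming an S3-compatible MinIO (as in the gist 
above) reachable at a hypothetical localhost endpoint; adjust the endpoint and 
credentials to your setup:
{code:bash}
# The CLI performs the upload, so an S3 filesystem plugin must be
# available to the Flink client as well.
mkdir -p plugins/s3-fs-hadoop
cp opt/flink-s3-fs-hadoop-*.jar plugins/s3-fs-hadoop/

# Hypothetical endpoint/credentials: MinIO reachable on localhost:9000
# (e.g. via "kubectl port-forward"), with default sandbox credentials.
cat >> conf/flink-conf.yaml <<'EOF'
s3.endpoint: http://localhost:9000
s3.path.style.access: true
s3.access-key: minioadmin
s3.secret-key: minioadmin
EOF
{code}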

The following two main use cases should be handled correctly (a verification 
sketch follows the list):

# Deploy a job with a local job JAR, but without further dependencies
{code:bash}
$ ./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.cluster-id=my-first-application-cluster \
    -Dkubernetes.container.image=flink:1.20 \
    -Dkubernetes.artifacts.local-upload-enabled=true \
    -Dkubernetes.artifacts.local-upload-target=s3://my-bucket/ \
    local:///path/to/TopSpeedWindowing.jar
{code}
# Deploy a job with a local job JAR and further dependencies (e.g. a UDF 
included in a separate JAR).
{code:bash}
$ ./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.cluster-id=my-first-application-cluster \
    -Dkubernetes.container.image=flink:1.20 \
    -Dkubernetes.artifacts.local-upload-enabled=true \
    -Dkubernetes.artifacts.local-upload-target=s3://my-bucket/ \
    -Duser.artifacts.artifact-list=local:///tmp/my-flink-udf1.jar\;s3://my-bucket/my-flink-udf2.jar \
    local:///tmp/my-flink-job.jar
{code}
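To verify the results in both cases, check that the artifacts landed in the 
bucket and that the job is running. A minimal sketch, assuming the MinIO 
client ({{mc}}) with a hypothetical alias named {{sandbox}} pointing at the 
MinIO instance, and the cluster ID used above:
{code:bash}
# List the uploaded artifacts in the upload target bucket
# ("sandbox" is a hypothetical mc alias for the MinIO instance).
mc ls sandbox/my-bucket/

# Check that the JobManager/TaskManager pods came up ...
kubectl get pods -l app=my-first-application-cluster

# ... and that the job is in RUNNING state.
./bin/flink list --target kubernetes-application \
    -Dkubernetes.cluster-id=my-first-application-cluster
{code}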

  was:
Follow-up test for https://issues.apache.org/jira/browse/FLINK-35533

In Flink 1.20, we proposed integrating Flink's Hybrid Shuffle with Apache 
Celeborn through a pluggable remote tier interface. To verify this feature, 
follow these two main steps.

1. Implement the Celeborn tier.
 * Implement a new tier factory and tier for Celeborn, covering the 
TierFactory/TierMasterAgent/TierProducerAgent/TierConsumerAgent APIs.
 * The implementations should support granular data management at the Segment 
level on both the client and server sides.

2. Use the implemented tier to shuffle data.
 * Compile Flink and Celeborn.
 * Deploy the Celeborn service.
 ** Deploy a new Celeborn service with the newly compiled packages. You can 
reference the docs ([https://celeborn.apache.org/docs/latest/]) to deploy the 
cluster.
 * Add the compiled Flink plugin JAR (celeborn-client-flink-xxx.jar) to the 
Flink classpath.
 * Configure the options to enable the feature.
 ** Set taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class 
to the new Celeborn tier factory class. In addition to this option, the 
following options should also be added.

{code:yaml}
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_FULL
celeborn.master.endpoints: <the celeborn endpoint address>
celeborn.client.shuffle.partition.type: MAP
{code}
 * Run some test examples (e.g., WordCount) to verify the feature, as sketched 
below.
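A minimal sketch of such a run, assuming a Flink cluster started with the 
options above in its configuration and using the bundled WordCount example:
{code:bash}
# Run the bundled WordCount in batch runtime mode; with
# ALL_EXCHANGES_HYBRID_FULL configured, the shuffle goes through the
# hybrid shuffle tiers, including the configured Celeborn remote tier.
./bin/flink run -Dexecution.runtime-mode=BATCH \
    examples/streaming/WordCount.jar
{code}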

 


> Release Testing: Verify FLINK-32315: Support local file upload in K8s mode
> --------------------------------------------------------------------------
>
>                 Key: FLINK-35695
>                 URL: https://issues.apache.org/jira/browse/FLINK-35695
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>            Reporter: Ferenc Csaky
>            Priority: Blocker
>              Labels: release-testing
>             Fix For: 1.20.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
