[
https://issues.apache.org/jira/browse/FLINK-35833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17888318#comment-17888318
]
Ferenc Csaky commented on FLINK-35833:
--------------------------------------
Thanks for the analysis, [~dylanmei] and [~mateczagany]! Skipping the creation
of the base directory's parents for the "local://" fetcher sounds good to me.
I opened a PR with the relevant changes.
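
To make the idea concrete, here is a minimal sketch of skipping directory
creation for the "local" scheme. It is not the code from the PR and not the
real ArtifactFetchManager: the class ArtifactDirSketch and the methods isLocal
and prepare are hypothetical names, the download step is omitted, and only JDK
calls are used.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch; not the actual Flink ArtifactFetchManager. */
public class ArtifactDirSketch {

    /** True when the URI uses the "local" scheme, i.e. the jar already ships in the image. */
    static boolean isLocal(URI uri) {
        return "local".equalsIgnoreCase(uri.getScheme());
    }

    /**
     * Returns the path the job jar would be loaded from. The base dir is
     * created only for non-local URIs, so a read-only filesystem is not a
     * problem for local jars.
     */
    static Path prepare(URI jarUri, Path baseDir) throws IOException {
        if (isLocal(jarUri)) {
            // Nothing to fetch: use the path as-is and never touch baseDir.
            return Paths.get(jarUri.getPath());
        }
        // Remote artifact: the target directory must exist before a download.
        Files.createDirectories(baseDir);
        // The download itself is omitted in this sketch.
        String fileName = Paths.get(jarUri.getPath()).getFileName().toString();
        return baseDir.resolve(fileName);
    }

    public static void main(String[] args) throws IOException {
        // Assumed example values; the base dir lives under a fresh temp directory.
        Path baseDir = Files.createTempDirectory("artifacts").resolve("ns").resolve("job");
        Path jar = prepare(URI.create("local:///opt/flink/usrlib/app.jar"), baseDir);
        System.out.println(jar + " (base dir created: " + Files.exists(baseDir) + ")");
    }
}
```

With this shape, a non-writable filesystem only matters when an artifact
really has to be downloaded.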
> ArtifactFetchManager always creates artifact dir
> ------------------------------------------------
>
> Key: FLINK-35833
> URL: https://issues.apache.org/jira/browse/FLINK-35833
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.20.0, 1.19.1
> Reporter: Dylan Meissner
> Assignee: Ferenc Csaky
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.0.0, 1.19.2, 1.20.1
>
>
> FLINK-28915 added support for fetching remote job jars (HTTPS, S3, etc.) but
> broke the default behavior for local jars when the application runs on a
> non-writable filesystem. ArtifactFetchManager always attempts to create an
> artifact directory, even when the jar uses the "local" protocol.
> Running an application on a non-writable filesystem is a common scenario in
> environments where the jar is published with the Docker container image.
> A local jar does not need to be fetched to an intermediate directory, since
> it is already available on the local filesystem. The LocalArtifactFetcher
> does not write to the filesystem. However, the ArtifactFetchManager always
> attempts to create a directory before fetching, regardless of which fetcher
> would do the work. On non-writable filesystems and in environments lacking
> permissions, the outcome is a runtime exception:
> {{java.lang.RuntimeException: org.apache.flink.util.FlinkRuntimeException:
> Failed}}
> {{to create parent(s) for given base dir:}}
> {{/opt/flink/artifacts/<namespace>/<job name>}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:158)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgramRetriever(KubernetesApplicationClusterEntrypoint.java:129)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgram(KubernetesApplicationClusterEntrypoint.java:111)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.lambda$main$0(KubernetesApplicationClusterEntrypoint.java:85)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.main(KubernetesApplicationClusterEntrypoint.java:85)
> [flink-dist-1.19.1.jar:1.19.1]}}
> {{Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to create
> parent(s) for given base dir:
> /opt/flink/artifacts/app07772/sample-app-flink-1-19}}
> {{ at
> org.apache.flink.client.program.artifact.ArtifactUtils.createMissingParents(ArtifactUtils.java:50)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:123)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ ... 5 more}}
> {{Caused by: java.io.IOException: Cannot create directory
> '/opt/flink/artifacts/<namespace>'.}}
> {{ at org.apache.commons.io.FileUtils.mkdirs(FileUtils.java:2289)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at org.apache.commons.io.FileUtils.forceMkdir(FileUtils.java:1376)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.commons.io.FileUtils.forceMkdirParent(FileUtils.java:1394)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.client.program.artifact.ArtifactUtils.createMissingParents(ArtifactUtils.java:46)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:123)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ at
> org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156)
> ~[flink-dist-1.19.1.jar:1.19.1]}}
> {{ ... 5 more}}
> A workaround is to always configure a location in which the process is
> allowed to create directories, e.g., user.artifacts.base-dir: /tmp/foo.
> A proposed solution is to let each fetcher decide whether to create the
> intermediate directory or fail.