[ https://issues.apache.org/jira/browse/FLINK-20935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263861#comment-17263861 ]
yuemeng commented on FLINK-20935:
---------------------------------
[~fly_in_gis], yes, I set fs.default-scheme to hdfs, but creating the tmp file to
write the Flink configuration and adding it to the local resources should not
need to rely on that setting, just like the way we handle the jobGraph:
{code}
// write job graph to tmp file and add it to local resource
// TODO: server use user main method to generate job graph
if (jobGraph != null) {
    File tmpJobGraphFile = null;
    try {
        tmpJobGraphFile = File.createTempFile(appId.toString(), null);
        try (FileOutputStream output = new FileOutputStream(tmpJobGraphFile);
                ObjectOutputStream obOutput = new ObjectOutputStream(output)) {
            obOutput.writeObject(jobGraph);
        }

        final String jobGraphFilename = "job.graph";
        configuration.setString(JOB_GRAPH_FILE_PATH, jobGraphFilename);

        fileUploader.registerSingleLocalResource(
                jobGraphFilename,
                new Path(tmpJobGraphFile.toURI()),
                "",
                LocalResourceType.FILE,
                true,
                false);
        classPathBuilder.append(jobGraphFilename).append(File.pathSeparator);
    } catch (Exception e) {
        LOG.warn("Add job graph to local resource fail.");
        throw e;
    } finally {
        if (tmpJobGraphFile != null && !tmpJobGraphFile.delete()) {
            LOG.warn("Fail to delete temporary file {}.", tmpJobGraphFile.toPath());
        }
    }
}
{code}
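For comparison, here is a minimal sketch (not a committed patch, just the change
I have in mind) of uploading flink-conf.yaml the same way, passing the tmp file
as a file URI so the upload does not fall back to fs.default-scheme:
{code}
// Sketch only: mirror the jobGraph handling for the configuration file by
// using the file's URI (file:/...) instead of its bare absolute path.
File tmpConfigurationFile = null;
try {
    tmpConfigurationFile = File.createTempFile(appId + "-flink-conf.yaml", null);
    BootstrapTools.writeConfiguration(configuration, tmpConfigurationFile);

    String flinkConfigKey = "flink-conf.yaml";
    fileUploader.registerSingleLocalResource(
            flinkConfigKey,
            // toURI() carries the file scheme, so the local file system is used
            new Path(tmpConfigurationFile.toURI()),
            "",
            LocalResourceType.FILE,
            true,
            true);
    classPathBuilder.append("flink-conf.yaml").append(File.pathSeparator);
} finally {
    if (tmpConfigurationFile != null && !tmpConfigurationFile.delete()) {
        LOG.warn("Fail to delete temporary file {}.", tmpConfigurationFile.toPath());
    }
}
{code}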
> can't write flink configuration to tmp file and add it to local resource in
> yarn session mode
> ---------------------------------------------------------------------------------------------
>
> Key: FLINK-20935
> URL: https://issues.apache.org/jira/browse/FLINK-20935
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.12.0, 1.13.0
> Reporter: yuemeng
> Priority: Major
> Labels: pull-request-available
>
> In Flink 1.12.0 or the latest version, when we execute a command such as
> bin/yarn-session.sh -n 20 -jm 9096 -nm 4096 -st, the deployment fails with the
> following errors:
> {code}
> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
>     at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:411)
>     at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:498)
>     at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:730)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
>     at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>     at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:730)
> Caused by: java.io.FileNotFoundException: File does not exist: /tmp/application_1573723355201_0036-flink-conf.yaml688141408443326132.tmp
>     at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
> {code}
> When the startAppMaster method in YarnClusterDescriptor is called, it tries to
> write the Flink configuration to a tmp file and add it to the local resources,
> but the following code makes the tmp file resolve against the distributed file
> system:
> {code}
> // Upload the flink configuration
> // write out configuration file
> File tmpConfigurationFile = null;
> try {
>     tmpConfigurationFile = File.createTempFile(appId + "-flink-conf.yaml", null);
>     BootstrapTools.writeConfiguration(configuration, tmpConfigurationFile);
>
>     String flinkConfigKey = "flink-conf.yaml";
>     fileUploader.registerSingleLocalResource(
>             flinkConfigKey,
>             new Path(tmpConfigurationFile.getAbsolutePath()),
>             "",
>             LocalResourceType.FILE,
>             true,
>             true);
>     classPathBuilder.append("flink-conf.yaml").append(File.pathSeparator);
> } finally {
>     if (tmpConfigurationFile != null && !tmpConfigurationFile.delete()) {
>         LOG.warn("Fail to delete temporary file {}.", tmpConfigurationFile.toPath());
>     }
> }
> {code}
> The {code}tmpConfigurationFile.getAbsolutePath(){code} call returns a path
> without a file scheme, so the path is resolved against fs.default-scheme and
> the file system is considered a distributed file system, which is why the
> local tmp file cannot be found.
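> To illustrate the root cause, a small standalone sketch (the class name
> PathSchemeDemo is just for illustration) showing that a Path built from a bare
> absolute local path carries no scheme, while one built from toURI() does:
> {code}
> import java.io.File;
>
> import org.apache.hadoop.fs.Path;
>
> // Illustration only: compares how the two Path constructors resolve schemes.
> public class PathSchemeDemo {
>     public static void main(String[] args) throws Exception {
>         File tmp = File.createTempFile("flink-conf.yaml", null);
>
>         // No scheme: resolution falls back to the configured default file
>         // system (fs.default-scheme / fs.defaultFS), e.g. hdfs.
>         Path fromAbsolutePath = new Path(tmp.getAbsolutePath());
>         System.out.println(fromAbsolutePath.toUri().getScheme()); // null
>
>         // Explicit file scheme: always resolved on the local file system.
>         Path fromUri = new Path(tmp.toURI());
>         System.out.println(fromUri.toUri().getScheme()); // file
>
>         tmp.delete();
>     }
> }
> {code}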