[ https://issues.apache.org/jira/browse/FLINK-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383469#comment-16383469 ]
Nico Kruber edited comment on FLINK-8519 at 3/2/18 11:07 AM: ------------------------------------------------------------- [~yew1eb], since I don't have your {{hadoop_user_login.sh}} script, I tried to reproduce the problem with the following command on AWS with Flink release-1.5 checkout of revision 347ec3848ec603ac452e394a5211cf888db6663f. I did not see any exception in the logs or the command line outputs. {code} ./bin/yarn-session.sh -n 2 -nm job_name -jm 1024 -tm 4096 -s 4 -d {code} Can you try this command as well? Also, with your command (executed in a {{bash}} shell) and the given {{yarn.container-start-command-template}}, I got the following error: {code} Error: Could not find or load main class %-Xmx424m% {code} which indicates that the template may be wrong. Based on {{org.apache.flink.configuration.ConfigConstants#DEFAULT_YARN_CONTAINER_START_COMMAND_TEMPLATE}}, you should only use one percent sign. was (Author: nicok): [~yew1eb], since I don't have your {{hadoop_user_login.sh}} script, I tried to reproduce the problem with the following command on AWS and did not see any exception in the logs or the command line outputs. {code} ./bin/yarn-session.sh -n 2 -nm job_name -jm 1024 -tm 4096 -s 4 -d {code} Can you try this command as well? Also, with your command (executed in a {{bash}} shell) and the given {{yarn.container-start-command-template}}, I got the following error: {code} Error: Could not find or load main class %-Xmx424m% {code} which indicates that the template may be wrong. Based on {{org.apache.flink.configuration.ConfigConstants#DEFAULT_YARN_CONTAINER_START_COMMAND_TEMPLATE}}, you should only use one percent sign. > FileAlreadyExistsException on Start Flink Session > -------------------------------------------------- > > Key: FLINK-8519 > URL: https://issues.apache.org/jira/browse/FLINK-8519 > Project: Flink > Issue Type: Bug > Components: YARN > Affects Versions: 1.5.0 > Reporter: Hai Zhou UTC+8 > Priority: Blocker > Fix For: 1.5.0 > > > *steps to reproduce:* > 1. build flink from source , git commit: c1734f4 > 2. run script: > source /path/hadoop/bin/hadoop_user_login.sh hadoop-launcher; > export YARN_CONF_DIR=/path/hadoop/etc/hadoop; > export HADOOP_CONF_DIR=/path/hadoop/etc/hadoop; > export JVM_ARGS="-Djava.security.krb5.conf=${HADOOP_CONF_DIR}/krb5.conf"; > /path/flink-1.5-SNAPSHOT/bin/yarn-session.sh -D > yarn.container-start-command-template="/usr/local/jdk1.8.0_112/bin/java > %%jvmmem%% %%jvmopts%% %%logging%% %%class%% %%args%% %%redirects%%" -n 4 -nm > job_name -qu root.rt.flink -jm 1024 -tm 4096 -s 4 -d > > *error infos:* > 2018-01-27 00:51:12,841 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli - > Error while running the Flink Yarn session. > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1571) > at > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at > org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:786) > Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: > Couldn't deploy Yarn session cluster > at > org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:389) > at > org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:594) > at > org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:786) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) > ... 2 more > Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: Path /user > already exists as dir; cannot create link here > at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:244) > at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:334) > at > org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:161) > at > org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:161) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397) > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89) > at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431) > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167) > at > org.apache.flink.yarn.AbstractYarnClusterDescriptor.startAppMaster(AbstractYarnClusterDescriptor.java:656) > at > org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:485) > at > org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:384) > ... 7 more -- This message was sent by Atlassian JIRA (v7.6.3#76005)