[
https://issues.apache.org/jira/browse/FLINK-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Metzger resolved FLINK-2990.
-----------------------------------
Resolution: Fixed
Fix Version/s: 1.0
0.10
Fixed for 0.10 in http://git-wip-us.apache.org/repos/asf/flink/commit/57166592
Fixed for 1.0 in master in
http://git-wip-us.apache.org/repos/asf/flink/commit/ccf4ebdd
> Scala 2.11 build fails to start on YARN
> ---------------------------------------
>
> Key: FLINK-2990
> URL: https://issues.apache.org/jira/browse/FLINK-2990
> Project: Flink
> Issue Type: Bug
> Components: Build System, YARN Client
> Affects Versions: 0.10, 1.0
> Reporter: Robert Metzger
> Assignee: Robert Metzger
> Fix For: 0.10, 1.0
>
>
> Deploying the scala 2.11 build of Flink on YARN seems to fail
> {code}
> robert@hn0-apache:~/flink010-hd22-scala211/flink-0.10.0$
> ./bin/yarn-session.sh -n 2
> 16:36:32,484 WARN org.apache.hadoop.util.NativeCodeLoader
> - Unable to load native-hadoop library for your platform... using
> builtin-java classes where applicable
> 16:36:32,748 INFO org.apache.flink.yarn.FlinkYarnClient
> - Using values:
> 16:36:32,750 INFO org.apache.flink.yarn.FlinkYarnClient
> - TaskManager count = 2
> 16:36:32,750 INFO org.apache.flink.yarn.FlinkYarnClient
> - JobManager memory = 1024
> 16:36:32,750 INFO org.apache.flink.yarn.FlinkYarnClient
> - TaskManager memory = 1024
> 16:36:32,874 INFO
> org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing
> over to rm2
> 16:36:32,930 WARN org.apache.flink.yarn.FlinkYarnClient
> - The JobManager or TaskManager memory is below the smallest possible YARN
> Container size. The value of 'yarn.scheduler.minimum-allocation-mb' is
> '1536'. Please increase the memory size.YARN will allocate the smaller
> containers but the scheduler will account for the minimum-allocation-mb,
> maybe not all instances you requested will start.
> 16:36:33,448 WARN org.apache.hadoop.hdfs.BlockReaderLocal
> - The short-circuit local reads feature cannot be used because libhadoop
> cannot be loaded.
> 16:36:33,489 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/lib/flink-distabc.jar
> to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-distabc.jar
> 16:36:35,367 INFO org.apache.flink.yarn.Utils
> - Copying from
> /home/robert/flink010-hd22-scala211/flink-0.10.0/conf/flink-conf.yaml to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-conf.yaml
> 16:36:35,695 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/lib/flink-python_2.11-0.10.0.jar
> to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-python_2.11-0.10.0.jar
> 16:36:35,882 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/lib/flink-distabc.jar
> to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-distabc.jar
> 16:36:37,522 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/lib/slf4j-log4j12-1.7.7.jar
> to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/slf4j-log4j12-1.7.7.jar
> 16:36:37,740 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/lib/log4j-1.2.17.jar to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/log4j-1.2.17.jar
> 16:36:37,960 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/conf/logback.xml to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/logback.xml
> 16:36:38,397 INFO org.apache.flink.yarn.Utils
> - Copying from
> file:/home/robert/flink010-hd22-scala211/flink-0.10.0/conf/log4j.properties
> to
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/log4j.properties
> 16:36:38,840 INFO org.apache.flink.yarn.FlinkYarnClient
> - Submitting application master application_1447063737177_0017
> 16:36:39,081 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl
> - Submitted application application_1447063737177_0017
> 16:36:39,081 INFO org.apache.flink.yarn.FlinkYarnClient
> - Waiting for the cluster to be allocated
> 16:36:39,084 INFO org.apache.flink.yarn.FlinkYarnClient
> - Deploying cluster, current state ACCEPTED
> 16:36:40,086 INFO org.apache.flink.yarn.FlinkYarnClient
> - Deploying cluster, current state ACCEPTED
> Error while deploying YARN cluster: The YARN application unexpectedly
> switched to state FAILED during deployment.
> Diagnostics from YARN: Application application_1447063737177_0017 failed 1
> times due to AM Container for appattempt_1447063737177_0017_000001 exited
> with exitCode: -1000
> For more detailed output, check application tracking
> page:http://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8088/proxy/application_1447063737177_0017/Then,
> click on links to logs of each attempt.
> Diagnostics: Resource
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-distabc.jar
> changed on src filesystem (expected 1447086995336, was 1447086997508
> java.io.IOException: Resource
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-distabc.jar
> changed on src filesystem (expected 1447086995336, was 1447086997508
> at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
> at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
> at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
> at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Failing this attempt. Failing the application.
> If log aggregation is enabled on your cluster, use this command to further
> investigate the issue:
> yarn logs -applicationId application_1447063737177_0017
> org.apache.flink.yarn.FlinkYarnClientBase$YarnDeploymentException: The YARN
> application unexpectedly switched to state FAILED during deployment.
> Diagnostics from YARN: Application application_1447063737177_0017 failed 1
> times due to AM Container for appattempt_1447063737177_0017_000001 exited
> with exitCode: -1000
> For more detailed output, check application tracking
> page:http://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8088/proxy/application_1447063737177_0017/Then,
> click on links to logs of each attempt.
> Diagnostics: Resource
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-distabc.jar
> changed on src filesystem (expected 1447086995336, was 1447086997508
> java.io.IOException: Resource
> hdfs://hn1-apache.vbkocrowebre3dyigxo55soqnb.ax.internal.cloudapp.net:8020/user/robert/.flink/application_1447063737177_0017/flink-distabc.jar
> changed on src filesystem (expected 1447086995336, was 1447086997508
> at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
> at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
> at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
> at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Failing this attempt. Failing the application.
> If log aggregation is enabled on your cluster, use this command to further
> investigate the issue:
> yarn logs -applicationId application_1447063737177_0017
> at
> org.apache.flink.yarn.FlinkYarnClientBase.deployInternal(FlinkYarnClientBase.java:646)
> at
> org.apache.flink.yarn.FlinkYarnClientBase.deploy(FlinkYarnClientBase.java:338)
> at
> org.apache.flink.client.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:409)
> at
> org.apache.flink.client.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:351)
> {code}
> The problem is that flink-dist.jar is uploaded to HDFS twice (note the two
> "Copying from ... flink-distabc.jar" lines above); the second upload overwrites
> the file and changes its modification timestamp. When YARN later localizes the
> resources for the allocated container, the timestamp recorded at submission no
> longer matches the file's current modification time on HDFS, so YARN rejects
> the JAR with the "changed on src filesystem" error.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)