[ https://issues.apache.org/jira/browse/FLINK-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440927#comment-16440927 ]
Gary Yao edited comment on FLINK-8900 at 4/17/18 2:37 PM:
----------------------------------------------------------

When submitting in non-detached mode, the problem still surfaces. In detached mode the status is set correctly.

Command used to submit:
{noformat}
HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster -yjm 2048 -ytm 2048 ./examples/streaming/WordCount.jar
{noformat}

State and FinalStatus are: KILLED

I re-opened the ticket.


was (Author: gjy):
When submitting in non-detached mode, the problem still surfaces. In detached mode the status is set correctly.

Command used to submit:
{noformat}
HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster -yjm 2048 -ytm 2048 ./examples/streaming/WordCount.jar
{noformat}

State and FinalStatus are: KILLED

> YARN FinalStatus always shows as KILLED with Flip-6
> ---------------------------------------------------
>
>                 Key: FLINK-8900
>                 URL: https://issues.apache.org/jira/browse/FLINK-8900
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Nico Kruber
>            Assignee: Till Rohrmann
>            Priority: Blocker
>              Labels: flip-6
>             Fix For: 1.5.0
>
>
> Whenever I run a simple word count like this one on YARN with Flip-6 enabled,
> {code}
> ./bin/flink run -m yarn-cluster -yjm 768 -ytm 3072 -ys 2 -p 20 -c org.apache.flink.streaming.examples.wordcount.WordCount ./examples/streaming/WordCount.jar --input /usr/share/doc/rsync-3.0.6/COPYING
> {code}
> it will show up as {{KILLED}} in the {{State}} and {{FinalStatus}} columns even though the program ran successfully, as in this run (irrespective of whether FLINK-8899 occurs):
> {code}
> 2018-03-08 16:48:39,049 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Streaming WordCount (11a794d2f5dc2955d8015625ec300c20) switched from state RUNNING to FINISHED.
> 2018-03-08 16:48:39,050 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job 11a794d2f5dc2955d8015625ec300c20
> 2018-03-08 16:48:39,050 INFO org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore - Shutting down
> 2018-03-08 16:48:39,078 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Job 11a794d2f5dc2955d8015625ec300c20 reached globally terminal state FINISHED.
> 2018-03-08 16:48:39,151 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager e58efd886429e8f080815ea74ddfa734 at the SlotManager.
> 2018-03-08 16:48:39,221 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job Streaming WordCount(11a794d2f5dc2955d8015625ec300c20).
> 2018-03-08 16:48:39,270 INFO org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager connection 43f725adaee14987d3ff99380701f52f: JobManager is shutting down..
> 2018-03-08 16:48:39,270 INFO org.apache.flink.yarn.YarnResourceManager - Disconnect job manager 00000000000000000000000000000...@akka.tcp://fl...@ip-172-31-7-0.eu-west-1.compute.internal:34281/user/jobmanager_0 for job 11a794d2f5dc2955d8015625ec300c20 from the resource manager.
> 2018-03-08 16:48:39,349 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Suspending SlotPool.
> 2018-03-08 16:48:39,349 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Stopping SlotPool.
> 2018-03-08 16:48:39,349 INFO org.apache.flink.runtime.jobmaster.JobManagerRunner - JobManagerRunner already shutdown.
> 2018-03-08 16:48:39,775 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager 4e1fb6c8f95685e24b6a4cb4b71ffb92 at the SlotManager.
> 2018-03-08 16:48:39,846 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager b5bce0bdfa7fbb0f4a0905cc3ee1c233 at the SlotManager.
> 2018-03-08 16:48:39,876 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
> 2018-03-08 16:48:39,910 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager a35b0690fdc6ec38bbcbe18a965000fd at the SlotManager.
> 2018-03-08 16:48:39,942 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager 5175cabe428bea19230ac056ff2a17bb at the SlotManager.
> 2018-03-08 16:48:39,974 INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:46511
> 2018-03-08 16:48:39,975 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
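In the quoted log, the job reaches the globally terminal state FINISHED before the cluster entrypoint receives SIGTERM. The following is a minimal sketch for checking that ordering in a JobManager log, assuming every log line starts with a `yyyy-MM-dd HH:mm:ss,SSS` timestamp as in the excerpt. The helper name `first_match_time` and the `LOG` sample list are illustrative and not part of Flink; the two sample lines are copied from the report.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as it appears in the quoted log, e.g. "2018-03-08 16:48:39,078".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def first_match_time(lines, needle):
    """Return the timestamp of the first line containing `needle`, or None."""
    for line in lines:
        if needle in line:
            match = TIMESTAMP.match(line)
            if match:
                return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return None


# Two lines taken from the log in the issue description (illustrative sample).
LOG = [
    "2018-03-08 16:48:39,078 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher"
    " - Job 11a794d2f5dc2955d8015625ec300c20 reached globally terminal state FINISHED.",
    "2018-03-08 16:48:39,876 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint"
    " - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.",
]

finished = first_match_time(LOG, "reached globally terminal state FINISHED")
sigterm = first_match_time(LOG, "RECEIVED SIGNAL 15: SIGTERM")

# Whole milliseconds between the job finishing and the SIGTERM arriving.
gap_ms = (sigterm - finished) // timedelta(milliseconds=1)
print(f"FINISHED at {finished}, SIGTERM at {sigterm}, gap {gap_ms} ms")
```

To run this against a real log, `LOG` could be replaced by the lines of the JobManager log file. The sketch only establishes the order of the two events; it says nothing about why YARN records the final status as KILLED.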