[ 
https://issues.apache.org/jira/browse/FLINK-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440927#comment-16440927
 ] 

Gary Yao edited comment on FLINK-8900 at 4/17/18 2:37 PM:
----------------------------------------------------------

When submitting in non-detached mode, the problem still surfaces. It detached 
mode the status is set correctly.

Command used to submit:
{noformat}
HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster -yjm 2048 
-ytm 2048  ./examples/streaming/WordCount.jar
{noformat}

State and FinalStatus is: KILLED

I re-opened the ticket.




was (Author: gjy):
When submitting in non-detached mode, the problem still surfaces. It detached 
mode the status is set correctly.

Command used to submit:
{noformat}
HADOOP_CLASSPATH=`hadoop classpath` bin/flink run -m yarn-cluster -yjm 2048 
-ytm 2048  ./examples/streaming/WordCount.jar
{noformat}

State and FinalStatus is: KILLED



> YARN FinalStatus always shows as KILLED with Flip-6
> ---------------------------------------------------
>
>                 Key: FLINK-8900
>                 URL: https://issues.apache.org/jira/browse/FLINK-8900
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Nico Kruber
>            Assignee: Till Rohrmann
>            Priority: Blocker
>              Labels: flip-6
>             Fix For: 1.5.0
>
>
> Whenever I run a simple simple word count like this one on YARN with Flip-6 
> enabled,
> {code}
> ./bin/flink run -m yarn-cluster -yjm 768 -ytm 3072 -ys 2 -p 20 -c 
> org.apache.flink.streaming.examples.wordcount.WordCount 
> ./examples/streaming/WordCount.jar --input /usr/share/doc/rsync-3.0.6/COPYING
> {code}
> it will show up as {{KILLED}} in the {{State}} and {{FinalStatus}} columns 
> even though the program ran successfully like this one (irrespective of 
> FLINK-8899 occurring or not):
> {code}
> 2018-03-08 16:48:39,049 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job Streaming 
> WordCount (11a794d2f5dc2955d8015625ec300c20) switched from state RUNNING to 
> FINISHED.
> 2018-03-08 16:48:39,050 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping 
> checkpoint coordinator for job 11a794d2f5dc2955d8015625ec300c20
> 2018-03-08 16:48:39,050 INFO  
> org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - 
> Shutting down
> 2018-03-08 16:48:39,078 INFO  
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job 
> 11a794d2f5dc2955d8015625ec300c20 reached globally terminal state FINISHED.
> 2018-03-08 16:48:39,151 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register 
> TaskManager e58efd886429e8f080815ea74ddfa734 at the SlotManager.
> 2018-03-08 16:48:39,221 INFO  org.apache.flink.runtime.jobmaster.JobMaster    
>               - Stopping the JobMaster for job Streaming 
> WordCount(11a794d2f5dc2955d8015625ec300c20).
> 2018-03-08 16:48:39,270 INFO  org.apache.flink.runtime.jobmaster.JobMaster    
>               - Close ResourceManager connection 
> 43f725adaee14987d3ff99380701f52f: JobManager is shutting down..
> 2018-03-08 16:48:39,270 INFO  org.apache.flink.yarn.YarnResourceManager       
>               - Disconnect job manager 
> 00000000000000000000000000000...@akka.tcp://fl...@ip-172-31-7-0.eu-west-1.compute.internal:34281/user/jobmanager_0
>  for job 11a794d2f5dc2955d8015625ec300c20 from the resource manager.
> 2018-03-08 16:48:39,349 INFO  
> org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Suspending 
> SlotPool.
> 2018-03-08 16:48:39,349 INFO  
> org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Stopping 
> SlotPool.
> 2018-03-08 16:48:39,349 INFO  
> org.apache.flink.runtime.jobmaster.JobManagerRunner           - 
> JobManagerRunner already shutdown.
> 2018-03-08 16:48:39,775 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register 
> TaskManager 4e1fb6c8f95685e24b6a4cb4b71ffb92 at the SlotManager.
> 2018-03-08 16:48:39,846 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register 
> TaskManager b5bce0bdfa7fbb0f4a0905cc3ee1c233 at the SlotManager.
> 2018-03-08 16:48:39,876 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - RECEIVED 
> SIGNAL 15: SIGTERM. Shutting down as requested.
> 2018-03-08 16:48:39,910 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register 
> TaskManager a35b0690fdc6ec38bbcbe18a965000fd at the SlotManager.
> 2018-03-08 16:48:39,942 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register 
> TaskManager 5175cabe428bea19230ac056ff2a17bb at the SlotManager.
> 2018-03-08 16:48:39,974 INFO  org.apache.flink.runtime.blob.BlobServer        
>               - Stopped BLOB server at 0.0.0.0:46511
> 2018-03-08 16:48:39,975 INFO  
> org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down 
> BLOB cache
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to