Sayat Satybaldiyev created FLINK-10287:
------------------------------------------

             Summary: Flink HA Persist Cancelled Job in Zookeeper
                 Key: FLINK-10287
                 URL: https://issues.apache.org/jira/browse/FLINK-10287
             Project: Flink
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.6.0
            Reporter: Sayat Satybaldiyev
         Attachments: Screenshot from 2018-09-05 16-48-34.png

In HA mode, Flink leaves the metadata of a canceled job in ZooKeeper, which 
makes HA mode quite fragile. If the JM is restarted, it tries to recover the 
canceled job and, after some time, fails completely because it cannot recover it. 

 

How to reproduce:
 # Set up a Flink 1.6 cluster in HA mode.
 # Cancel a running Flink job.
 # Observe that Flink did not remove the job's ZK metadata.

!Screenshot from 2018-09-05 16-48-34.png!
{code}
ls /flink/flink_ns/jobgraphs/46d8d3555936c0d8e6b6ec21cc02bb11
[7f392fd9-cedc-4978-9186-1f54b98eeeb7]{code}
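
The check in step 3 could be scripted roughly as follows. This is only a minimal sketch, not part of Flink or ZooKeeper: the helper names `parse_zk_ls` and `leftover_jobgraphs` are hypothetical, and it assumes that the output of `ls /flink/flink_ns/jobgraphs` has been captured as text in the same bracketed form as the listing above, with one child per job ID.

```python
def parse_zk_ls(output):
    """Parse a bracketed `ls` result such as "[a, b]" into a list of child names."""
    line = output.strip()
    if not (line.startswith("[") and line.endswith("]")):
        raise ValueError("unexpected ls output: %r" % output)
    inner = line[1:-1].strip()
    # An empty listing "[]" means the znode has no children.
    return [name.strip() for name in inner.split(",")] if inner else []


def leftover_jobgraphs(jobgraph_children, canceled_job_ids):
    """Return the canceled job IDs that still have a child node under jobgraphs."""
    present = set(jobgraph_children)
    return sorted(job_id for job_id in canceled_job_ids if job_id in present)


# Assumed input: the captured listing of /flink/flink_ns/jobgraphs after the cancel.
children = parse_zk_ls("[46d8d3555936c0d8e6b6ec21cc02bb11]")
print(leftover_jobgraphs(children, ["46d8d3555936c0d8e6b6ec21cc02bb11"]))
```

A non-empty result means that at least one canceled job still has a job graph node that a restarted JM would try to recover.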



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
