[ 
https://issues.apache.org/jira/browse/FLINK-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-22893:
-------------------------------------
    Description: 
The NodeCache used by the LeaderElection-/-RetrievalDrivers ensures that 
the parents of the observed node exist by regularly issuing mkdir calls. This 
operation can fail if the HA data is being cleaned up concurrently, in which 
case Curator throws an unhandled exception that crashes the TM.

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382
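
The race described above can be sketched with a toy in-memory tree. This is 
only an illustration under stated assumptions: it does not use Curator or 
ZooKeeper, and every name in it (HaTreeSketch, the /flink/cluster/leader path, 
ensureParentsTolerant) is hypothetical. The concurrent cleanup is injected 
between mkdir steps so the interleaving is deterministic. Catching the 
exception is shown as one possible way to tolerate the race, not necessarily 
the fix applied for this issue.

```java
import java.util.HashSet;
import java.util.Set;

// Toy stand-in for a ZooKeeper tree; not Curator code, all names are hypothetical.
class HaTreeSketch {
    /** Stand-in for ZooKeeper's NoNodeException: the parent of the target is missing. */
    static class NoNodeException extends Exception {
        NoNodeException(String path) { super("no node: " + path); }
    }

    private final Set<String> nodes = new HashSet<>();

    HaTreeSketch() { nodes.add("/"); }

    boolean exists(String path) { return nodes.contains(path); }

    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }

    /** Like a ZooKeeper create: fails if the parent does not exist. */
    void create(String path) throws NoNodeException {
        if (!exists(parentOf(path))) throw new NoNodeException(path);
        nodes.add(path);
    }

    /** Recursive delete, standing in for the HA data cleanup. */
    void deleteRecursively(String path) {
        nodes.removeIf(n -> n.equals(path) || n.startsWith(path + "/"));
    }

    /** Creates each path component in turn; betweenSteps simulates a concurrent actor. */
    void mkdirs(String path, Runnable betweenSteps) throws NoNodeException {
        StringBuilder current = new StringBuilder();
        for (String part : path.substring(1).split("/")) {
            current.append('/').append(part);
            if (!exists(current.toString())) create(current.toString());
            betweenSteps.run();
        }
    }

    /** Returns true if mkdirs succeeded, false if a cleanup removed a parent midway. */
    static boolean ensureParentsTolerant(HaTreeSketch tree, String path, Runnable betweenSteps) {
        try {
            tree.mkdirs(path, betweenSteps);
            return true;
        } catch (NoNodeException e) {
            // Assumption: a missing node during cleanup is benign and may be ignored.
            return false;
        }
    }

    public static void main(String[] args) {
        HaTreeSketch tree = new HaTreeSketch();
        // Cleanup fires once /flink/cluster exists, i.e. between the second and third mkdir step.
        Runnable cleanup = () -> {
            if (tree.exists("/flink/cluster")) tree.deleteRecursively("/flink");
        };
        boolean ok = ensureParentsTolerant(tree, "/flink/cluster/leader", cleanup);
        System.out.println("mkdirs succeeded despite cleanup: " + ok);
    }
}
```

Without the try/catch in ensureParentsTolerant, the NoNodeException would 
propagate to the caller, which corresponds to the unhandled exception in the 
report.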

  
was:https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382


> Leader retrieval fails with NoNodeException
> -------------------------------------------
>
>                 Key: FLINK-22893
>                 URL: https://issues.apache.org/jira/browse/FLINK-22893
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.1, 1.14.0
>            Reporter: Dawid Wysakowicz
>            Assignee: Chesnay Schepler
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.14.0
>
>
> The NodeCache used by the LeaderElection-/-RetrievalDrivers ensures that 
> the parents of the observed node exist by regularly issuing mkdir calls. This 
> operation can fail if the HA data is being cleaned up concurrently, in which 
> case Curator throws an unhandled exception that crashes the TM.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18700&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=2c7d57b9-7341-5a87-c9af-2cf7cc1a37dc&l=4382



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
