[ 
https://issues.apache.org/jira/browse/FLINK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-8559.
-----------------------------------
    Resolution: Fixed

master: dbb81acb5a1d0f2a9521c6eef7eeb2436bb8004d

> Exceptions in RocksDBIncrementalSnapshotOperation#takeSnapshot cause job to 
> get stuck
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-8559
>                 URL: https://issues.apache.org/jira/browse/FLINK-8559
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing, Tests
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Blocker
>             Fix For: 1.5.0, 1.4.1
>
>
> In {{RocksDBKeyedStateBackend#snapshotIncrementally}} we can find this 
> code:
>  
> {code:java}
> final RocksDBIncrementalSnapshotOperation<K> snapshotOperation =
>       new RocksDBIncrementalSnapshotOperation<>(
>               this,
>               checkpointStreamFactory,
>               checkpointId,
>               checkpointTimestamp);
> snapshotOperation.takeSnapshot();
> return new FutureTask<KeyedStateHandle>(
>       new Callable<KeyedStateHandle>() {
>               @Override
>               public KeyedStateHandle call() throws Exception {
>                       return snapshotOperation.materializeSnapshot();
>               }
>       }
> ) {
>       @Override
>       public boolean cancel(boolean mayInterruptIfRunning) {
>               snapshotOperation.stop();
>               return super.cancel(mayInterruptIfRunning);
>       }
>       @Override
>       protected void done() {
>               snapshotOperation.releaseResources(isCancelled());
>       }
> };
> {code}
> In the constructor of {{RocksDBIncrementalSnapshotOperation}} we call 
> {{acquireResource()}} on the RocksDB {{ResourceGuard}}. If 
> {{snapshotOperation.takeSnapshot()}} fails with an exception, these resources 
> are never released, because the {{FutureTask}} whose {{done()}} callback 
> releases them is never created. When the task is shut down due to the 
> exception, it gets stuck on releasing RocksDB.
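
The leak described above, and one way to close it, can be sketched in isolation. This is a minimal illustration and not necessarily how the referenced commit fixed the issue: `Guard` and `SnapshotOperation` are hypothetical stand-ins for Flink's `ResourceGuard` and `RocksDBIncrementalSnapshotOperation`, and the `failOnTake` flag exists only to simulate an exception in `takeSnapshot()`. The idea is to release the resource on the failure path, since no `FutureTask` exists yet to do it in `done()`.

```java
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

public class SnapshotGuardSketch {

    /** Hypothetical stand-in for a resource guard: counts outstanding leases. */
    static final class Guard {
        private final AtomicInteger leases = new AtomicInteger();
        void acquire() { leases.incrementAndGet(); }
        void release() { leases.decrementAndGet(); }
        int outstanding() { return leases.get(); }
    }

    /** Hypothetical stand-in for the snapshot operation from the issue. */
    static final class SnapshotOperation {
        private final Guard guard;
        private final boolean failOnTake;
        private boolean released;

        SnapshotOperation(Guard guard, boolean failOnTake) {
            this.guard = guard;
            this.failOnTake = failOnTake;
            guard.acquire(); // resource is acquired in the constructor
        }

        void takeSnapshot() throws Exception {
            if (failOnTake) {
                throw new Exception("simulated failure in takeSnapshot");
            }
        }

        String materializeSnapshot() {
            return "state-handle";
        }

        /** Releases the lease at most once (not thread-safe; sketch only). */
        void releaseResources() {
            if (!released) {
                released = true;
                guard.release();
            }
        }
    }

    static FutureTask<String> snapshot(Guard guard, boolean failOnTake) throws Exception {
        final SnapshotOperation op = new SnapshotOperation(guard, failOnTake);
        try {
            op.takeSnapshot();
        } catch (Exception e) {
            // No FutureTask exists yet, so done() will never run: release here.
            op.releaseResources();
            throw e;
        }
        return new FutureTask<String>(op::materializeSnapshot) {
            @Override
            protected void done() {
                // Runs on normal completion, failure, and cancellation.
                op.releaseResources();
            }
        };
    }

    public static void main(String[] args) throws Exception {
        Guard guard = new Guard();
        try {
            snapshot(guard, true);
        } catch (Exception expected) {
            System.out.println("leases after failed takeSnapshot: " + guard.outstanding());
        }
        FutureTask<String> task = snapshot(guard, false);
        task.run();
        System.out.println("result: " + task.get()
                + ", leases after done(): " + guard.outstanding());
    }
}
```

Without the `catch` block in `snapshot`, the lease count would stay at one after a failed `takeSnapshot()`, which corresponds to the stuck shutdown reported here.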



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
