[jira] [Updated] (FLINK-6505) Proactively cleanup local FS for RocksDBKeyedStateBackend on startup

Stefan Richter (JIRA) Tue, 09 May 2017 02:01:04 -0700

     [ 
https://issues.apache.org/jira/browse/FLINK-6505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Stefan Richter updated FLINK-6505:
----------------------------------
    Description: 
In {{RocksDBKeyedStateBackend}}, the {{instanceBasePath}} is cleared on 
{{dispose()}}. I think it might make sense to also clear this directory when 
the backend is created, in case something crashed and the backend never reached 
{{dispose()}}. At least for previous runs of the same job, we can know what to 
delete on restart. 

In general, it is very important for this backend to clean up the local FS, 
because the local quota might be very limited compared to the DFS. And a node 
that runs out of local disk space can bring down the whole job, with no way to 
recover (it might always get rescheduled to that node).

  was:
In `RocksDBKeyedStateBackend`, the `instanceBasePath` is cleared on 
`dispose()`. I think it might make sense to also clear this directory when the 
backend is created, in case something crashed and the backend never reached 
`dispose()`. 

In general, it is very important for this backend to clean up the local FS, 
because the local quota might be very limited compared to the DFS. And a node 
that runs out of local disk space can bring down the whole job, with no way to 
recover (it might always get rescheduled to that node).


> Proactively cleanup local FS for RocksDBKeyedStateBackend on startup
> --------------------------------------------------------------------
>
>                 Key: FLINK-6505
>                 URL: https://issues.apache.org/jira/browse/FLINK-6505
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.0
>            Reporter: Stefan Richter
>
> In {{RocksDBKeyedStateBackend}}, the {{instanceBasePath}} is cleared on 
> {{dispose()}}. I think it might make sense to also clear this directory when 
> the backend is created, in case something crashed and the backend never 
> reached {{dispose()}}. At least for previous runs of the same job, we can 
> know what to delete on restart. 
>
> In general, it is very important for this backend to clean up the local FS, 
> because the local quota might be very limited compared to the DFS. And a node 
> that runs out of local disk space can bring down the whole job, with no way 
> to recover (it might always get rescheduled to that node).
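
As an illustration of the proposal above, here is a minimal sketch of "clear the
directory on creation as well as on dispose". It is not the actual Flink code:
the class name StartupCleanupSketch and the method resetDirectory are
hypothetical, and it assumes that the instance directory path is derived
deterministically from the job (so a restarted backend resolves the same path
as the crashed run) and that no other live backend instance shares that path.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of Flink: illustrates the startup cleanup idea only.
final class StartupCleanupSketch {

    /**
     * Deletes everything under {@code dir} (if it exists) and recreates it empty.
     * Intended to be called with the backend's instance base path before the
     * local database is opened, so leftovers from a crashed run are removed.
     */
    static void resetDirectory(Path dir) throws IOException {
        if (Files.exists(dir)) {
            // Symbolic links are not followed; a link is deleted, not its target.
            Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path d, IOException exc)
                        throws IOException {
                    if (exc != null) {
                        throw exc; // surface traversal errors instead of hiding them
                    }
                    Files.delete(d); // directory is empty at this point
                    return FileVisitResult.CONTINUE;
                }
            });
        }
        // Recreate the (now empty) directory so the backend can use it right away.
        Files.createDirectories(dir);
    }
}
```

A backend constructor could call resetDirectory(instanceBasePath) before
opening RocksDB, and dispose() would keep deleting the directory as it does
today. Whether it is safe to delete depends on the assumption that the path
only ever belongs to previous runs of the same job, as noted in the
description.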



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
