[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-11-23 Thread bowenli86
Github user bowenli86 closed the pull request at:

https://github.com/apache/flink/pull/4798


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-22 Thread bowenli86
Github user bowenli86 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4798#discussion_r146167573
  
--- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
@@ -235,26 +235,16 @@ public RocksDBKeyedStateBackend(
this.instanceBasePath = 
Preconditions.checkNotNull(instanceBasePath);
this.instanceRocksDBPath = new File(instanceBasePath, "db");
 
-   // Clear this directory when the backend is created
+   // Clear the base directory when the backend is created
// in case something crashed and the backend never reached 
dispose()
-   cleanInstanceBasePath();
-
-   if (!instanceBasePath.exists()) {
+   if (instanceBasePath.exists()) {
+   cleanInstanceBasePath();
--- End diff --

my bad... I'll break them into two `if` to make sure the `instanceBasePath` 
will always be created


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-22 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/4798#discussion_r146166380
  
--- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
@@ -235,26 +235,16 @@ public RocksDBKeyedStateBackend(
this.instanceBasePath = 
Preconditions.checkNotNull(instanceBasePath);
this.instanceRocksDBPath = new File(instanceBasePath, "db");
 
-   // Clear this directory when the backend is created
+   // Clear the base directory when the backend is created
// in case something crashed and the backend never reached 
dispose()
-   cleanInstanceBasePath();
-
-   if (!instanceBasePath.exists()) {
+   if (instanceBasePath.exists()) {
+   cleanInstanceBasePath();
--- End diff --

Unfortunately, this looks incorrect now, because the directory is actually 
deleted and the `else` branch is not hit. There is also a method to just clear 
directory content, but then different methods should be used here and in 
dispose (which can actually delete the directory).

Nit: could merge `else`+ `if` into `else if`.


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-22 Thread bowenli86
Github user bowenli86 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4798#discussion_r146165645
  
--- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
@@ -235,6 +235,10 @@ public RocksDBKeyedStateBackend(
this.instanceBasePath = 
Preconditions.checkNotNull(instanceBasePath);
this.instanceRocksDBPath = new File(instanceBasePath, "db");
 
+   // Clear this directory when the backend is created
+   // in case something crashed and the backend never reached 
dispose()
+   cleanInstanceBasePath();
+
if (!instanceBasePath.exists()) {
--- End diff --

You are right


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-22 Thread bowenli86
Github user bowenli86 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4798#discussion_r146165655
  
--- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
@@ -313,10 +317,16 @@ public void dispose() {
IOUtils.closeQuietly(dbOptions);
IOUtils.closeQuietly(columnOptions);
 
+   cleanInstanceBasePath();
--- End diff --

addressed


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-22 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/4798#discussion_r146161007
  
--- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
@@ -313,10 +317,16 @@ public void dispose() {
IOUtils.closeQuietly(dbOptions);
IOUtils.closeQuietly(columnOptions);
 
+   cleanInstanceBasePath();
--- End diff --

This again would not need the existence check that runs inside the method, 
because at this point the directory should always exist.


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-22 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/4798#discussion_r146160907
  
--- Diff: 
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
 ---
@@ -235,6 +235,10 @@ public RocksDBKeyedStateBackend(
this.instanceBasePath = 
Preconditions.checkNotNull(instanceBasePath);
this.instanceRocksDBPath = new File(instanceBasePath, "db");
 
+   // Clear this directory when the backend is created
+   // in case something crashed and the backend never reached 
dispose()
+   cleanInstanceBasePath();
+
if (!instanceBasePath.exists()) {
--- End diff --

nit: you could already integrate the deletion with this existence check, 
otherwise the check is often executed twice for no good reason.


---


[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...

2017-10-11 Thread bowenli86
GitHub user bowenli86 opened a pull request:

https://github.com/apache/flink/pull/4798

[FLINK-6505] Proactively cleanup local FS for RocksDBKeyedStateBackend on 
startup

## What is the purpose of the change

In `RocksDBKeyedStateBackend`, the `instanceBasePath` is cleared on 
`dispose()`. It also make sense to also clear this directory when the backend 
is created, in case something crashed and the backend never reached 
`dispose()`. At least for previous runs of the same job, we can know what to 
delete on restart. 

In general, it is very important for this backend to clean up the local FS, 
because the local quota might be very limited compared to the DFS. And a node 
that runs out of local disk space can bring down the whole job, with no way to 
recover (it might always get rescheduled to that node).

## Brief change log

clear `instanceBasePath` when `RocksDBKeyedStateBackend ` is created

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

none

## Documentation

none

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bowenli86/flink FLINK-6505

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4798.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4798


commit 3c4ea092759f2df052cc3dc02403d041f4c16b2d
Author: Bowen Li 
Date:   2017-10-10T05:31:17Z

[FLIN-6505] Proactively cleanup local FS for RocksDBKeyedStateBackend on 
startup

commit 22a761736d43114fb5b935d53df65bcf3832f02d
Author: Bowen Li 
Date:   2017-10-11T10:41:24Z

add comment




---