[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user bowenli86 closed the pull request at: https://github.com/apache/flink/pull/4798 ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user bowenli86 commented on a diff in the pull request: https://github.com/apache/flink/pull/4798#discussion_r146167573 --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java --- @@ -235,26 +235,16 @@ public RocksDBKeyedStateBackend( this.instanceBasePath = Preconditions.checkNotNull(instanceBasePath); this.instanceRocksDBPath = new File(instanceBasePath, "db"); - // Clear this directory when the backend is created + // Clear the base directory when the backend is created // in case something crashed and the backend never reached dispose() - cleanInstanceBasePath(); - - if (!instanceBasePath.exists()) { + if (instanceBasePath.exists()) { + cleanInstanceBasePath(); --- End diff -- my bad... I'll break them into two `if` to make sure the `instanceBasePath` will always be created ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user StefanRRichter commented on a diff in the pull request: https://github.com/apache/flink/pull/4798#discussion_r146166380 --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java --- @@ -235,26 +235,16 @@ public RocksDBKeyedStateBackend( this.instanceBasePath = Preconditions.checkNotNull(instanceBasePath); this.instanceRocksDBPath = new File(instanceBasePath, "db"); - // Clear this directory when the backend is created + // Clear the base directory when the backend is created // in case something crashed and the backend never reached dispose() - cleanInstanceBasePath(); - - if (!instanceBasePath.exists()) { + if (instanceBasePath.exists()) { + cleanInstanceBasePath(); --- End diff -- Unfortunately, this looks incorrect now, because the directory is actually deleted and the `else` branch is not hit. There is also a method to just clear directory content, but then different methods should be used here and in dispose (which can actually delete the directory). Nit: could merge `else`+ `if` into `else if`. ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user bowenli86 commented on a diff in the pull request: https://github.com/apache/flink/pull/4798#discussion_r146165645 --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java --- @@ -235,6 +235,10 @@ public RocksDBKeyedStateBackend( this.instanceBasePath = Preconditions.checkNotNull(instanceBasePath); this.instanceRocksDBPath = new File(instanceBasePath, "db"); + // Clear this directory when the backend is created + // in case something crashed and the backend never reached dispose() + cleanInstanceBasePath(); + if (!instanceBasePath.exists()) { --- End diff -- You are right ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user bowenli86 commented on a diff in the pull request: https://github.com/apache/flink/pull/4798#discussion_r146165655 --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java --- @@ -313,10 +317,16 @@ public void dispose() { IOUtils.closeQuietly(dbOptions); IOUtils.closeQuietly(columnOptions); + cleanInstanceBasePath(); --- End diff -- addressed ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user StefanRRichter commented on a diff in the pull request: https://github.com/apache/flink/pull/4798#discussion_r146161007 --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java --- @@ -313,10 +317,16 @@ public void dispose() { IOUtils.closeQuietly(dbOptions); IOUtils.closeQuietly(columnOptions); + cleanInstanceBasePath(); --- End diff -- This again would not need the existence check that runs inside the method, because at this point the directory should always exist. ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
Github user StefanRRichter commented on a diff in the pull request: https://github.com/apache/flink/pull/4798#discussion_r146160907 --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java --- @@ -235,6 +235,10 @@ public RocksDBKeyedStateBackend( this.instanceBasePath = Preconditions.checkNotNull(instanceBasePath); this.instanceRocksDBPath = new File(instanceBasePath, "db"); + // Clear this directory when the backend is created + // in case something crashed and the backend never reached dispose() + cleanInstanceBasePath(); + if (!instanceBasePath.exists()) { --- End diff -- nit: you could already integrate the deletion with this existence check, otherwise the check is often executed twice for no good reason. ---
[GitHub] flink pull request #4798: [FLINK-6505] Proactively cleanup local FS for Rock...
GitHub user bowenli86 opened a pull request: https://github.com/apache/flink/pull/4798 [FLINK-6505] Proactively cleanup local FS for RocksDBKeyedStateBackend on startup ## What is the purpose of the change In `RocksDBKeyedStateBackend`, the `instanceBasePath` is cleared on `dispose()`. It also make sense to also clear this directory when the backend is created, in case something crashed and the backend never reached `dispose()`. At least for previous runs of the same job, we can know what to delete on restart. In general, it is very important for this backend to clean up the local FS, because the local quota might be very limited compared to the DFS. And a node that runs out of local disk space can bring down the whole job, with no way to recover (it might always get rescheduled to that node). ## Brief change log clear `instanceBasePath` when `RocksDBKeyedStateBackend ` is created ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: none ## Documentation none You can merge this pull request into a Git repository by running: $ git pull https://github.com/bowenli86/flink FLINK-6505 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4798.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4798 commit 3c4ea092759f2df052cc3dc02403d041f4c16b2d Author: Bowen LiDate: 2017-10-10T05:31:17Z [FLIN-6505] Proactively cleanup local FS for RocksDBKeyedStateBackend on startup commit 22a761736d43114fb5b935d53df65bcf3832f02d Author: Bowen Li Date: 2017-10-11T10:41:24Z add comment ---