klion26 commented on a change in pull request #8263:
[FLINK-12296][StateBackend]Data loss silently in RocksDBStateBackend because of
local directory collision
URL: https://github.com/apache/flink/pull/8263#discussion_r279298112
##########
File path:
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java
##########
@@ -184,17 +189,19 @@ private SnapshotDirectory prepareLocalSnapshotDirectory(long checkpointId) throw
     LocalRecoveryDirectoryProvider directoryProvider = localRecoveryConfig.getLocalStateDirectoryProvider();
     File directory = directoryProvider.subtaskSpecificCheckpointDirectory(checkpointId);
-    if (directory.exists()) {
-        FileUtils.deleteDirectory(directory);
-    }
-
-    if (!directory.mkdirs()) {
+    if (!directory.exists() && !directory.mkdirs()) {
         throw new IOException("Local state base directory for checkpoint " + checkpointId +
             " already exists: " + directory);
     }
     // introduces an extra directory because RocksDB wants a non-existing directory for native checkpoints.
-    File rdbSnapshotDir = new File(directory, "rocks_db");
+    // append operatorIdentifier here to solve directory collision problem when two stateful operators chained in one task.
+    String subDir = operatorIdentifier.replaceAll("[^a-zA-Z0-9\\-]", "_");
Review comment:
I think the pros make sense.
If we use the operator ID as the directory name, we need to extract it from
the operator identifier text. For that, I would add an extraction method to
`OperatorSubtaskDescriptionText`.
On the other hand, I found there is a `backendUID` in
`RocksIncrementalSnapshotStrategy`. How about using `backendUID` as the
directory name? If we think `backendUID` (a `UUID.randomUUID()`) is too long,
we can strip all the '-' characters.
What do you think?
```
operatorID.toString() ---> d1c4061a511396721732590877b90cc6
UUID.randomUUID().toString() ---> 3dff9490-2957-431b-a682-c7cbb995e872
after removing '-' ---> 3dff94902957431ba682c7cbb995e872
```
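To make the two naming options concrete, here is a minimal standalone sketch. It is not code from the PR: the class name `SnapshotDirNames`, the helper names, and the sample identifier string are hypothetical, and the actual format of the operator identifier text may differ. Only the regular expression is taken from the diff above.

```java
import java.util.UUID;

public class SnapshotDirNames {

    // Same expression as in the diff: every character that is not an ASCII
    // letter, a digit, or '-' is replaced with '_'.
    static String sanitize(String operatorIdentifier) {
        return operatorIdentifier.replaceAll("[^a-zA-Z0-9\\-]", "_");
    }

    // The alternative discussed here: a UUID string with every '-' removed.
    static String compactUuid(UUID uuid) {
        return uuid.toString().replace("-", "");
    }

    public static void main(String[] args) {
        // Hypothetical identifier text, used only for illustration.
        String operatorIdentifier = "Map_d1c4061a511396721732590877b90cc6_(1/4)";
        System.out.println(sanitize(operatorIdentifier));

        // The sample UUID from this comment, then a freshly generated one.
        UUID sample = UUID.fromString("3dff9490-2957-431b-a682-c7cbb995e872");
        System.out.println(compactUuid(sample));
        System.out.println(compactUuid(UUID.randomUUID()));
    }
}
```

With the dashes removed, the UUID-based name is 32 hex characters, the same length as the `operatorID.toString()` sample above.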