StefanRRichter commented on a change in pull request #8263: 
[FLINK-12296][StateBackend]Data loss silently in RocksDBStateBackend because of 
local directory collision
URL: https://github.com/apache/flink/pull/8263#discussion_r279277040
 
 

 ##########
 File path: 
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/snapshot/RocksIncrementalSnapshotStrategy.java
 ##########
 @@ -184,17 +189,19 @@ private SnapshotDirectory 
prepareLocalSnapshotDirectory(long checkpointId) throw
                        LocalRecoveryDirectoryProvider directoryProvider = 
localRecoveryConfig.getLocalStateDirectoryProvider();
                        File directory = 
directoryProvider.subtaskSpecificCheckpointDirectory(checkpointId);
 
-                       if (directory.exists()) {
-                               FileUtils.deleteDirectory(directory);
-                       }
-
-                       if (!directory.mkdirs()) {
+                       if (!directory.exists() && !directory.mkdirs()) {
                                throw new IOException("Local state base 
directory for checkpoint " + checkpointId +
                                        " already exists: " + directory);
                        }
 
                        // introduces an extra directory because RocksDB wants 
a non-existing directory for native checkpoints.
-                       File rdbSnapshotDir = new File(directory, "rocks_db");
+                       // append operatorIdentifier here to solve directory 
collision problem when two stateful operators chained in one task.
+                       String subDir = 
operatorIdentifier.replaceAll("[^a-zA-Z0-9\\-]", "_");
 
 Review comment:
   Thinking about it a bit more, I wonder if it is a good idea to the the whole 
operator identifier string. It might be better to just use the operator id.
   
   Pros:
   - No need to handle illegal characters.
   - The description string can become quiete long. Given that the dir name is 
also already long, we can avoid running into any cases that paths go beyond the 
limits of certain filessystems.
   
   Cons:
   - This string is a bit more descriptive.
   
   What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to