prateekm commented on code in PR #1676: URL: https://github.com/apache/samza/pull/1676#discussion_r1288932588
########## samza-core/src/main/java/org/apache/samza/storage/blobstore/BlobStoreRestoreManager.java: ########## @@ -254,8 +263,10 @@ static CompletableFuture<Void> restoreStores(String jobName, String jobId, TaskN throw new SamzaException(String.format("Error deleting store directory: %s", storeDir), e); } + // If getDeletedBlobs is enabled - always restore so that we get all the blobs, including the deleted blobs, + // immediately restore it locally and backup to create new checkpoint. boolean shouldRestore = shouldRestore(taskName.getTaskName(), storeName, dirIndex, - storeCheckpointDir, storageConfig, dirDiffUtil); + storeCheckpointDir, storageConfig, dirDiffUtil) || getDeletedBlobs; Review Comment: Yes, makes sense. To close the loop: Snapshot index blob is deleted after all no-longer-needed file blobs in it are deleted. We are using the failure to get snapshot index in init with getDeletedBlob = false as the signal that some of its contained blobs are missing and the entire store needs to be restored. This is also why the higher precedence for getDeletedBlobs is required. For the rare race condition where file blobs were deleted but snapshot index wasn't yet, restore manager init will work, restore will work, but a subsequent commit will fail and restart the container. By that time, the SnapshotIndex should be deleted by the orphaned container. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org