Will-Lo commented on code in PR #3687:
URL: https://github.com/apache/gobblin/pull/3687#discussion_r1175633516
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/version/HiveDatasetVersionCleaner.java:
##########
@@ -87,21 +87,25 @@ public void clean() throws IOException {
try (AutoReturnableObject<IMetaStoreClient> client =
cleanableHiveDataset.getClientPool().getClient()) {
Partition partition = hiveDatasetVersion.getPartition();
try {
+ if (cleanableHiveDataset.isShouldDeleteData()) {
+ cleanableHiveDataset.getFsCleanableHelper().clean(hiveDatasetVersion, possiblyEmptyDirectories);
+ }
Review Comment:
Since this now runs before the partitions are dropped, it should also check
whether the flow is a simulate flow. Otherwise we delete the underlying data
before merely simulating the partition drop. I think this was also a bug in
the existing code: it would delete the underlying data even when the flow was
a simulate flow.
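
A minimal sketch of the guard I have in mind, not the actual patch. The class, method, and action strings below are hypothetical stand-ins, not Gobblin APIs. In the real cleaner the check would presumably read the dataset's simulate flag before calling the FS helper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the cleaner logic; none of these names exist in Gobblin.
class SimulateGuardSketch {

  /** Returns the actions the cleaner would take, in the order it takes them. */
  static List<String> clean(boolean simulate, boolean shouldDeleteData) {
    List<String> actions = new ArrayList<>();
    if (simulate) {
      // Simulate flow: only report what would happen, touch neither data nor metadata.
      if (shouldDeleteData) {
        actions.add("SIMULATE delete data");
      }
      actions.add("SIMULATE drop partition");
      return actions;
    }
    // Real flow: data deletion (if enabled) happens before the partition drop.
    if (shouldDeleteData) {
      actions.add("delete data");
    }
    actions.add("drop partition");
    return actions;
  }

  public static void main(String[] args) {
    System.out.println(clean(true, true));
    System.out.println(clean(false, true));
  }
}
```

The point is that the simulate check wraps both steps, so moving the data deletion ahead of the partition drop cannot cause a simulate run to remove files.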