[
https://issues.apache.org/jira/browse/GOBBLIN-1825?focusedWorklogId=858758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-858758
]
ASF GitHub Bot logged work on GOBBLIN-1825:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 24/Apr/23 18:14
Start Date: 24/Apr/23 18:14
Worklog Time Spent: 10m
Work Description: Will-Lo commented on code in PR #3687:
URL: https://github.com/apache/gobblin/pull/3687#discussion_r1175633516
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/version/HiveDatasetVersionCleaner.java:
##########
@@ -87,21 +87,25 @@ public void clean() throws IOException {
try (AutoReturnableObject<IMetaStoreClient> client =
cleanableHiveDataset.getClientPool().getClient()) {
Partition partition = hiveDatasetVersion.getPartition();
try {
+ if (cleanableHiveDataset.isShouldDeleteData()) {
+   cleanableHiveDataset.getFsCleanableHelper().clean(hiveDatasetVersion, possiblyEmptyDirectories);
+ }
Review Comment:
If we are doing this before dropping the partitions, then we should also
check whether the flow is a simulate flow. Otherwise we will delete the
underlying data even though the partition drop is only simulated. I also think
this was a bug in the existing code: it would delete the underlying data in
simulate mode as well.
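The ordering concern above can be illustrated with a minimal, self-contained
sketch. It is not the Gobblin implementation: the class SimulateAwareCleaner,
the Settings holder, and the recorded action strings are hypothetical
stand-ins, and the sketch only assumes that the dataset exposes a simulate
flag alongside the shouldDeleteData flag seen in the diff.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the Gobblin code base.
class SimulateAwareCleaner {

  /** Stand-in for the dataset flags discussed in the review comment. */
  static final class Settings {
    final boolean simulate;
    final boolean shouldDeleteData;

    Settings(boolean simulate, boolean shouldDeleteData) {
      this.simulate = simulate;
      this.shouldDeleteData = shouldDeleteData;
    }
  }

  /**
   * Returns the actions that would be taken for one partition, in order.
   * In simulate mode nothing destructive is recorded: neither the data
   * deletion nor the partition drop.
   */
  static List<String> clean(Settings settings, String partitionName) {
    List<String> actions = new ArrayList<>();

    if (settings.simulate) {
      // Check simulate first so the underlying data is never touched.
      actions.add("simulate drop " + partitionName);
      return actions;
    }

    if (settings.shouldDeleteData) {
      // Data deletion runs before the partition drop, as in the diff.
      actions.add("delete data " + partitionName);
    }
    actions.add("drop partition " + partitionName);
    return actions;
  }

  public static void main(String[] args) {
    System.out.println(clean(new Settings(true, true), "p1"));
    System.out.println(clean(new Settings(false, true), "p1"));
    System.out.println(clean(new Settings(false, false), "p1"));
  }
}
```

Checking the simulate flag before either destructive step is one possible
arrangement; guarding only the data deletion with the same flag would address
the comment equally well.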
Issue Time Tracking
-------------------
Worklog Id: (was: 858758)
Time Spent: 20m (was: 10m)
> Hive retention job should fail if deleting underlying files fail
> ----------------------------------------------------------------
>
> Key: GOBBLIN-1825
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1825
> Project: Apache Gobblin
> Issue Type: New Feature
> Reporter: Meeth Gala
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>