[
https://issues.apache.org/jira/browse/GOBBLIN-1825?focusedWorklogId=859339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-859339
]
ASF GitHub Bot logged work on GOBBLIN-1825:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 27/Apr/23 07:53
Start Date: 27/Apr/23 07:53
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3687:
URL: https://github.com/apache/gobblin/pull/3687#discussion_r1178754124
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/version/HiveDatasetVersionCleaner.java:
##########
@@ -88,20 +88,24 @@ public void clean() throws IOException {
Partition partition = hiveDatasetVersion.getPartition();
try {
if (!cleanableHiveDataset.isSimulate()) {
+ if (cleanableHiveDataset.isShouldDeleteData()) {
+
cleanableHiveDataset.getFsCleanableHelper().clean(hiveDatasetVersion,
possiblyEmptyDirectories);
+ }
client.get().dropPartition(partition.getTable().getDbName(),
partition.getTable().getTableName(), partition.getValues(), false);
log.info("Successfully dropped partition " +
partition.getCompleteName());
} else {
log.info("Simulating drop partition " + partition.getCompleteName());
}
- if (cleanableHiveDataset.isShouldDeleteData()) {
-
cleanableHiveDataset.getFsCleanableHelper().clean(hiveDatasetVersion,
possiblyEmptyDirectories);
- }
} catch (TException | IOException e) {
log.warn(String.format("Failed to completely delete partition %s.",
partition.getCompleteName()), e);
throw new IOException(e);
}
+ } try {
Review Comment:
a `try` should begin on its own line
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/version/HiveDatasetVersionCleaner.java:
##########
@@ -88,20 +88,24 @@ public void clean() throws IOException {
Partition partition = hiveDatasetVersion.getPartition();
try {
if (!cleanableHiveDataset.isSimulate()) {
+ if (cleanableHiveDataset.isShouldDeleteData()) {
+
cleanableHiveDataset.getFsCleanableHelper().clean(hiveDatasetVersion,
possiblyEmptyDirectories);
+ }
client.get().dropPartition(partition.getTable().getDbName(),
partition.getTable().getTableName(), partition.getValues(), false);
Review Comment:
since this is tricky enough - to the extent that we're here fixing a logical
bug - please document w/ a comment that order really is critical (and why
that's so)
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/retention/version/HiveDatasetVersionCleaner.java:
##########
@@ -88,20 +88,24 @@ public void clean() throws IOException {
Partition partition = hiveDatasetVersion.getPartition();
try {
if (!cleanableHiveDataset.isSimulate()) {
+ if (cleanableHiveDataset.isShouldDeleteData()) {
+
cleanableHiveDataset.getFsCleanableHelper().clean(hiveDatasetVersion,
possiblyEmptyDirectories);
Review Comment:
are we certain this is an idempotent operation that won't throw an exception
the second and more time?
scenario:
1. the first time through it was deleted before dropping of the partition
failed
2. upon retry, before again trying to drop the partition, we invoke this
once more to delete a FS location that was already removed
Issue Time Tracking
-------------------
Worklog Id: (was: 859339)
Time Spent: 50m (was: 40m)
> Hive retention job should fail if deleting underlying files fail
> ----------------------------------------------------------------
>
> Key: GOBBLIN-1825
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1825
> Project: Apache Gobblin
> Issue Type: New Feature
> Reporter: Meeth Gala
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)