SourabhBadhya commented on code in PR #4672:
URL: https://github.com/apache/hive/pull/4672#discussion_r1326906833
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -1644,4 +1647,68 @@ public void validatePartSpec(org.apache.hadoop.hive.ql.metadata.Table hmsTable,
       }
     }
   }
+
+  // Metadata delete or a positional delete
+  @Override
+  public boolean shouldTruncate(org.apache.hadoop.hive.ql.metadata.Table hmsTable, Map<String, String> partitionSpec)
+      throws SemanticException {
+    Table table = IcebergTableUtil.getTable(conf, hmsTable.getTTable());
+    if (MapUtils.isEmpty(partitionSpec) || !isPartitionEvolution(table)) {
+      return true;
+    }
+
+    Map<String, PartitionField> partitionFieldMap = Maps.newHashMap();
+    for (PartitionField partField : table.spec().fields()) {
+      partitionFieldMap.put(partField.name(), partField);
+    }
+    Expression finalExp = Expressions.alwaysTrue();
+    for (Map.Entry<String, String> entry : partitionSpec.entrySet()) {
+      String partColName = entry.getKey();
+      if (partitionFieldMap.containsKey(partColName)) {
+        PartitionField partitionField = partitionFieldMap.get(partColName);
+        Type resultType = partitionField.transform().getResultType(table.schema()
+            .findField(partitionField.sourceId()).type());
+        TransformSpec.TransformType transformType = IcebergTableUtil.getTransformType(partitionField.transform());
+        Object value = Conversions.fromPartitionString(resultType, entry.getValue());
+        Iterable iterable = () -> Collections.singletonList(value).iterator();
+        if (transformType.equals(TransformSpec.TransformType.IDENTITY)) {
+          Expression boundPredicate = Expressions.in(partitionField.name(), iterable);
+          finalExp = Expressions.and(finalExp, boundPredicate);
+        } else {
+          throw new SemanticException(
+              String.format("Partition transforms are not supported via truncate operation: %s", partColName));
+        }
+      } else {
+        throw new SemanticException(String.format("No partition column/transform by the name: %s", partColName));
+      }
+    }
+    FindFiles.Builder builder = new FindFiles.Builder(table).withRecordsMatching(finalExp).includeColumnStats();
+    Set<DataFile> dataFiles = Sets.newHashSet(Iterables.transform(builder.collect(), file -> file));
+    boolean result = true;
+    for (DataFile dataFile : dataFiles) {
+      PartitionData partitionData = (PartitionData) dataFile.partition();
+      Expression residual = ResidualEvaluator.of(table.spec(), finalExp, false)
+          .residualFor(partitionData);
+      StrictMetricsEvaluator strictMetricsEvaluator = new StrictMetricsEvaluator(table.schema(), residual);
+      if (!strictMetricsEvaluator.eval(dataFile)) {
+        result = false;
+      }
+    }
+
+    boolean isV2Table = hmsTable.getParameters() != null &&
+        "2".equals(hmsTable.getParameters().get(TableProperties.FORMAT_VERSION));
+    if (!result && !isV2Table) {
+      throw new SemanticException("Truncate conversion to delete is not possible since its not an Iceberg V2 table." +
+          " Consider converting the table to Iceberg's V2 format specification.");
+    }
Review Comment:
We can't move this check to the beginning because at that point we don't yet know whether a metadata delete is possible; that is why we need the `result` variable.
We can still perform a truncate on a V1 table if it resolves to a metadata delete.
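
To illustrate that ordering, here is a minimal sketch (a hypothetical standalone helper, not the PR code) of why the strict-metrics pass has to run before the format-version check: the table's format version only matters once we know that at least one matching file is not fully covered by the filter, i.e. that a metadata-only delete is impossible.

```java
import org.apache.iceberg.DataFile;
import org.apache.iceberg.Table;
import org.apache.iceberg.expressions.Expression;
import org.apache.iceberg.expressions.ResidualEvaluator;
import org.apache.iceberg.expressions.StrictMetricsEvaluator;

final class TruncateDecisionSketch {

  /**
   * Hypothetical helper: returns true when every matching data file is fully
   * covered by the filter, so the truncate can be served as a metadata-only
   * delete (whole-file drops) regardless of the table's format version.
   */
  static boolean metadataDeletePossible(Table table, Expression filter, Iterable<DataFile> dataFiles) {
    for (DataFile dataFile : dataFiles) {
      // Residual of the filter for this file's partition tuple.
      Expression residual = ResidualEvaluator.of(table.spec(), filter, false)
          .residualFor(dataFile.partition());
      // Strict metrics: evaluates to true only if all rows in the file match the residual.
      if (!new StrictMetricsEvaluator(table.schema(), residual).eval(dataFile)) {
        return false;  // file is only partially matched; dropping the whole file is unsafe
      }
    }
    return true;
  }
}
```

Only when this returns false does the format version come into play: a V2 table can fall back to a row-level (positional) delete, while a V1 table has to be rejected with the `SemanticException`, which is why the check cannot be hoisted above the loop.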