SourabhBadhya commented on code in PR #5251:
URL: https://github.com/apache/hive/pull/5251#discussion_r1629517880


##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java:
##########
@@ -136,6 +145,7 @@ public void commitTask(TaskAttemptContext originalContext) throws IOException {
     ExecutorService tableExecutor = tableExecutor(jobConf, outputs.size());
     try {
       // Generates commit files for the target tables in parallel
+      Collection<Path> finalMergedPaths = new ConcurrentLinkedQueue<>(mergedPaths);

Review Comment:
   1. Modified it to a list. Done.
   2. This is done to ensure that the information about the input files used 
for the merge is retained. It is null in most cases; it is only set for merge.



##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java:
##########
@@ -459,9 +474,14 @@ private void commitTable(FileIO io, ExecutorService executor, OutputTable output
       deleteFiles.addAll(writeResults.deleteFiles());
       replacedDataFiles.addAll(writeResults.replacedDataFiles());
       referencedDataFiles.addAll(writeResults.referencedDataFiles());
+      mergedAndDeletedFiles.addAll(writeResults.mergedAndDeletedFiles());
     }
 
-    FilesForCommit filesForCommit = new FilesForCommit(dataFiles, deleteFiles, replacedDataFiles, referencedDataFiles);
+    dataFiles.removeIf(dataFile -> mergedAndDeletedFiles.contains(new Path(String.valueOf(dataFile.path()))));

Review Comment:
   While writing data, there are multiple `jobContexts`. Files written in one 
`jobContext` can act as the input files for a merge task in another 
`jobContext`. Hence, to resolve them, it's done during the commit phase.
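
The commit-time filtering described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `CommitFilterSketch`, `ContextResult`, and `filesToCommit` are hypothetical names, file paths are plain strings rather than Iceberg `DataFile` or Hadoop `Path` objects, and the real `commitTable` does considerably more.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; these are not the classes used in HiveIcebergOutputCommitter.
public class CommitFilterSketch {

  // Illustrative stand-in for the write results reported by one jobContext.
  public static class ContextResult {
    final List<String> dataFiles;
    final List<String> mergedAndDeletedFiles;

    public ContextResult(List<String> dataFiles, List<String> mergedAndDeletedFiles) {
      this.dataFiles = dataFiles;
      this.mergedAndDeletedFiles = mergedAndDeletedFiles;
    }
  }

  // Collects files from every context first, then drops data files that
  // some context reported as already consumed by a merge.
  public static List<String> filesToCommit(List<ContextResult> results) {
    List<String> dataFiles = new ArrayList<>();
    Set<String> mergedAndDeleted = new HashSet<>();
    for (ContextResult result : results) {
      dataFiles.addAll(result.dataFiles);
      mergedAndDeleted.addAll(result.mergedAndDeletedFiles);
    }
    // Filtering only after the loop matters: the merge inputs may have been
    // written by a different context than the one that merged them.
    dataFiles.removeIf(mergedAndDeleted::contains);
    return dataFiles;
  }

  public static void main(String[] args) {
    ContextResult writer = new ContextResult(
        List.of("part-0.parquet", "part-1.parquet", "part-2.parquet"), List.of());
    ContextResult merger = new ContextResult(
        List.of("merged-0.parquet"), List.of("part-0.parquet", "part-1.parquet"));
    // prints [part-2.parquet, merged-0.parquet]
    System.out.println(filesToCommit(List.of(writer, merger)));
  }
}
```

Filtering inside the loop, per context, would miss `part-0.parquet` and `part-1.parquet` in this example, because the context that wrote them does not know they were later merged.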



##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java:
##########
@@ -203,7 +204,7 @@ public static long parseFilePosition(Record rec) {
     return rec.get(FILE_READ_META_COLS.get(MetadataColumns.ROW_POSITION), Long.class);
   }
 
-  public static long computeHash(StructProjection struct) {
+  public static long computeHash(StructLike struct) {

Review Comment:
   Removed. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

