marton-bod commented on a change in pull request #2161:
URL: https://github.com/apache/hive/pull/2161#discussion_r616608877



##########
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java
##########
@@ -105,13 +105,18 @@ public void commitTask(TaskAttemptContext originalContext) throws IOException {
           .executeWith(tableExecutor)
           .run(output -> {
             Table table = HiveIcebergStorageHandler.table(context.getJobConf(), output);
-            HiveIcebergRecordWriter writer = writers.get(output);
-            DataFile[] closedFiles = writer != null ? writer.dataFiles() : new DataFile[0];
-            String fileForCommitLocation = generateFileForCommitLocation(table.location(), jobConf,
-                attemptID.getJobID(), attemptID.getTaskID().getId());
-
-            // Creating the file containing the data files generated by this task for this table
-            createFileForCommit(closedFiles, fileForCommitLocation, table.io());
+            if (table != null) {

Review comment:
       This happens during task commit, so before the commitInsert hook is called.
   
   The essential problem here is that `OUTPUT_TABLES` contains all the tables, but only the tables relevant to the given task are serialized into the job config. So the loop iterates over tables 1...N while the task may only have access to, say, serialized Table 1 (hence the `if`).



