SourabhBadhya commented on code in PR #5076:
URL: https://github.com/apache/hive/pull/5076#discussion_r1505593646
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergInputFormat.java:
##########
@@ -178,7 +178,7 @@ public RecordReader<Void, Container<Record>> getRecordReader(InputSplit split, J
@Override
public boolean shouldSkipCombine(Path path, Configuration conf) {
- return true;
+ return false;
Review Comment:
This method is called while splits are generated in CombineRecordReader,
that is, during execution of merge tasks. It acts as a flag that tells whether
the file format supports merging: returning `false` means combining is not
skipped. Setting `hive.merge.tezfiles=false` still disables the merge
functionality.
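
To illustrate the interaction between the per-format flag and the global switch, here is a minimal, self-contained sketch. It is a toy model, not Hive's actual split-generation code: `SkipCombineSketch`, `CombinePolicy`, and `planSplits` are hypothetical names, paths are plain strings, and the configuration is a plain map. The only assumption carried over from the comment above is that a `false` return from `shouldSkipCombine` lets files be combined, while `hive.merge.tezfiles=false` turns merging off regardless.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SkipCombineSketch {

  /** Hypothetical stand-in for the per-format flag; not the Hive interface itself. */
  public interface CombinePolicy {
    boolean shouldSkipCombine(String path, Map<String, String> conf);
  }

  /**
   * Toy split planner: paths whose format allows combining share one split,
   * paths that must skip combining each get a split of their own.
   */
  public static List<List<String>> planSplits(
      List<String> paths, Map<String, String> conf, CombinePolicy policy) {
    List<List<String>> splits = new ArrayList<>();

    // Global switch wins: with merging disabled, no merge work is planned at all.
    // Treating a missing key as "false" is an assumption made for this sketch.
    if (!Boolean.parseBoolean(conf.getOrDefault("hive.merge.tezfiles", "false"))) {
      return splits;
    }

    List<String> combined = new ArrayList<>();
    for (String path : paths) {
      if (policy.shouldSkipCombine(path, conf)) {
        splits.add(List.of(path)); // format opted out: keep the file on its own
      } else {
        combined.add(path);        // format supports merging: group with others
      }
    }
    if (!combined.isEmpty()) {
      splits.add(combined);
    }
    return splits;
  }
}
```

In this model, a policy that always returns `false` (as in the change above) yields a single combined split when merging is enabled, and an empty plan when `hive.merge.tezfiles` is `false`.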
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java:
##########
@@ -163,6 +167,26 @@ public void commitTask(TaskAttemptContext originalContext) throws IOException {
LOG.info("CommitTask found no serialized table in config for table: {}.", output);
}
}, IOException.class);
+
+ // Merge task has merged several files into one. Hence we need to remove the stale files.
Review Comment:
Done.