[GitHub] [hive] amansinha100 commented on a diff in pull request #3219: [WIP] HIVE-26147 orc raw record merger fix

GitBox Mon, 18 Apr 2022 19:40:25 -0700


amansinha100 commented on code in PR #3219:
URL: https://github.com/apache/hive/pull/3219#discussion_r852545241



##########
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java:
##########
@@ -763,15 +763,20 @@ private KeyInterval discoverOriginalKeyBounds(Reader reader, int bucket,
   }
 
   /**
-   * Find the key range for the split (of the base).  These are used to filter delta files since
-   * both are sorted by key.
+   * Find the key range for the split (of the base) based on the 'hive.acid.key.index' metadata.
+   * These keys are used to filter delta files since both are sorted by key.
+   * If 'hive.acid.key.index' is missing from the ORC file, return null keys (which forces a full read).
    * @param reader the reader
    * @param options the options for reading with
-   * @throws IOException
    */
-  private KeyInterval discoverKeyBounds(Reader reader,
-                                 Reader.Options options) throws IOException {
-    RecordIdentifier[] keyIndex = OrcRecordUpdater.parseKeyIndex(reader);
+  private KeyInterval discoverKeyBounds(Reader reader, Reader.Options options) {
+    final RecordIdentifier[] keyIndex = OrcRecordUpdater.parseKeyIndex(reader);
+    if (keyIndex == null) {
+      LOG.warn("Missing '{}' metadata in ORC acid file, can't compute min/max keys",

Review Comment:
   nit: Instead of "ORC acid file", suggest just "ORC file", since the 
table has the acid property, not the file.
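
   For illustration only, the null-bounds fallback described in the Javadoc, with the suggested message wording, could be sketched as below. This is a standalone sketch, not the Hive code: `KeyBoundsSketch`, `KeyBounds`, and `discoverBounds` are hypothetical names, `long[]` stands in for `RecordIdentifier[]`, and taking the first and last index entries as the bounds is a simplification of how the split's range is really derived.

```java
public class KeyBoundsSketch {

  /** Hypothetical stand-in for KeyInterval: a null bound means "no limit". */
  static final class KeyBounds {
    final Long min;
    final Long max;

    KeyBounds(Long min, Long max) {
      this.min = min;
      this.max = max;
    }

    /** True when neither bound is known, so no delta rows can be filtered out. */
    boolean forcesFullRead() {
      return min == null && max == null;
    }
  }

  /**
   * Assumption: keyIndex is a sorted array of keys, or null when the
   * 'hive.acid.key.index' metadata is missing from the file.
   */
  static KeyBounds discoverBounds(long[] keyIndex) {
    // An empty index is treated like a missing one here (a choice made for this sketch).
    if (keyIndex == null || keyIndex.length == 0) {
      System.err.println(
          "Missing 'hive.acid.key.index' metadata in ORC file, can't compute min/max keys");
      return new KeyBounds(null, null);
    }
    return new KeyBounds(keyIndex[0], keyIndex[keyIndex.length - 1]);
  }

  public static void main(String[] args) {
    KeyBounds missing = discoverBounds(null);
    System.out.println("missing index forces full read: " + missing.forcesFullRead());

    KeyBounds present = discoverBounds(new long[] {10L, 20L, 30L});
    System.out.println("bounds: " + present.min + ".." + present.max);
  }
}
```

   Returning an interval with null bounds, rather than throwing, keeps the read correct at the cost of scanning every delta row, which matches the "forces a full read" note in the Javadoc.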



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


