difin commented on code in PR #3559:
URL: https://github.com/apache/hive/pull/3559#discussion_r972026124


##########
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java:
##########
@@ -104,28 +103,44 @@ public OrcSplit(Path path, Object fileId, long offset, long length, String[] hos
     this.isOriginal = isOriginal;
     this.hasBase = hasBase;
     this.rootDir = rootDir;
-    this.deltas.addAll(filterDeltasByBucketId(deltas, AcidUtils.parseBucketId(path)));
+    int bucketId = AcidUtils.parseBucketId(path);
+    long minWriteId = !deltas.isEmpty() ?
+            AcidUtils.parseBaseOrDeltaBucketFilename(path, null).getMinimumWriteId() : -1;
+    this.deltas.addAll(
+            deltas.stream()
+            .filter(delta -> isQualifiedDeleteDeltasByWriteIds(delta, minWriteId))

Review Comment:
   I originally thought a method with a self-describing name would be more 
   readable, but I agree that now that the method has become very small, it is 
   better to do as you suggested. I did that and added a comment above the 
   code to explain it.
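
   For readers following the thread, the trade-off can be sketched roughly as below. This is only an illustration, not the code in the PR: the `Delta` class, the method names, and the filtering rule (keep a delete delta only if its maximum write ID is at least the split's minimum write ID) are all assumptions made for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

public class DeltaFilterSketch {

    // Hypothetical stand-in for a delete-delta descriptor; not Hive's real class.
    static final class Delta {
        final long minWriteId;
        final long maxWriteId;

        Delta(long minWriteId, long maxWriteId) {
            this.minWriteId = minWriteId;
            this.maxWriteId = maxWriteId;
        }
    }

    // Variant A: a named helper whose name documents the intent.
    // The rule itself is an assumption made for this sketch.
    static boolean isQualifiedByWriteIds(Delta delta, long minWriteId) {
        return delta.maxWriteId >= minWriteId;
    }

    static List<Delta> filterWithHelper(List<Delta> deltas, long minWriteId) {
        return deltas.stream()
                .filter(delta -> isQualifiedByWriteIds(delta, minWriteId))
                .collect(Collectors.toList());
    }

    // Variant B: the predicate is inlined and a comment carries the explanation.
    static List<Delta> filterInline(List<Delta> deltas, long minWriteId) {
        return deltas.stream()
                // Assumed rule: a delete delta whose highest write ID is below
                // the split's lowest write ID cannot affect rows in the split.
                .filter(delta -> delta.maxWriteId >= minWriteId)
                .collect(Collectors.toList());
    }
}
```

   Both variants return the same result. Once the predicate is a single comparison, the inlined form with a comment keeps the logic next to the stream that uses it.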



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

