kuczoram commented on code in PR #6143:
URL: https://github.com/apache/hive/pull/6143#discussion_r2472164496


##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionQueryBuilder.java:
##########
@@ -317,15 +322,20 @@ protected void addTblProperties(StringBuilder query, Map<String, String> tblProp
 
   private void buildAddClauseForAlter(StringBuilder query) {
     if (validWriteIdList == null || dir == null) {
+      LOG.info("There is no delta to be added as partition to the temp external table used by the minor compaction. " +
+          "This may result an empty compaction directory.");
       query.setLength(0);
       return;  // avoid NPEs, don't throw an exception but return an empty query
     }
-    long minWriteID = validWriteIdList.getMinOpenWriteId() == null ? 1 : validWriteIdList.getMinOpenWriteId();
     long highWatermark = validWriteIdList.getHighWatermark();
     List<AcidUtils.ParsedDelta> deltas = dir.getCurrentDirectories().stream().filter(
-            delta -> delta.isDeleteDelta() == isDeleteDelta && delta.getMaxWriteId() <= highWatermark && delta.getMinWriteId() >= minWriteID)
+            delta -> delta.isDeleteDelta() == isDeleteDelta && delta.getMaxWriteId() <= highWatermark)

Review Comment:
   Yeah, you are absolutely right. This is the same misuse of minOpenWriteId as 
in the minor compaction: it is treated as the smallest writeId to be compacted, 
but it is not. The compaction should cover the deltas below minOpenWriteId, so, 
as you wrote, the highWatermark is adjusted accordingly. And yes, if 
minOpenWriteId is not null, it should be greater than highWatermark.
   In the minor compaction, when the output directory is created, the minimum 
writeId is taken from the delta directories, but that is not the case here. 
Thanks for finding this!! I will check and fix that.
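
A minimal sketch of the point above, not code from the PR. It assumes the highWatermark has already been lowered to sit just below minOpenWriteId, as the comment describes. `Delta` is a hypothetical stand-in for `AcidUtils.ParsedDelta`, and the write IDs are made-up sample values. Under those assumptions, the removed `minWriteId >= minOpenWriteId` condition contradicts the `maxWriteId <= highWatermark` condition, so the old filter selects nothing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DeltaFilterSketch {
  // Hypothetical stand-in for AcidUtils.ParsedDelta; only the fields the filter reads.
  static final class Delta {
    final long minWriteId;
    final long maxWriteId;
    final boolean deleteDelta;

    Delta(long minWriteId, long maxWriteId, boolean deleteDelta) {
      this.minWriteId = minWriteId;
      this.maxWriteId = maxWriteId;
      this.deleteDelta = deleteDelta;
    }
  }

  // Filter as it was before the change: also requires minWriteId >= minOpenWriteId.
  static List<Delta> selectDeltasOld(List<Delta> deltas, boolean isDeleteDelta,
                                     long highWatermark, long minOpenWriteId) {
    return deltas.stream()
        .filter(d -> d.deleteDelta == isDeleteDelta
            && d.maxWriteId <= highWatermark
            && d.minWriteId >= minOpenWriteId)
        .collect(Collectors.toList());
  }

  // Filter after the change: only the delta type and the highWatermark bound remain.
  static List<Delta> selectDeltas(List<Delta> deltas, boolean isDeleteDelta,
                                  long highWatermark) {
    return deltas.stream()
        .filter(d -> d.deleteDelta == isDeleteDelta && d.maxWriteId <= highWatermark)
        .collect(Collectors.toList());
  }

  // Made-up sample: delta_1_3, delta_4_5, delete_delta_2_2, delta_8_8.
  static List<Delta> sample() {
    return Arrays.asList(
        new Delta(1, 3, false),
        new Delta(4, 5, false),
        new Delta(2, 2, true),
        new Delta(8, 8, false));
  }

  public static void main(String[] args) {
    long minOpenWriteId = 7L;                 // assumed first open write ID
    long highWatermark = minOpenWriteId - 1;  // assumed adjustment to just below it
    System.out.println("old filter: "
        + selectDeltasOld(sample(), false, highWatermark, minOpenWriteId).size());
    System.out.println("new filter: "
        + selectDeltas(sample(), false, highWatermark).size());
  }
}
```

With this sample, the old filter keeps no insert deltas, while the new one keeps the two that end at or below the highWatermark.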



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

