deniskuzZ commented on code in PR #6143:
URL: https://github.com/apache/hive/pull/6143#discussion_r2468903768


##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionQueryBuilder.java:
##########
@@ -317,15 +322,20 @@ protected void addTblProperties(StringBuilder query, 
Map<String, String> tblProp
 
   private void buildAddClauseForAlter(StringBuilder query) {
     if (validWriteIdList == null || dir == null) {
+      LOG.info("There is no delta to be added as partition to the temp 
external table used by the minor compaction. " +
+          "This may result an empty compaction directory.");
       query.setLength(0);
       return;  // avoid NPEs, don't throw an exception but return an empty 
query
     }
-    long minWriteID = validWriteIdList.getMinOpenWriteId() == null ? 1 : 
validWriteIdList.getMinOpenWriteId();
     long highWatermark = validWriteIdList.getHighWatermark();
     List<AcidUtils.ParsedDelta> deltas = 
dir.getCurrentDirectories().stream().filter(
-            delta -> delta.isDeleteDelta() == isDeleteDelta && 
delta.getMaxWriteId() <= highWatermark && delta.getMinWriteId() >= minWriteID)
+            delta -> delta.isDeleteDelta() == isDeleteDelta && 
delta.getMaxWriteId() <= highWatermark)

Review Comment:
   if we remove the minOpenWriteId, we won't compact inflight writes. 
ValidCompactorWriteIdList doest the trick
   
   tblValidWriteIds = 
TxnUtils.createValidCompactWriteIdList(msc.getValidWriteIds(
             Collections.singletonList(fullTableName), 
              validTxnList.writeToString()).get(0));
   ````
   if (tableValidWriteIds.isSetMinOpenWriteId()) {
     long minOpenWriteId = tableValidWriteIds.getMinOpenWriteId();
     return new ValidCompactorWriteIdList(fullTableName, exceptions, bitSet, 
minOpenWriteId - 1, minOpenWriteId);
   } else {
     return new ValidCompactorWriteIdList(fullTableName, exceptions, bitSet, 
tableValidWriteIds.getWriteIdHighWaterMark());
   }
   ````
   so basically highWatermark is already capped by minOpenWriteId - 1. 
   
   similar code from AbortedTxnCleaner
   ````
    info.highestWriteId = Math.min(
           isNull(validWriteIdList.getMinOpenWriteId()) ?
               Long.MAX_VALUE : validWriteIdList.getMinOpenWriteId() - 1,
           validWriteIdList.getHighWatermark());
   ```` 
   
   1 thing i am still not sure is this: MergeCompactor#getOutputDirPath
   long minOpenWriteId = writeIds.getMinOpenWriteId() == null ? 1 : 
writeIds.getMinOpenWriteId();
   
   minOpenWriteId > highWatermark, unless it's not null, right?
   
   i think that is wrong and should be derived from actual compacted deltas
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to