kuczoram commented on code in PR #6143:
URL: https://github.com/apache/hive/pull/6143#discussion_r2472164496
##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionQueryBuilder.java:
##########
@@ -317,15 +322,20 @@ protected void addTblProperties(StringBuilder query, Map<String, String> tblProp
   private void buildAddClauseForAlter(StringBuilder query) {
     if (validWriteIdList == null || dir == null) {
+      LOG.info("There is no delta to be added as partition to the temp external table used by the minor compaction. " +
+          "This may result an empty compaction directory.");
       query.setLength(0);
       return; // avoid NPEs, don't throw an exception but return an empty query
     }
-    long minWriteID = validWriteIdList.getMinOpenWriteId() == null ? 1 : validWriteIdList.getMinOpenWriteId();
     long highWatermark = validWriteIdList.getHighWatermark();
     List<AcidUtils.ParsedDelta> deltas = dir.getCurrentDirectories().stream().filter(
-        delta -> delta.isDeleteDelta() == isDeleteDelta && delta.getMaxWriteId() <= highWatermark && delta.getMinWriteId() >= minWriteID)
+        delta -> delta.isDeleteDelta() == isDeleteDelta && delta.getMaxWriteId() <= highWatermark)
Review Comment:
Yes, you are absolutely right. This is the same misuse of minOpenWriteId as
in the minor compaction: it is treated as the smallest writeId to be compacted,
but that is not what it is. The compaction should cover the deltas below the
minOpenWriteId, so, as you wrote, the highWatermark is adjusted accordingly.
And yes, it should be minOpenWriteId > highWatermark if it is not null.
In the minor compaction, the minimum writeId is taken from the delta
directories when the output directory is created, but that is not the case
here. Thanks for finding this! I will check and fix it.
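
To illustrate why the old lower bound is wrong, here is a minimal, self-contained sketch. It does not use the real Hive classes: `Delta`, `oldFilter`, `newFilter`, and the write ID values are hypothetical stand-ins, and it assumes the highWatermark has already been adjusted to sit below the minOpenWriteId.

```java
import java.util.List;
import java.util.stream.Collectors;

public class DeltaFilterSketch {
  // Hypothetical stand-in for AcidUtils.ParsedDelta; only the write ID range is modeled.
  static final class Delta {
    final long minWriteId;
    final long maxWriteId;

    Delta(long minWriteId, long maxWriteId) {
      this.minWriteId = minWriteId;
      this.maxWriteId = maxWriteId;
    }
  }

  // Old behavior: minOpenWriteId is (mis)used as the smallest write ID to compact.
  static List<Delta> oldFilter(List<Delta> deltas, Long minOpenWriteId, long highWatermark) {
    long minWriteId = minOpenWriteId == null ? 1 : minOpenWriteId;
    return deltas.stream()
        .filter(d -> d.maxWriteId <= highWatermark && d.minWriteId >= minWriteId)
        .collect(Collectors.toList());
  }

  // New behavior: only the highWatermark bounds the deltas to compact.
  static List<Delta> newFilter(List<Delta> deltas, long highWatermark) {
    return deltas.stream()
        .filter(d -> d.maxWriteId <= highWatermark)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed example: three committed deltas, one open write ID (4) above them.
    List<Delta> deltas = List.of(new Delta(1, 1), new Delta(2, 2), new Delta(3, 3));
    Long minOpenWriteId = 4L;
    long highWatermark = 3; // minOpenWriteId > highWatermark

    System.out.println(oldFilter(deltas, minOpenWriteId, highWatermark).size()); // prints 0
    System.out.println(newFilter(deltas, highWatermark).size());                 // prints 3
  }
}
```

With an open write ID above the highWatermark, the old condition can never match, so no delta is selected, while the highWatermark-only filter picks up all three.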
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]