deniskuzZ commented on code in PR #6143:
URL: https://github.com/apache/hive/pull/6143#discussion_r2468903768
##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionQueryBuilder.java:
##########
@@ -317,15 +322,20 @@ protected void addTblProperties(StringBuilder query,
Map<String, String> tblProp
private void buildAddClauseForAlter(StringBuilder query) {
if (validWriteIdList == null || dir == null) {
+ LOG.info("There is no delta to be added as partition to the temp
external table used by the minor compaction. " +
+ "This may result an empty compaction directory.");
query.setLength(0);
return; // avoid NPEs, don't throw an exception but return an empty
query
}
- long minWriteID = validWriteIdList.getMinOpenWriteId() == null ? 1 :
validWriteIdList.getMinOpenWriteId();
long highWatermark = validWriteIdList.getHighWatermark();
List<AcidUtils.ParsedDelta> deltas =
dir.getCurrentDirectories().stream().filter(
- delta -> delta.isDeleteDelta() == isDeleteDelta &&
delta.getMaxWriteId() <= highWatermark && delta.getMinWriteId() >= minWriteID)
+ delta -> delta.isDeleteDelta() == isDeleteDelta &&
delta.getMaxWriteId() <= highWatermark)
Review Comment:
if we remove the minOpenWriteId, we won't compact inflight writes.
ValidCompactorWriteIdList doest the trick
tblValidWriteIds =
TxnUtils.createValidCompactWriteIdList(msc.getValidWriteIds(
Collections.singletonList(fullTableName),
validTxnList.writeToString()).get(0));
````
if (tableValidWriteIds.isSetMinOpenWriteId()) {
long minOpenWriteId = tableValidWriteIds.getMinOpenWriteId();
return new ValidCompactorWriteIdList(fullTableName, exceptions, bitSet,
minOpenWriteId - 1, minOpenWriteId);
} else {
return new ValidCompactorWriteIdList(fullTableName, exceptions, bitSet,
tableValidWriteIds.getWriteIdHighWaterMark());
}
````
so basically highWatermark is already capped by minOpenWriteId - 1.
similar code from AbortedTxnCleaner
````
info.highestWriteId = Math.min(
isNull(validWriteIdList.getMinOpenWriteId()) ?
Long.MAX_VALUE : validWriteIdList.getMinOpenWriteId() - 1,
validWriteIdList.getHighWatermark());
````
1 thing i am still not sure is this:
MergeCompactor#getOutputDirPath
long minOpenWriteId = writeIds.getMinOpenWriteId() == null ? 1 :
writeIds.getMinOpenWriteId();
minOpenWriteId > highWatermark, unless it's not null, right?
i think that is wrong and should be derived from actual compacted/merged
deltas
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]