virajjasani commented on code in PR #6557:
URL: https://github.com/apache/hbase/pull/6557#discussion_r1921849222
##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##########
@@ -71,15 +84,42 @@ public MatchCode match(ExtendedCell cell) throws IOException {
if (includeDeleteMarker) {
this.deletes.add(cell);
}
- return MatchCode.SKIP;
+ // In some cases, optimization can not be done
+ if (!canOptimizeReadDeleteMarkers()) {
+ return MatchCode.SKIP;
+ }
}
- returnCode = checkDeleted(deletes, cell);
- if (returnCode != null) {
+ // optimization when prevCell is Delete or DeleteFamilyVersion
+ if ((returnCode = checkDeletedEffectively(cell, prevCell)) != null) {
+ return returnCode;
+ }
+ if ((returnCode = checkDeleted(deletes, cell)) != null) {
return returnCode;
}
return matchColumn(cell, timestamp, typeByte);
}
+ // If prevCell is a delete marker and cell is a delete marked Put or delete marker,
+ // it means the cell is deleted effectively.
+ // And we can do SEEK_NEXT_COL.
+ private MatchCode checkDeletedEffectively(ExtendedCell cell, ExtendedCell prevCell) {
+ if (
+ prevCell != null && canOptimizeReadDeleteMarkers()
+ && CellUtil.matchingRowColumn(prevCell, cell) && CellUtil.matchingTimestamp(prevCell, cell)
+ && (PrivateCellUtil.isDeleteType(prevCell)
+ || PrivateCellUtil.isDeleteFamilyVersion(prevCell))
+ ) {
+ return MatchCode.SEEK_NEXT_COL;
+ }
+ return null;
+ }
+
+ private boolean canOptimizeReadDeleteMarkers() {
+ // for simplicity, optimization works only for these cases
+ return !seePastDeleteMarkers && scanMaxVersions == 1 && !visibilityLabelEnabled
+ && getFilter() == null && !(deletes instanceof NewVersionBehaviorTracker);
+ }
Review Comment:
Thanks @EungsopYoo, this is what I was also expecting.
On the Jira https://issues.apache.org/jira/browse/HBASE-25972, Kadir has
also shown how the full scan improvement can be observed using PE (second
comment on the Jira). Could you also run the same steps to see how much
improvement you observe with this PR?
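
For anyone following the thread, the short-circuit in `checkDeletedEffectively` can be illustrated with a minimal, self-contained sketch. It does not use HBase classes: `Cell`, `Type`, `MatchCode`, and `matchingRowColumn` below are hypothetical simplified stand-ins, and the `canOptimize` flag is an assumed substitute for the result of `canOptimizeReadDeleteMarkers()`. It only mirrors the condition shown in the diff, not the matcher's real behavior.

```java
import java.util.Objects;

// Hypothetical, simplified model of the check in the diff; not HBase code.
class DeleteMarkerSketch {
  enum Type { PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

  enum MatchCode { SEEK_NEXT_COL }

  // Stand-in for a cell: just the fields the check looks at.
  static final class Cell {
    final String row;
    final String family;
    final String qualifier;
    final long timestamp;
    final Type type;

    Cell(String row, String family, String qualifier, long timestamp, Type type) {
      this.row = row;
      this.family = family;
      this.qualifier = qualifier;
      this.timestamp = timestamp;
      this.type = type;
    }
  }

  // Assumed equivalent of a row + column comparison in this simplified model.
  static boolean matchingRowColumn(Cell a, Cell b) {
    return Objects.equals(a.row, b.row)
      && Objects.equals(a.family, b.family)
      && Objects.equals(a.qualifier, b.qualifier);
  }

  // Returns SEEK_NEXT_COL when prevCell is a version delete marker for the same
  // row, column and timestamp as cell; returns null to fall through otherwise.
  static MatchCode checkDeletedEffectively(Cell cell, Cell prevCell, boolean canOptimize) {
    if (
      prevCell != null && canOptimize
        && matchingRowColumn(prevCell, cell) && prevCell.timestamp == cell.timestamp
        && (prevCell.type == Type.DELETE || prevCell.type == Type.DELETE_FAMILY_VERSION)
    ) {
      return MatchCode.SEEK_NEXT_COL;
    }
    return null;
  }

  public static void main(String[] args) {
    Cell del = new Cell("r1", "f", "q", 100L, Type.DELETE);
    Cell put = new Cell("r1", "f", "q", 100L, Type.PUT);
    Cell older = new Cell("r1", "f", "q", 99L, Type.PUT);

    System.out.println(checkDeletedEffectively(put, del, true));    // SEEK_NEXT_COL
    System.out.println(checkDeletedEffectively(older, del, true));  // null
    System.out.println(checkDeletedEffectively(put, del, false));   // null
  }
}
```

In this model, the Put at the same timestamp as the preceding delete marker is skipped by seeking to the next column, while a cell at a different timestamp, or any cell when the optimization is disabled, falls through to the regular delete tracking.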