Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-07 Thread via GitHub


junegunn merged PR #8001:
URL: https://github.com/apache/hbase/pull/8001


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-04 Thread via GitHub


junegunn commented on code in PR #8001:
URL: https://github.com/apache/hbase/pull/8001#discussion_r3036328156


##
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##
@@ -70,9 +95,31 @@ public MatchCode match(ExtendedCell cell) throws IOException 
{
 seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : 
tr.withinOrAfterTimeRange(timestamp);
   if (includeDeleteMarker) {
 this.deletes.add(cell);
+// A DeleteColumn or DeleteFamily masks all remaining cells for this 
column/family.
+// Seek past them instead of skipping one cell at a time, but only 
after seeing
+// enough consecutive markers for the same column to justify the seek 
overhead.
+// Only safe with plain ScanDeleteTracker. Not safe with 
newVersionBehavior (sequence
+// IDs determine visibility), visibility labels (delete/put label 
mismatch), or
+// seePastDeleteMarkers (KEEP_DELETED_CELLS).
+if (
+  canSeekOnDeleteMarker && (typeByte == 
KeyValue.Type.DeleteFamily.getCode()
+|| (typeByte == KeyValue.Type.DeleteColumn.getCode() && 
cell.getQualifierLength() > 0))
+) {
+  if (lastDelete != null && !CellUtil.matchingQualifier(cell, 
lastDelete)) {
+rangeDeleteCount = 0;
+  }
+  lastDelete = cell;
+  if (++rangeDeleteCount >= SEEK_ON_DELETE_MARKER_THRESHOLD) {
+rangeDeleteCount = 0;
+return columns.getNextRowOrNextColumn(cell);
+  }
+} else {
+  rangeDeleteCount = 0;
+}

Review Comment:
   So here it is:
   
   ```ruby
   # The minimum timestamp of GET and SCAN will be set to this value
   future = (Time.now.to_i + 3600) * 1000
   benchmark(:DeleteColumnOldTimestamps, future) do |i|
 T.put(Put.new(ROW).addColumn(CF, CQ, future, VALUE)) if i.zero?
 T.delete(DC)
 flush 't' if (i % 100_000).zero? && i.positive?
   end
   ```
   
   - This benchmark code will `setTimeRange(future, MAX)` both for GET and SCAN
   - If we don't put an actual cell within the range, the scanner will be 
skipped entirely, so we "put" with the future timestamp
   
   This is a very contrived scenario, to be honest, but anyway, moving the 
block out of `includeDeleteMarker` condition (`HBASE-29039-alt-n10-mod`) helps 
a lot in this case.
   
   https://github.com/user-attachments/assets/ba5782c2-19ff-4967-8db7-31ee7335d8fa";
 />
   
   And no difference in other benchmarks.
   
   I updated the code and added a test case in 5b8afdc0fe.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-04 Thread via GitHub


junegunn commented on code in PR #8001:
URL: https://github.com/apache/hbase/pull/8001#discussion_r3036328156


##
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##
@@ -70,9 +95,31 @@ public MatchCode match(ExtendedCell cell) throws IOException 
{
 seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : 
tr.withinOrAfterTimeRange(timestamp);
   if (includeDeleteMarker) {
 this.deletes.add(cell);
+// A DeleteColumn or DeleteFamily masks all remaining cells for this 
column/family.
+// Seek past them instead of skipping one cell at a time, but only 
after seeing
+// enough consecutive markers for the same column to justify the seek 
overhead.
+// Only safe with plain ScanDeleteTracker. Not safe with 
newVersionBehavior (sequence
+// IDs determine visibility), visibility labels (delete/put label 
mismatch), or
+// seePastDeleteMarkers (KEEP_DELETED_CELLS).
+if (
+  canSeekOnDeleteMarker && (typeByte == 
KeyValue.Type.DeleteFamily.getCode()
+|| (typeByte == KeyValue.Type.DeleteColumn.getCode() && 
cell.getQualifierLength() > 0))
+) {
+  if (lastDelete != null && !CellUtil.matchingQualifier(cell, 
lastDelete)) {
+rangeDeleteCount = 0;
+  }
+  lastDelete = cell;
+  if (++rangeDeleteCount >= SEEK_ON_DELETE_MARKER_THRESHOLD) {
+rangeDeleteCount = 0;
+return columns.getNextRowOrNextColumn(cell);
+  }
+} else {
+  rangeDeleteCount = 0;
+}

Review Comment:
   So here it is:
   
   ```ruby
   # The minimum timestamp of GET and SCAN will be set to this value
   future = (Time.now.to_i + 3600) * 1000
   benchmark(:DeleteColumnOldTimestamps, future) do |i|
 T.put(Put.new(ROW).addColumn(CF, CQ, future, VALUE)) if i.zero?
 T.delete(DC)
 flush 't' if (i % 100_000).zero? && i.positive?
   end
   ```
   
   - This benchmark code will `setTimeRange(future, MAX)` both for GET and SCAN
   - If we don't put an actual cell within the range, the scanner will be 
skipped entirely, so we "put" with the future timestamp
   
   This is a very contrived scenario, to be honest, but anyway, moving the 
block out of `includeDeleteMarker` condition (`HBASE-29039-alt-n10-mod`) helps 
a lot in this case.
   
   https://github.com/user-attachments/assets/ba5782c2-19ff-4967-8db7-31ee7335d8fa";
 />
   
   I updated the code and added a test case in 5b8afdc0fe.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-04 Thread via GitHub


junegunn commented on code in PR #8001:
URL: https://github.com/apache/hbase/pull/8001#discussion_r3036328156


##
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##
@@ -70,9 +95,31 @@ public MatchCode match(ExtendedCell cell) throws IOException 
{
 seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : 
tr.withinOrAfterTimeRange(timestamp);
   if (includeDeleteMarker) {
 this.deletes.add(cell);
+// A DeleteColumn or DeleteFamily masks all remaining cells for this 
column/family.
+// Seek past them instead of skipping one cell at a time, but only 
after seeing
+// enough consecutive markers for the same column to justify the seek 
overhead.
+// Only safe with plain ScanDeleteTracker. Not safe with 
newVersionBehavior (sequence
+// IDs determine visibility), visibility labels (delete/put label 
mismatch), or
+// seePastDeleteMarkers (KEEP_DELETED_CELLS).
+if (
+  canSeekOnDeleteMarker && (typeByte == 
KeyValue.Type.DeleteFamily.getCode()
+|| (typeByte == KeyValue.Type.DeleteColumn.getCode() && 
cell.getQualifierLength() > 0))
+) {
+  if (lastDelete != null && !CellUtil.matchingQualifier(cell, 
lastDelete)) {
+rangeDeleteCount = 0;
+  }
+  lastDelete = cell;
+  if (++rangeDeleteCount >= SEEK_ON_DELETE_MARKER_THRESHOLD) {
+rangeDeleteCount = 0;
+return columns.getNextRowOrNextColumn(cell);
+  }
+} else {
+  rangeDeleteCount = 0;
+}

Review Comment:
   So here it is:
   
   ```ruby
   # The minimum timestamp of GET and SCAN will be set to this value
   future = (Time.now.to_i + 3600) * 1000
   benchmark(:DeleteColumnOldTimestamps, future) do |i|
 T.put(Put.new(ROW).addColumn(CF, CQ, future, VALUE)) if i.zero?
 T.delete(DC)
 flush 't' if (i % 100_000).zero? && i.positive?
   end
   ```
   
   - This benchmark code will `setTimeRange(future, MAX)` both for GET and SCAN
   - If we don't put an actual cell within the range, the scanner will be 
skipped entirely, so we "put" with the future timestamp
   
   This is a very contrived scenario, to be honest, but anyway, moving the 
block out of `includeDeleteMarker` condition (`HBASE-29039-alt-n10-mod`) helps 
a lot in this case.
   
   https://github.com/user-attachments/assets/ba5782c2-19ff-4967-8db7-31ee7335d8fa";
 />
   
   I updated the code and added a test case in 
73579e1b151c36ab99cc4332387fd81464305515.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-04 Thread via GitHub


junegunn commented on code in PR #8001:
URL: https://github.com/apache/hbase/pull/8001#discussion_r3036225452


##
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##
@@ -70,9 +95,31 @@ public MatchCode match(ExtendedCell cell) throws IOException 
{
 seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : 
tr.withinOrAfterTimeRange(timestamp);
   if (includeDeleteMarker) {
 this.deletes.add(cell);
+// A DeleteColumn or DeleteFamily masks all remaining cells for this 
column/family.
+// Seek past them instead of skipping one cell at a time, but only 
after seeing
+// enough consecutive markers for the same column to justify the seek 
overhead.
+// Only safe with plain ScanDeleteTracker. Not safe with 
newVersionBehavior (sequence
+// IDs determine visibility), visibility labels (delete/put label 
mismatch), or
+// seePastDeleteMarkers (KEEP_DELETED_CELLS).
+if (
+  canSeekOnDeleteMarker && (typeByte == 
KeyValue.Type.DeleteFamily.getCode()
+|| (typeByte == KeyValue.Type.DeleteColumn.getCode() && 
cell.getQualifierLength() > 0))
+) {
+  if (lastDelete != null && !CellUtil.matchingQualifier(cell, 
lastDelete)) {
+rangeDeleteCount = 0;
+  }
+  lastDelete = cell;
+  if (++rangeDeleteCount >= SEEK_ON_DELETE_MARKER_THRESHOLD) {
+rangeDeleteCount = 0;
+return columns.getNextRowOrNextColumn(cell);
+  }
+} else {
+  rangeDeleteCount = 0;
+}

Review Comment:
   Good point! I felt it was easier to reason about the code when it's inside 
this block, but on second thought, it should be safe to move it out of it, and 
it might provide performance benefit in some very rare cases. No cons I can 
think of. Let me see if I can come up with a contrived benchmark scenario to 
show the difference.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-04 Thread via GitHub


junegunn commented on code in PR #8001:
URL: https://github.com/apache/hbase/pull/8001#discussion_r3036225452


##
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##
@@ -70,9 +95,31 @@ public MatchCode match(ExtendedCell cell) throws IOException 
{
 seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : 
tr.withinOrAfterTimeRange(timestamp);
   if (includeDeleteMarker) {
 this.deletes.add(cell);
+// A DeleteColumn or DeleteFamily masks all remaining cells for this 
column/family.
+// Seek past them instead of skipping one cell at a time, but only 
after seeing
+// enough consecutive markers for the same column to justify the seek 
overhead.
+// Only safe with plain ScanDeleteTracker. Not safe with 
newVersionBehavior (sequence
+// IDs determine visibility), visibility labels (delete/put label 
mismatch), or
+// seePastDeleteMarkers (KEEP_DELETED_CELLS).
+if (
+  canSeekOnDeleteMarker && (typeByte == 
KeyValue.Type.DeleteFamily.getCode()
+|| (typeByte == KeyValue.Type.DeleteColumn.getCode() && 
cell.getQualifierLength() > 0))
+) {
+  if (lastDelete != null && !CellUtil.matchingQualifier(cell, 
lastDelete)) {
+rangeDeleteCount = 0;
+  }
+  lastDelete = cell;
+  if (++rangeDeleteCount >= SEEK_ON_DELETE_MARKER_THRESHOLD) {
+rangeDeleteCount = 0;
+return columns.getNextRowOrNextColumn(cell);
+  }
+} else {
+  rangeDeleteCount = 0;
+}

Review Comment:
   Good point! I felt it was easier to reason about the code when it's inside 
this block, but on second thought, it should be safe to move it out of it, and 
it can actually provide performance benefit in some very rare cases. No cons I 
can think of. Let me see if I can come up with a contrived benchmark scenario 
to show the difference.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-04 Thread via GitHub


charlesconnell commented on code in PR #8001:
URL: https://github.com/apache/hbase/pull/8001#discussion_r3036201036


##
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/querymatcher/NormalUserScanQueryMatcher.java:
##
@@ -70,9 +95,31 @@ public MatchCode match(ExtendedCell cell) throws IOException 
{
 seePastDeleteMarkers ? tr.withinTimeRange(timestamp) : 
tr.withinOrAfterTimeRange(timestamp);
   if (includeDeleteMarker) {
 this.deletes.add(cell);
+// A DeleteColumn or DeleteFamily masks all remaining cells for this 
column/family.
+// Seek past them instead of skipping one cell at a time, but only 
after seeing
+// enough consecutive markers for the same column to justify the seek 
overhead.
+// Only safe with plain ScanDeleteTracker. Not safe with 
newVersionBehavior (sequence
+// IDs determine visibility), visibility labels (delete/put label 
mismatch), or
+// seePastDeleteMarkers (KEEP_DELETED_CELLS).
+if (
+  canSeekOnDeleteMarker && (typeByte == 
KeyValue.Type.DeleteFamily.getCode()
+|| (typeByte == KeyValue.Type.DeleteColumn.getCode() && 
cell.getQualifierLength() > 0))
+) {
+  if (lastDelete != null && !CellUtil.matchingQualifier(cell, 
lastDelete)) {
+rangeDeleteCount = 0;
+  }
+  lastDelete = cell;
+  if (++rangeDeleteCount >= SEEK_ON_DELETE_MARKER_THRESHOLD) {
+rangeDeleteCount = 0;
+return columns.getNextRowOrNextColumn(cell);
+  }
+} else {
+  rangeDeleteCount = 0;
+}

Review Comment:
   Could put this new code outside of the `if (includeDeleteMarker)` block? Are 
there pros and cons to doing that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-02 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4181566882

   For those interested, here is a comparison with the previous work (#6557). 
See the `HBASE-29039` series without the `-alt` suffix.
   
   Both approaches effectively optimize read performance when delete markers 
are truly redundant:
   
   - https://github.com/user-attachments/assets/87b56c9f-777f-4d7e-8d9c-1ea73ff7f47d";
 />
   - https://github.com/user-attachments/assets/ec3783fa-4f26-40a2-897b-ba3367163ca2";
 />
   
   However, the previous work suffers from excessive seek overhead on false 
positives. This PR mitigates that overhead with qualifier comparison and an 
N=10 threshold.
   
   - https://github.com/user-attachments/assets/adc1ea77-9744-4320-954b-2aee9504928c";
 />
   - https://github.com/user-attachments/assets/649afab8-a1bb-452d-8742-9adafa00cfc5";
 />
   - https://github.com/user-attachments/assets/493e8304-e194-43a3-8ece-0e1c4944e168";
 />
   
   The full benchmark code can be found at 
https://gist.github.com/junegunn/bc0acf5269b8875330c0947dac7d0280.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-02 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4181375884

   _However_, even with qualifier comparison, false positives remain: exactly N 
consecutive redundant DCs for the same qualifier trigger an inefficient seek.
   
   - DC(q1) N=1 skip
   - DC(q1) N=2 skip
   - DC (q1) N=3 seek (false positive)
   - DC (q2) N=1 skip
   - DC (q2) N=2 skip
   - DC (q2) N=3 seek (false positive)
   - DC (q3) N=1 skip
   - DC (q3) N=2 skip
   - ...
   
   This should be rare in practice. But if overhead is a concern, increasing N 
is the only option.
   
   Here is a benchmark for this case, with an additional N=10 build:
   
   ```ruby
   benchmark(:DeleteColumnFalsePositiveEvery3) do |i|
 T.put(PUT) if i.zero?
   
 dc = Delete.new(ROW).addColumns(CF, (i / 3).to_s.to_java_bytes)
 T.delete(dc)
   
 flush 't' if (i % 100_000).zero? && i.positive?
   end
   ```
   
   https://github.com/user-attachments/assets/4fb939c6-4865-4da2-88ee-f25ad1312f95";
 />
   
   As expected, qualifier comparison does not help in this case, but a larger 
threshold (N=10) significantly reduces the overhead. Given the rarity of such 
scenarios, the overhead against master is acceptable.
   
   I believe 10 is a good threshold. This optimization targets cases where 
hundreds of thousands or millions of delete markers are swept, so the cost of 
10 extra skips is negligible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-02 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4179453905

   I updated the patch to compare qualifiers of contiguous delete markers, so 
the counter only increments for consecutive markers targeting the same column. 
With this, we don't need such a large N value to avoid the regression in the 
worst case.
   
   N=3 works correctly with this approach:
   
   - Same-column accumulation (the real problem): seeks after 3 markers. Fast 
kick-in.
   - Different-column DCs (false positive case): counter resets on qualifier 
change. All skip as before. No overhead.
   - One delete per row (common case): counter never reaches 3. Zero overhead.
   
   Even with qualifier comparison, false positives remain: exactly N 
consecutive redundant DCs for the same qualifier trigger a seek.
   
   > DC(q1) -skip-> DC(q1) -skip-> DC(q1) -seek-> DC(q2) -skip-> DC(q2) -skip-> 
DC(q2) -seek-> DC(q3)
   
   This should be rare in practice. If overhead is a concern, increasing N is 
the only alternative.
   
   Here are the results.
   
   - Regression in non-redundant DeleteFamily markers is fixed.
   - https://github.com/user-attachments/assets/1b7d193d-87c7-4611-b170-fe4e35edebf9";
 />
   - No overhead in the worst case
   - https://github.com/user-attachments/assets/d640a82a-b7c4-464a-93c5-e3700b9a7ec2";
 />
   - The best case benefit still holds
   - https://github.com/user-attachments/assets/ee606aee-29fb-4b72-bdbb-80d6d6c3e638";
 />
   - https://github.com/user-attachments/assets/577815a8-cfc0-49a6-abf3-bbd7436aa0bb";
 />
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-04-02 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4176485517

   I ran another test to truly check the SEEK overhead. In this case, we 
generate many `DeleteColumn`s for different qualifiers:
   
   ```ruby
   benchmark(:DeleteColumnFalsePositive) do |i|
 T.put(PUT) if i.zero?
   
 dc = Delete.new(ROW).addColumns(CF, i.to_s.to_java_bytes)
 T.delete(dc)
   
 # Let's manually flush after every 100,000 operations because it's hard to
 # fill up the memstore only with delete markers.
 flush 't' if (i % 100_000).zero? && i.positive?
   end
   ```
   
   - DC Q1
   - DC Q2
   - DC Q3
   - DC Q4
   - DC Q5
   - ...
   - Put Q0
   
   SEEK can only advance the pointer by only one cell, providing no advantage 
over SKIP. This is the worst case for this optimization.
   
   Testing revealed I underestimated the overhead. Increasing N to 10 helped 
reduce it though (see `alt-n10` case).
   
   https://github.com/user-attachments/assets/5a930194-097f-4449-b8eb-68b6e483a972";
 />
   
   To avoid regression in such cases, we need to either:
   - Choose a large N value (10?) so SEEK happens less frequently.
   - Or add qualifier comparison to the code, accepting the added complexity.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-03-29 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4151806753

   caed9ba264db7e4e634f892878c0582563dfbfba implements the heuristic with N = 3.
   
   The regression in normal case (no redundant delete markers) is fixed (see 
`HBASE-29039-alt-n3`):
   
   https://github.com/user-attachments/assets/95f8aaa8-c810-4ce0-a5f0-3292a3e1c124";
 />
   
   The performance benefit with many redundant delete markers remains:
   
   https://github.com/user-attachments/assets/123bdb8f-65a5-4b82-a56a-e3a19801a337";
 />
   
   https://github.com/user-attachments/assets/aa0636fe-3dde-48f8-8f8a-0bd25903f2e1";
 />
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-03-29 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4151701496

   I found a regression with this patch. When scanning across many rows where 
each row has only one `DeleteFamily` (or `DeleteColumn`) marker, scan 
performance degrades by ~50% compared to master. **The seek triggered by this 
optimization is more expensive than a simple skip when there's nothing to skip 
over.**
   
   The optimization helps when multiple delete markers accumulate for the same 
row or column. But for the common case of one delete per row, the seek is 
wasted and the overhead adds up across many rows.
   
   Benchmark data (scan time at 300K iterations, `DeleteFamily` on different 
rows):
   - master: ~0.2s
   - HBASE-29039-alt: ~0.3s
   
   https://github.com/user-attachments/assets/2f1aa636-5fdf-49d7-82e4-bdf4a3c8a603";
 />
   
   One possible approach: only seek on the second (or n-th) delete marker for 
the same scope. The first one would `SKIP` as before. If a second one appears 
(redundant), it signals accumulation and we switch to seek. This way:
   - One delete per row (common case): always skips, no regression
   - Accumulated deletes (the case we're optimizing): first one skips, rest seek
   
   Would this kind of heuristic make sense?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-03-28 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4149351136

   > TestVisibilityLabelsWithDeletes is failing
   
   Fixed by:
   
   ```diff
   -  !seePastDeleteMarkers && !(deletes instanceof 
NewVersionBehaviorTracker)
   +  !seePastDeleteMarkers && deletes.getClass() == 
ScanDeleteTracker.class
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-03-28 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4149332036

   TestVisibilityLabelsWithDeletes is failing, which likely explains the 
additional changes in https://github.com/apache/hbase/pull/6557. I'll try to 
fix it, but if it ends up resembling the previous approach, I'll drop this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-03-28 Thread via GitHub


junegunn commented on PR #8001:
URL: https://github.com/apache/hbase/pull/8001#issuecomment-4149331608

   TestVisibilityLabelsWithDeletes is failing, which likely explains the 
additional changes in https://github.com/apache/hbase/pull/6557. I’ll try to 
fix it, but if it ends up resembling the previous approach, I'll drop this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



[PR] HBASE-29039 Seek past delete markers instead of skipping one at a time [hbase]

2026-03-28 Thread via GitHub


junegunn opened a new pull request, #8001:
URL: https://github.com/apache/hbase/pull/8001

   ## Context
   
   HBASE-30036 (#7993) consolidates redundant delete markers on flush, 
preventing them from growing unbounded in HFiles. However, markers still 
accumulate in the memstore before flush, degrading read performance. 
HBASE-29039 addresses this from the read path side. Both are needed for full 
coverage. There is an open PR (#6557), but the review process has been stalled. 
This is an alternative approach with fewer code changes, hopefully making it 
easier to reach consensus.
   
   ## Test result
   
   Using the test code in https://issues.apache.org/jira/browse/HBASE-30036.
   
   ### `DeleteFamily`
   
   https://github.com/user-attachments/assets/7908b841-0eba-41f2-ac06-be57abd90d79";
 />
   
   - Substantial read performance improvement before flushes.
   - Without HBASE-30036, delete markers still accumulate in store files.
   
   ### `DeleteColumnContiguous`
   
   https://github.com/user-attachments/assets/4b012b94-1659-4e37-8000-cabab28d898a";
 />
   
   - Substantial read performance improvement before flushes.
   - Without HBASE-30036, delete markers still accumulate in store files.
   
   ### `DeleteColumnInterleaved`
   
   https://github.com/user-attachments/assets/cafa0ba9-3f93-496f-8a77-9da6d903aa1d";
 />
   
   - No difference, as expected. Already triggers SEEK_NEXT_COL via the masked 
put.
   
   ## Description
   
   When a DeleteColumn or DeleteFamily marker is encountered during a normal 
user scan, the matcher currently returns SKIP, forcing the scanner to advance 
one cell at a time. This causes read latency to degrade linearly with the 
number of accumulated delete markers for the same row or column.
   
   Since these are range deletes that mask all remaining versions of the 
column, seek past the entire column immediately via 
columns.getNextRowOrNextColumn(). This is safe because cells arrive in 
timestamp descending order, so any puts newer than the delete have already been 
processed.
   
   For DeleteFamily, also fix getKeyForNextColumn in ScanQueryMatcher to bypass 
the empty-qualifier guard (HBASE-18471) when the cell is a DeleteFamily marker. 
Without this, the seek barely advances past the current cell instead of jumping 
to the first real qualified column.
   
   The optimization is skipped when:
   - seePastDeleteMarkers is true (KEEP_DELETED_CELLS)
   - newVersionBehavior is enabled (sequence IDs determine visibility)
   - the delete marker is not tracked (visibility labels)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]