mwkang commented on code in PR #6900: URL: https://github.com/apache/hbase/pull/6900#discussion_r2039655382
########## hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java: ########## @@ -832,7 +845,7 @@ private void updateMetricsStore(boolean memstoreRead) { * @return null is the top cell doesn't change. Otherwise, the NextState to return */ private NextState needToReturn(List<? super ExtendedCell> outResult) { - if (!outResult.isEmpty() && topChanged) { + if (topChanged) { Review Comment: Let’s assume the data stored in HBase is as follows: (1) row0/family2:qf1/DeleteColumn (2) row0/family2:qf1/Put/value2 (3) row1/family1:qf1/Put/value2 (4) row1/family2:qf1/Put/value2 Now, suppose a user starts scanning from **row0**. In [`RegionScannerImpl#nextInternal`](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java#L415), when the [current cell](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java#L446)’s row is `row0`, after reading entry (2) in `StoreScanner`, if a flush happens, a `topChanged` occurs ([`Storescanner.peek() is changed where before ...`](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L1074)), and the value of `StoreScanner`’s `heap.peek()` becomes (4) `row1/family2:qf1/Put/value2`. Since it is the next row, `StoreScanner` *should* return at that point — but it fails to recognize that it has moved to the next row because [outResult](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java#L835) is empty, and ends up including the new row in the result. Then, in `RegionScannerImpl`, it sees that `nextKv`’s row is different from the current cell’s row, and returns (since it has moved to a different row). As a result, even though (3) and (4) belong to the same row (`row1`), they are returned to the client as if they were from different rows. (3) and (4) should be combined into a single [Result](https://github.com/apache/hbase/blob/0b3c17302843d1f4d6f3c6b458f837cb9c274510/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Result.java), but they end up being returned as separate Result instances. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org