ramkrish86 commented on a change in pull request #2663:
URL: https://github.com/apache/hbase/pull/2663#discussion_r529410112
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
##########
@@ -786,6 +791,19 @@ private void updateMetricsStore(boolean memstoreRead) {
}
}
+ private void doSeekCol(Cell cell) throws IOException {
+ // we check when ever a seek_next_col happens did the seek really land in a new block.
Review comment:
> What does this optimization do? Why next() instead of seekOrSkip...?
next() is always the cheaper choice when we know the scan will simply continue
into the immediately following block. seekOrSkip has to load the block and then
seek to the given key. On top of that, the code that makes that decision does a
lot of compares, and all of this adds to the cost. So here we base the decision
on what we observe: if a seekOrSkip really did land in the immediately
following block, we take that as the pattern for the rest of the scan, use
next() from then on, and avoid all the block loading and seeking.
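To make the idea concrete, here is a minimal sketch of the heuristic described
above. It is not the code in this PR and not HBase's API: ToyScanner,
AdaptiveSeeker, seekToAtLeast and the fixed-size integer "blocks" are all
hypothetical names and simplifications made up for illustration.

```java
class SeekVsNextSketch {
    /** Hypothetical stand-in for a file scanner: sorted keys in fixed-size blocks. */
    static final class ToyScanner {
        private final int[] keys;
        private final int blockSize;
        private int pos = 0;
        int seeks = 0; // number of real seeks performed
        int nexts = 0; // number of plain next() steps

        ToyScanner(int[] keys, int blockSize) {
            this.keys = keys;
            this.blockSize = blockSize;
        }

        int currentBlock() { return pos / blockSize; }
        boolean hasCurrent() { return pos < keys.length; }
        int current() { return keys[pos]; }

        void next() { pos++; nexts++; }

        /** Forward-only seek: position at the first key >= target (binary search). */
        void seek(int target) {
            seeks++;
            int lo = pos, hi = keys.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (keys[mid] < target) lo = mid + 1; else hi = mid;
            }
            pos = lo;
        }
    }

    /** Assumed heuristic: once a seek lands in the very next block, switch to next(). */
    static final class AdaptiveSeeker {
        private final ToyScanner scanner;
        private boolean preferNext = false;

        AdaptiveSeeker(ToyScanner scanner) { this.scanner = scanner; }

        void seekToAtLeast(int target) {
            if (preferNext) {
                // Cheap path: walk forward instead of paying for another seek.
                while (scanner.hasCurrent() && scanner.current() < target) {
                    scanner.next();
                }
                return;
            }
            int blockBefore = scanner.currentBlock();
            scanner.seek(target);
            // The seek only reached the immediately following block, so treat
            // that as the pattern for the rest of the scan.
            if (scanner.currentBlock() == blockBefore + 1) {
                preferNext = true;
            }
        }
    }

    public static void main(String[] args) {
        int[] keys = new int[40];
        for (int i = 0; i < keys.length; i++) keys[i] = i;
        ToyScanner scanner = new ToyScanner(keys, 4);
        AdaptiveSeeker seeker = new AdaptiveSeeker(scanner);
        for (int target : new int[] {5, 9, 13, 17}) {
            seeker.seekToAtLeast(target);
        }
        System.out.println("seeks=" + scanner.seeks + " nexts=" + scanner.nexts
            + " current=" + scanner.current());
    }
}
```

In this toy run only the first target pays for a seek; the remaining three are
reached with next() alone. The sketch never switches back to seeking, which is
a simplification of the sketch and not a statement about what the PR does.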
> Why does this optimization have to be done here instead of down in the
> file scanner?
The reason is that, as I see it, the trackers decide what the store scanner
should do, and so they ask for a SEEK or a NEXT. The StoreScanner layer then
does the actual smart work of deciding whether to really seek or just call
next(), based on what the trackers say. Hence I added the logic at this layer.
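The layering argument above can be sketched as a split between what is asked
for and what is actually done. The enums and the decide method below are
hypothetical illustrations, not HBase types; only the name SEEK_NEXT_COL is
borrowed from the diff comment.

```java
class HintVsActionSketch {
    /** What a tracker asks for (hypothetical, reduced set of hints). */
    enum Hint { INCLUDE_AND_NEXT, SEEK_NEXT_COL }

    /** What the scanner layer actually does on the underlying scanner. */
    enum Action { NEXT, SEEK }

    /**
     * The tracker only expresses intent; this layer picks the cheaper
     * operation when earlier seeks were seen to land in the next block.
     */
    static Action decide(Hint hint, boolean seeksLandInNextBlock) {
        switch (hint) {
            case SEEK_NEXT_COL:
                return seeksLandInNextBlock ? Action.NEXT : Action.SEEK;
            case INCLUDE_AND_NEXT:
            default:
                return Action.NEXT;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(Hint.SEEK_NEXT_COL, false));
        System.out.println(decide(Hint.SEEK_NEXT_COL, true));
    }
}
```

Keeping the decision in one place like this means the trackers stay unaware of
block boundaries, which is the reason given for putting the logic in this layer.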
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]