[
https://issues.apache.org/jira/browse/HBASE-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766817#comment-13766817
]
Lars Hofhansl commented on HBASE-9440:
--------------------------------------
From disk: 59s, block cache: 42s (didn't try just OS buffer cache)
This was using a RowFilter to filter out all KVs at the server. When the data
was returned to the client instead, the scan took 103s.
Takeaways:
# HBase is doing a reasonable job at interleaving IO and CPU (even though
scanning takes 40s longer, scanning from disk increased only by 20s).
# There is room for improvement even without changing the HFile format: 42s
frontdoor vs. 1.9s directly from HFile.
> Pass blocks of KVs from HFile scanner to the StoreFileScanner and up
> --------------------------------------------------------------------
>
> Key: HBASE-9440
> URL: https://issues.apache.org/jira/browse/HBASE-9440
> Project: HBase
> Issue Type: Bug
> Reporter: Lars Hofhansl
>
> Currently we read KVs from an HFileScanner one-by-one and pass them up the
> scanner/heap tree. Many times the ranges of KVs retrieved from
> StoreFileScanner (by StoreScanners) and HFileScanner (by StoreFileScanner)
> will be non-overlapping. If chunks of KVs do not overlap we can sort entire
> chunks just by comparing the start/end key of the chunk. Only if chunks are
> overlapping do we need to sort KV by KV as we do now.
> I have no patch, but I wanted to float this idea.
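The chunk idea in the description above can be sketched roughly as follows. This is only an illustration, not a patch and not HBase code: Chunk, merge and keyComparisons are hypothetical names, and plain String keys stand in for KVs (real KVs would need the KeyValue comparator). It assumes each chunk is non-empty and already sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkMergeSketch {

    /** Hypothetical stand-in for a block of KVs: a non-empty, sorted run of keys. */
    static final class Chunk {
        final List<String> keys;

        Chunk(String... sortedKeys) {
            if (sortedKeys.length == 0) {
                throw new IllegalArgumentException("chunk must not be empty");
            }
            this.keys = Arrays.asList(sortedKeys);
        }

        String first() { return keys.get(0); }
        String last() { return keys.get(keys.size() - 1); }
    }

    /** Counts key-by-key comparisons made on the slow path, for illustration only. */
    static int keyComparisons = 0;

    /** Merges two sorted chunks into one sorted list. */
    static List<String> merge(Chunk a, Chunk b) {
        List<String> out = new ArrayList<>(a.keys.size() + b.keys.size());

        // Fast path: the chunks do not overlap, so one comparison of the
        // boundary keys is enough to order the whole chunks.
        if (a.last().compareTo(b.first()) < 0) {
            out.addAll(a.keys);
            out.addAll(b.keys);
            return out;
        }
        if (b.last().compareTo(a.first()) < 0) {
            out.addAll(b.keys);
            out.addAll(a.keys);
            return out;
        }

        // Slow path: the chunks overlap, so merge key by key as is done today.
        int i = 0;
        int j = 0;
        while (i < a.keys.size() && j < b.keys.size()) {
            keyComparisons++;
            if (a.keys.get(i).compareTo(b.keys.get(j)) <= 0) {
                out.add(a.keys.get(i++));
            } else {
                out.add(b.keys.get(j++));
            }
        }
        while (i < a.keys.size()) {
            out.add(a.keys.get(i++));
        }
        while (j < b.keys.size()) {
            out.add(b.keys.get(j++));
        }
        return out;
    }

    public static void main(String[] args) {
        // Non-overlapping chunks: ordered by comparing boundary keys only.
        System.out.println(merge(new Chunk("row1", "row2"), new Chunk("row3", "row4")));
        System.out.println("key comparisons so far: " + keyComparisons);

        // Overlapping chunks: falls back to the key-by-key merge.
        System.out.println(merge(new Chunk("row1", "row3"), new Chunk("row2", "row4")));
        System.out.println("key comparisons so far: " + keyComparisons);
    }
}
```

The sketch only covers two chunks. In a scanner heap the same boundary check would presumably be applied between the current top chunk and the next one, and a chunk whose boundary key equals its neighbour's is treated as overlapping here to stay on the safe side.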