[
https://issues.apache.org/jira/browse/HBASE-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13119064#comment-13119064
]
[email protected] commented on HBASE-4496:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2136/#review2256
-----------------------------------------------------------
Ship it!
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
<https://reviews.apache.org/r/2136/#comment5234>
an is not needed
- Ted
On 2011-09-30 20:41:01, Mikhail Bautin wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/2136/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-09-30 20:41:01)
bq.
bq.
bq. Review request for hbase, Jonathan Gray and Lars Hofhansl.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. This fixes a couple of long-existing code issues in HFile v2:
bq. - Making seekBefore cache the previous block it has to read when the scanner happens to be at the first key of a block (this was a performance regression introduced in HFile v2).
bq. - Fixing the accounting of the number of blocks read for the one-level index case in HFileBlockIndex.seekToDataBlock if the current block is the same as the requested block.
bq. - Getting rid of HFileBlock.BasicReader, which was used both by FSReaderV2 and HFileReaderV2, but the former did not cache blocks (a source of confusion).
bq. - Adding a new interface HFile.CachingBlockReader instead, which is implemented by HFile readers and passed to HFileBlockIndex.
bq.
bq.
bq. This addresses bug HBASE-4496.
bq. https://issues.apache.org/jira/browse/HBASE-4496
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java 4dc1367
bq. src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java 5e98375
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java b429819
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java 953896e
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java 13d5e70
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java 1cf7767
bq. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java eec566e
bq.
bq. Diff: https://reviews.apache.org/r/2136/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. This is in production in Facebook's hbase-89 branch.
bq.
bq. Still testing this open-source patch -- please don't commit yet.
bq.
bq.
bq. Thanks,
bq.
bq. Mikhail
bq.
bq.
> HFile V2 does not honor setCacheBlocks when scanning.
> -----------------------------------------------------
>
> Key: HBASE-4496
> URL: https://issues.apache.org/jira/browse/HBASE-4496
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.92.0, 0.94.0
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Fix For: 0.92.0, 0.94.0
>
> Attachments: 4496.txt
>
>
> While testing the LRU cache during scanning, I noticed considerable churn in
> the cache even when Scan.cacheBlocks is set to false. After debugging this, I
> found that HFile V2 always caches blocks in the LRU cache, regardless of the
> cacheBlocks setting.
> Here's a trace (from Eclipse) showing the problem:
> HFileReaderV2.readBlock(long, int, boolean, boolean, boolean) line: 279
> HFileReaderV2.readBlockData(long, long, int, boolean) line: 219
> HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, HFileBlock) line: 191
> HFileReaderV2$ScannerV2.seekTo(byte[], int, int, boolean) line: 502
> HFileReaderV2$ScannerV2.reseekTo(byte[], int, int) line: 539
> StoreFileScanner.reseekAtOrAfter(HFileScanner, KeyValue) line: 151
> StoreFileScanner.reseek(KeyValue) line: 110
> KeyValueHeap.reseek(KeyValue) line: 255
> StoreScanner.reseek(KeyValue) line: 409
> StoreScanner.next(List<KeyValue>, int) line: 304
> KeyValueHeap.next(List<KeyValue>, int) line: 114
> KeyValueHeap.next(List<KeyValue>) line: 143
> HRegion$RegionScannerImpl.nextRow(byte[]) line: 2774
> HRegion$RegionScannerImpl.nextInternal(int) line: 2722
> HRegion$RegionScannerImpl.next(List<KeyValue>, int) line: 2682
> HRegion$RegionScannerImpl.next(List<KeyValue>) line: 2699
> HRegionServer.next(long, int) line: 2092
> Every scanner.next causes a reseek, which eventually causes a call to
> HFileBlockIndex$BlockIndexReader.seekToDataBlock(...) at which point the
> cacheBlocks information is lost. HFileReaderV2.readBlockData calls
> HFileReaderV2.readBlock with cacheBlocks set unconditionally to true.
> The fix is not immediately clear, unless we want to pass cacheBlocks to
> HFileBlockIndex$BlockIndexReader.seekToDataBlock and then on to
> HFileBlock.BasicReader.readBlockData and all its implementers, which is ugly
> as readBlockData should not care about caching.
> Avoiding caching during scans is somewhat important for us.
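
The description above comes down to a caching flag that is dropped on the way from the scanner through the block index to the reader. As a rough illustration only, the sketch below shows the general shape of forwarding that flag through an index lookup to a reader that makes the caching decision. Every name in it (CacheBlocksSketch, ToyReader, ToyBlockIndex, and the simplified readBlock and seekToDataBlock signatures) is hypothetical and much simpler than the real HBase classes; it is not the code from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheBlocksSketch {

    /** Hypothetical, simplified stand-in for a reader that owns the caching decision. */
    interface CachingBlockReader {
        byte[] readBlock(long offset, boolean cacheBlock);
    }

    /** Toy reader: serves from its cache if possible, and caches only when asked to. */
    static class ToyReader implements CachingBlockReader {
        final Map<Long, byte[]> cache = new HashMap<>();
        int diskReads = 0;

        @Override
        public byte[] readBlock(long offset, boolean cacheBlock) {
            byte[] cached = cache.get(offset);
            if (cached != null) {
                return cached;
            }
            diskReads++;                              // simulate a read from disk
            byte[] block = new byte[] {(byte) offset};
            if (cacheBlock) {                         // honor the caller's setting
                cache.put(offset, block);
            }
            return block;
        }
    }

    /** Toy one-level index: finds the block for a key and forwards the caching flag. */
    static class ToyBlockIndex {
        final long[] firstKeys;   // first key of each block, sorted ascending
        final long[] offsets;     // file offset of each block

        ToyBlockIndex(long[] firstKeys, long[] offsets) {
            this.firstKeys = firstKeys;
            this.offsets = offsets;
        }

        byte[] seekToDataBlock(long key, CachingBlockReader reader, boolean cacheBlocks) {
            int i = 0;
            while (i + 1 < firstKeys.length && firstKeys[i + 1] <= key) {
                i++;
            }
            // The flag is passed through instead of being hard-coded to true.
            return reader.readBlock(offsets[i], cacheBlocks);
        }
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader();
        ToyBlockIndex index = new ToyBlockIndex(new long[] {0, 10}, new long[] {0L, 100L});

        index.seekToDataBlock(12, reader, false);     // scan with caching disabled
        System.out.println("cached blocks after scan: " + reader.cache.size());

        index.seekToDataBlock(12, reader, true);      // lookup with caching enabled
        System.out.println("cached blocks after lookup: " + reader.cache.size());
    }
}
```

In this toy version, a lookup made with the flag set to false leaves the cache untouched, which is the behavior the report asks for when Scan.cacheBlocks is false.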