[
https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Dimiduk updated HBASE-11331:
---------------------------------
Attachment: HBASE-11331.02.patch
Thanks [~stack]. v02 addresses the exception you see here. It also:
- makes the feature configurable. Enable it with
{{hbase.block.data.cachecompressed=true}} (see the snippet after this list)
- only retains data blocks in their "packed" format in memory. Everything else
*should* be cached decompressed.
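For reference, a minimal {{hbase-site.xml}} snippet for turning the feature on (a sketch; assuming the property defaults to off):
{code:xml}
<!-- Cache data blocks in their compressed ("packed") form. -->
<property>
  <name>hbase.block.data.cachecompressed</name>
  <value>true</value>
</property>
{code}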
One observation: when running the RS against a compression-enabled table, I see
a lot of log messages saying "got brand-new decompressor" but none saying "got
recycled decompressor" (I've enabled debug logging for
org.apache.hadoop.io.compress). I think we're not returning the objects to the
pool. Running this small PE workload produces ~1600 such log lines; running the
same workload with hbase.block.data.cachecompressed=true produces ~10k, so the
missing returns will have a significant impact on the performance of this
feature. Will investigate further.
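For illustration, a minimal sketch of the borrow/return pattern I suspect we're missing. {{CodecPool}} only logs "Got recycled decompressor" when the object actually comes back to the pool; the {{PooledDecompression}} class and its {{decompress}} helper below are hypothetical, not from the patch:
{code:java}
import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Decompressor;

public class PooledDecompression {
  /** Hypothetical helper: borrow a decompressor, use it, then return it. */
  static byte[] decompress(CompressionCodec codec, InputStream compressed,
      int uncompressedLen) throws IOException {
    Decompressor decompressor = CodecPool.getDecompressor(codec);
    try (InputStream in = codec.createInputStream(compressed, decompressor)) {
      byte[] out = new byte[uncompressedLen];
      IOUtils.readFully(in, out, 0, uncompressedLen);
      return out;
    } finally {
      // Without this, CodecPool never recycles and every call pays for a
      // brand-new decompressor -- the symptom in the logs above.
      CodecPool.returnDecompressor(decompressor);
    }
  }
}
{code}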
> [blockcache] lazy block decompression
> -------------------------------------
>
> Key: HBASE-11331
> URL: https://issues.apache.org/jira/browse/HBASE-11331
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Nick Dimiduk
> Assignee: Nick Dimiduk
> Attachments: HBASE-11331.00.patch, HBASE-11331.01.patch,
> HBASE-11331.02.patch, HBASE-11331LazyBlockDecompressperfcompare.pdf
>
>
> Maintaining data in its compressed form in the block cache will greatly
> increase our effective blockcache size and should show a meaningful
> improvement in cache hit rates in well-designed applications. The idea here
> is to lazily decompress/decrypt blocks when they're consumed, rather than as
> soon as they're pulled off of disk.
> This is related to but less invasive than HBASE-8894.
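To make the quoted idea concrete, a minimal sketch of caching packed bytes and unpacking only on read (the {{LazyBlockCache}} class and its {{Codec}} interface are hypothetical, not HBase API):
{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch: store blocks compressed, decompress on consumption. */
public class LazyBlockCache {
  interface Codec { byte[] decompress(byte[] packed); }

  private final ConcurrentMap<String, byte[]> packedBlocks = new ConcurrentHashMap<>();
  private final Codec codec;

  public LazyBlockCache(Codec codec) { this.codec = codec; }

  /** Cache the on-disk bytes as-is; no decompression at cache time. */
  public void cacheBlock(String key, byte[] packed) {
    packedBlocks.put(key, packed);
  }

  /** Decompress lazily, only when a reader actually consumes the block. */
  public byte[] getBlock(String key) {
    byte[] packed = packedBlocks.get(key);
    return packed == null ? null : codec.decompress(packed);
  }
}
{code}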