[
https://issues.apache.org/jira/browse/HBASE-9131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730883#comment-13730883
]
Jonathan Hsieh commented on HBASE-9131:
---------------------------------------
[~zjushch] Thanks. I think we are somewhere between too little detail and too
much detail.
First, can we add the config variables to hbase-default.xml (with full
descriptions and with units)?
Now to the meat:
The patch doesn't tell the admin why or when they'd want to consider using
this. The link/pdf makes the reader hunt for the bucket cache sections on the
2nd page and then goes into too much design detail for an average admin.
(It also lacks the config variables / instructions.)
My suggestion: let's take the high-level parts from section 3 of the pdf,
polish them, and add them to the official docs.
Here's a stab at the sections that I think would be good for the ref guide with
the prose improved a little bit:
{quote}
*Design and Motivation*
The Bucket Cache is an alternate block cache implementation that is designed to
take advantage of large amounts of memory or low-latency storage. (something
about how big would be useful). It is implemented off the JVM heap, which has
the secondary benefit of reducing the JVM heap fragmentation that eventually
causes stop-the-world garbage collection. If one were to rely upon the
standard JVM memory allocation and GC policies with large heaps (>16GB RAM),
one would periodically incur instability in HBase due to long stop-the-world
GC pauses (tens of seconds to minutes) that can be misinterpreted as region
server failures.
The storage of cached blocks is not constrained to RAM-only use; one could
cache blocks in memory and also use a high-speed disk, such as SSDs,
Fusion-IO devices, or ram-disks, as a massive secondary cache. (probably need
something about the persistence properties not being required, but having the
massive capacity as a huge benefit.)
Internally, the bucket cache divides storage into many *buckets*, each of which
contains blocks of a particular range of sizes. (this is a little fuzzy, needs
some clarification). Insertions and evictions of blocks backed by physical
storage just overwrite blocks on the device or read data from the storage
device. Managing these larger blocks prevents the external fragmentation that
causes GC pauses, at the cost of some minor wasted space (internal
fragmentation).
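As a rough illustration of the bucket idea (a sketch only, not the HBASE-7404
code; the size classes, the Bucket class, and the function names are all made
up for this example), each block goes into the smallest slot size that fits,
and the unused bytes in that slot are the internal fragmentation:

```python
from bisect import bisect_left

# Hypothetical slot sizes in bytes; the real bucket sizes are an
# implementation detail and are not taken from the patch.
SIZE_CLASSES = [4 * 1024, 8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024]


def pick_size_class(block_len):
    """Return the smallest slot size that can hold a block of block_len bytes."""
    i = bisect_left(SIZE_CLASSES, block_len)
    if i == len(SIZE_CLASSES):
        raise ValueError("block is larger than the largest size class")
    return SIZE_CLASSES[i]


def wasted_bytes(block_len):
    """Internal fragmentation: unused bytes in the slot chosen for the block."""
    return pick_size_class(block_len) - block_len


class Bucket:
    """A region of storage split into equal-sized slots (illustrative only)."""

    def __init__(self, slot_size, num_slots):
        self.slot_size = slot_size
        self.free_slots = list(range(num_slots))
        self.used = {}  # block key -> slot index

    def put(self, key, block_len):
        """Place a block in a free slot and return its byte offset."""
        if block_len > self.slot_size:
            raise ValueError("block does not fit in this bucket's slots")
        if not self.free_slots:
            raise RuntimeError("bucket is full; evict a block first")
        slot = self.free_slots.pop()
        self.used[key] = slot
        return slot * self.slot_size

    def evict(self, key):
        """Free the slot; the next put simply overwrites it in place."""
        self.free_slots.append(self.used.pop(key))
```

With these made-up sizes, a 5000-byte block lands in an 8 KB slot and wastes
3192 bytes, but an evicted slot is reused at the same offset, so the storage
never fragments externally.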
*Configuration and Usage*
To configure the bucket cache... (something along the lines of what the current
patch has)....
{quote}
Let me know what you think, and feel free to update/correct the draft.
> Add admin-level documention about configuration and usage of the Bucket Cache
> -----------------------------------------------------------------------------
>
> Key: HBASE-9131
> URL: https://issues.apache.org/jira/browse/HBASE-9131
> Project: HBase
> Issue Type: Bug
> Reporter: Jonathan Hsieh
> Attachments: hbase-9131.patch
>
>
> HBASE-7404 added the bucket cache but its configuration settings are
> currently undocumented. Without documentation developers would be the only
> ones aware of the feature.
> Specifically documentation about slide 23 from
> http://www.slideshare.net/cloudera/operations-session-4 would be great to add!