[jira] [Commented] (HBASE-11339) HBase MOB

Jonathan Hsieh (JIRA) Tue, 19 Aug 2014 07:02:02 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102218#comment-14102218
 ]


Jonathan Hsieh commented on HBASE-11339:
----------------------------------------

[~jiajia], thanks for the update to the user guide.  I think it has the key 
details (the whats) needed for a user who already understands what a MOB is 
and what it is for.  We should add some context for users who aren't familiar 
with it (the whys and the bigger picture) by adding some background to this 
user doc. We'll eventually fold it into the ref guide here[1].

Let me provide a quick draft that we could build off of.

Before the bullets we should have some info (this is a paraphrased version of 
the design doc's intro).

{quote}
Data comes in many sizes, and it is convenient to save binary data such as 
images and documents in HBase. While HBase can handle binary objects with 
cells that are 1 byte to 10MB long, HBase's normal read and write paths are 
optimized for values smaller than 100KB in size.  When HBase deals with large 
numbers of values > 100KB and up to ~10MB in size, it encounters performance 
degradation due to write amplification caused by splits and compactions.  
HBase 2.0+ has added support for better managing large numbers of *Medium 
Objects* (MOBs) that maintains the same high-performance, strongly 
consistent characteristics with low operational overhead.
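
To make the size ranges above concrete, here is a minimal sketch of the cutoff 
logic. It is not HBase code: the class, enum, and constant names are 
hypothetical, and only the 100KB and 10MB figures come from the paragraph 
above. It also assumes a value exactly at the threshold stays on the normal 
path.

```java
// Illustrative only; not part of HBase. All names here are hypothetical.
final class MobSizeSketch {
    enum Path { NORMAL, MOB, BEYOND_MOB_RANGE }

    // Figures taken from the draft text above.
    static final long MOB_THRESHOLD_BYTES = 100L * 1024;        // 100KB
    static final long MAX_VALUE_BYTES = 10L * 1024 * 1024;      // ~10MB

    // Classify a value by its length in bytes.
    static Path classify(long valueLengthBytes) {
        if (valueLengthBytes < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        if (valueLengthBytes > MAX_VALUE_BYTES) {
            return Path.BEYOND_MOB_RANGE;
        }
        // Assumption: strictly greater than the threshold means MOB.
        return valueLengthBytes > MOB_THRESHOLD_BYTES ? Path.MOB : Path.NORMAL;
    }

    public static void main(String[] args) {
        long[] samples = {1L, 50L * 1024, 512L * 1024, 20L * 1024 * 1024};
        for (long s : samples) {
            System.out.println(s + " bytes -> " + classify(s));
        }
    }
}
```

In the real feature the threshold is a per-column-family setting (user doc 
bullet 6), so the constant here only stands in for that value.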

To enable the feature, one must enable and configure the mob components in 
each region server and enable the mob feature on particular column families 
during table creation or table alter.  Also, in the preview version of this 
feature, the admin must set up periodic processes that re-optimize the layout 
of mob data.

Section: Enabling and Configuring the mob feature on region servers.

Need to enable feature in flushes and compactions.  Tuning settings on caches.

user doc bullet 1. edit hbase-site...
user doc bullet 7. mob cache

Would be nice to have examples of doing this from the shell -- an example of 
creating a table with mob on a cf, and an example of a table alter that changes 
a cf to use the mob path.

Section: Mob management

The mob feature introduces a new read and write path to hbase and in its 
current incarnation requires external tools for housekeeping and 
reoptimization.  There are two tools introduced -- the expiredMobFileCleaner 
for handling TTLs and time-based expiry of data, and the sweep tool for 
coalescing small mob files or mob files with many deletions or updates.
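
As a rough illustration of what "small mob files or mob files with many 
deletions or updates" could mean as a selection rule, here is a minimal 
sketch. It is not the sweep tool's implementation: the class, the fields, and 
both cutoffs (a size limit and an invalid-cell ratio) are hypothetical and 
chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; not the actual sweep tool. All names are hypothetical.
final class SweepCandidateSketch {
    // Hypothetical summary of one mob file.
    static final class MobFileInfo {
        final String name;
        final long sizeBytes;
        final long totalCells;
        final long invalidCells; // cells that were deleted or overwritten

        MobFileInfo(String name, long sizeBytes, long totalCells, long invalidCells) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.totalCells = totalCells;
            this.invalidCells = invalidCells;
        }
    }

    // Pick files that are small, or whose share of invalid cells is high.
    static List<String> selectForSweep(List<MobFileInfo> files,
                                       long smallFileBytes,
                                       double maxInvalidRatio) {
        List<String> selected = new ArrayList<>();
        for (MobFileInfo f : files) {
            boolean small = f.sizeBytes < smallFileBytes;
            double invalidRatio =
                f.totalCells == 0 ? 0.0 : (double) f.invalidCells / f.totalCells;
            if (small || invalidRatio > maxInvalidRatio) {
                selected.add(f.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<MobFileInfo> files = new ArrayList<>();
        files.add(new MobFileInfo("small-file", 10, 100, 0));
        files.add(new MobFileInfo("mostly-deleted", 1000, 100, 80));
        files.add(new MobFileInfo("healthy", 1000, 100, 10));
        System.out.println(selectForSweep(files, 64, 0.5));
    }
}
```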

user doc bullet 8.

Section: Enabling the mob feature on user tables

This can be done when creating a table or when altering a table.

user doc bullet 2 (set cf with mob)
user doc bullet 6 (threshold size)

To a client, mob cells act just like normal cells.

user doc bullet 3 put
user doc bullet 4 scan

There is a special scanner mode users can use to read the raw values.

user doc bullet 5.

{quote}

[1] http://hbase.apache.org/book.html

> HBase MOB
> ---------
>
>                 Key: HBASE-11339
>                 URL: https://issues.apache.org/jira/browse/HBASE-11339
>             Project: HBase
>          Issue Type: Umbrella
>          Components: regionserver, Scanners
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
> MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide .docx, 
> hbase-11339-in-dev.patch
>
>
>   It's quite useful to save the medium binary data like images, documents 
> into Apache HBase. Unfortunately directly saving the binary MOB(medium 
> object) to HBase leads to a worse performance since the frequent split and 
> compaction.
>   In this design, the MOB data are stored in an more efficient way, which 
> keeps a high write/read performance and guarantees the data consistency in 
> Apache HBase.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
