[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120094#comment-14120094 ]

Jonathan Hsieh commented on HBASE-11339:
----------------------------------------

Re: [~lhofhansl]

bq. To be fair, my comment itself addressed that by saying small blobs are 
stored by value in HBase, and only large blobs in HDFS. We can store a lot of 
10MB blobs (in the worst case scenario it's 200m x 10MB = 2PB) in HDFS, and if 
that's not enough, we can dial up the threshold.
bq. It seems nobody understood what I am suggesting. Depending on use case and 
data distribution you pick a threshold X. Blobs with a size of < X are stored 
directly in HBase as a column value. Blobs >= X are stored in HDFS with a 
reference in HBase using the 3-phase approach.

The MOB solution we're espousing does not preclude the hybrid HDFS+HBase 
approach - that could still be used for objects that approach or exceed the 
HDFS block size.  Our claim is that the MOB approach is complementary to a 
proper streaming-API-based HDFS+HBase mechanism for large objects.

Operationally, the MOB design is similar -- depending on use case and data 
distribution you pick a threshold X for each column family.  Blobs with a size 
of < X are stored directly in HBase as a column value.  Blobs >= X are stored 
in the MOB area with a reference in HBase using the on-flush/on-compaction 
approach.  If the blob is larger than the ~10MB default [1], it is rejected. 
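
As a rough sketch of what that per-family knob could look like (the attribute 
and setter names here are assumptions following the design doc, not a 
finalized API), creating a MOB-enabled family might be:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateMobTable {
  public static void main(String[] args) throws IOException {
    try (Connection conn =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      HColumnDescriptor cf = new HColumnDescriptor("f");
      // Hypothetical per-family MOB attributes: values >= X (here 100KB) are
      // redirected to the MOB area on flush/compaction, while smaller values
      // stay inline in the normal region store files.
      cf.setMobEnabled(true);
      cf.setMobThreshold(100 * 1024L);
      HTableDescriptor table = new HTableDescriptor(TableName.valueOf("blobs"));
      table.addFamily(cf);
      admin.createTable(table);
    }
  }
}
{code}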

With the MOB design, if the threshold X performs poorly, you can alter the 
table's X value and the next major compaction will shift values between the 
MOB area and the normal HBase regions, as sketched below.  With the HDFS+HBase 
approach, would we need a new mechanism to shift data between HDFS and HBase? 
Is there a simple tuning/migration story?
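
For illustration, re-tuning X under the MOB scheme would just be an online 
schema change plus a major compaction; a minimal sketch, again assuming the 
hypothetical per-family MOB setters above:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.util.Bytes;

public class RetuneMobThreshold {
  // Raise (or lower) the MOB threshold X for family "f" of table "blobs".
  static void retune(Admin admin, long newThresholdBytes) throws IOException {
    TableName table = TableName.valueOf("blobs");
    // Fetch the existing descriptor so other family settings are preserved.
    HColumnDescriptor cf =
        admin.getTableDescriptor(table).getFamily(Bytes.toBytes("f"));
    cf.setMobThreshold(newThresholdBytes); // hypothetical setter, per design doc
    admin.modifyColumn(table, cf);         // online alter of the family
    // The next major compaction rewrites cells, shifting values between the
    // MOB area and the normal regions according to the new threshold.
    admin.majorCompact(table);
  }
}
{code}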

bq. True, but as I state the "store small blobs by value and only large ones by 
reference" solution is not mentioned in there.

bq. No it's not... It says either all blobs go into HBase or all blobs go into 
HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS. 
That's key, nobody wants to store 100k or 1mb files directly in HDFS.

I'm confused.  In the section 4.1.2 part, this split was assumed and the 
different mechanisms were for handling the "large ones".  The discussions 
earlier in the jira explicitly added threshold sizes to separate when the 
value or the reference implementation is used.

For people that want to put a lot of 100k or 1MB objects in HBase there are 
many problems that arise, and this MOB feature is an approach to make this 
valid (according to the defaults) workload work better and more predictably.  
The MOB design says store small blobs by value, moderate blobs by reference 
(with the data in the MOB area), and maintains that HBase is not for large 
objects [1]. 

bq. Yet, all that is possible to do with a client only solution and could be 
abstracted there.
bq. I'll also admit that our blob storage tool is not finished, yet, and that 
for its use case we don't need replication or backup as it itself will be the 
backup solution for another very large data store.
bq. Are you guys absolutely... 100%... positive that this cannot be done in any 
other way and has to be done this way? That we cannot store files up to a 
certain size as values in HBase and larger files in HDFS? And there is not good 
threshold value for this?

I don't think that asking "is this the only way something could be done?" is 
the right question.  There are always many ways to get a piece of 
functionality -- we've presented a few other potential solutions, and have 
chosen and are justifying a design considering many of the tradeoffs.  We've 
presented a need, a design, an early implementation, and evidence of a 
deployment and other potential use cases.

The hybrid HDFS+HBase approach is one of the alternatives.  I believe we agree 
that there will be some complexity introduced with that approach in dealing 
with atomicity, bulk load, security, backup, replication and potentially 
tuning.  We have enough detail from the discussion to handle atomicity; there 
are open questions with the others.  It is hard to claim a feature is 
production-ready if we don't have a relatively simple mechanism for backups 
and disaster recovery.  In some future, when the hybrid HDFS+HBase system gets 
open sourced along with tooling that internalizes the operational 
complexities, I think it would be a fine addition to HBase. 

Rough thresholds would be: 0-100KB in HBase by value, 100KB-10MB in HBase via 
MOB, and 10MB+ in HBase by reference to HDFS.
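
To make the tiers concrete, here is a client-side sketch of the write path 
implied by those thresholds.  The 10MB+ tier and its HDFS path layout are 
hypothetical; the sub-10MB tiers need no special client handling since the MOB 
threshold X is applied server-side per column family.

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class BlobWriter {
  private static final long MAX_CELL_SIZE = 10L * 1024 * 1024; // ~10MB cap [1]

  static void writeBlob(Connection conn, FileSystem fs, byte[] row, byte[] blob)
      throws IOException {
    try (Table table = conn.getTable(TableName.valueOf("blobs"))) {
      Put put = new Put(row);
      if (blob.length < MAX_CELL_SIZE) {
        // 0-100KB: kept inline; 100KB-10MB: moved to the MOB area on
        // flush/compaction, depending on the family's threshold X.
        put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("data"), blob);
      } else {
        // 10MB+: write the object to HDFS and keep only a reference in HBase.
        Path p = new Path("/blobs/" + Bytes.toStringBinary(row)); // hypothetical
        try (FSDataOutputStream out = fs.create(p)) {
          out.write(blob);
        }
        put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("ref"),
            Bytes.toBytes(p.toString()));
      }
      table.put(put);
    }
  }
}
{code}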

[1] Today the default max Cell size is ~10MB. 
https://github.com/apache/hbase/blob/master/hbase-common/src/main/resources/hbase-default.xml#L530


> HBase MOB
> ---------
>
>                 Key: HBASE-11339
>                 URL: https://issues.apache.org/jira/browse/HBASE-11339
>             Project: HBase
>          Issue Type: Umbrella
>          Components: regionserver, Scanners
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase 
> MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user 
> guide_v2.docx, hbase-11339-in-dev.patch
>
>
>   It's quite useful to save medium binary data like images and documents 
> into Apache HBase. Unfortunately, directly saving the binary MOB (medium 
> object) to HBase leads to worse performance because of the frequent splits 
> and compactions.
>   In this design, the MOB data is stored in a more efficient way, which 
> keeps high write/read performance and guarantees data consistency in 
> Apache HBase.


