[
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120094#comment-14120094
]
Jonathan Hsieh commented on HBASE-11339:
----------------------------------------
Re: [~lhofhansl]
bq. To be fair, my comment itself addressed that by saying small blobs are
stored by value in HBase, and only large blobs in HDFS. We can store a lot of
10MB blobs (in the worst case scenario it's 200m x 10mb = 2pb) in HDFS, and if
that's not enough, we can dial up the threshold.
bq. It seems nobody understood what I am suggesting. Depending on use case and
data distribution you pick a threshold X. Blobs with a size of < X are stored
directly in HBase as a column value. Blobs >= X are stored in HDFS with a
reference in HBase using the 3-phase approach.
The MOB solution we're espousing does not preclude the hybrid hdfs+hbase
approach - that could still be used for objects that approach or exceed the
hdfs block size. Our claim is that the mob approach is complementary to a
proper streaming-api-based hdfs+hbase mechanism for large objects.
Operationally, the MOB design is similar -- depending on use case and data
distribution you pick a threshold X for each column family. Blobs with a size
of < X are stored directly in HBase as a column value. Blobs >= X are stored
in the MOB area with a reference in HBase using the on-flush/on-compaction
approach. If the blob is larger than the ~10MB default [1], it is rejected.
With the MOB design, if the threshold X performs poorly, you can alter the
table to change X and the next major compaction will shift values between the
MOB area and the normal hbase regions. With the HDFS+HBase approach, would we
need a new mechanism to shift data between hdfs and hbase? Is there a simple
tuning/migration story?
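To make the MOB side of that tuning story concrete, here is a rough sketch of
what retuning X for a column family could look like with the Java admin API.
This assumes the IS_MOB / MOB_THRESHOLD column family attributes from the
current design doc; the attribute names, defaults and client API surface are
not final, and the table/family names below are made up.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class MobThresholdTuning {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      TableName photos = TableName.valueOf("photos");

      // Start from the existing family descriptor so other settings are kept.
      HTableDescriptor desc = admin.getTableDescriptor(photos);
      HColumnDescriptor img = desc.getFamily(Bytes.toBytes("img"));

      // Retune threshold X on the 'img' family: cells < X stay inline as
      // ordinary column values, cells >= X go to the MOB area on
      // flush/compaction. Attribute names are the ones proposed in the
      // design doc, not a finalized API.
      img.setValue("IS_MOB", "true");
      img.setValue("MOB_THRESHOLD", String.valueOf(200 * 1024)); // X = 200k

      admin.modifyColumn(photos, img);

      // The next major compaction shifts existing values between the MOB
      // area and the normal regions according to the new threshold.
      admin.majorCompact(photos);
    }
  }
}
{code}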
bq. True, but as I state the "store small blobs by value and only large ones by
reference" solution is not mentioned in there.
bq. No, it's not... It says either all blobs go into HBase or all blobs go into
HDFS... See above. Small blobs would be stored directly in HBase, not in HDFS.
That's key; nobody wants to store 100k or 1mb files directly in HDFS.
I'm confused. In section 4.1.2 this split was assumed, and the different
mechanisms were for handling the "large ones". The discussions earlier in the
jira explicitly added threshold sizes to separate them, determining when the
value or reference implementations are used.
For people who want to put a lot of 100k or 1mb objects in hbase there are
many problems that arise, and this mob feature is an approach to make this
valid (according to the defaults) workload work better and more predictably.
The mob design says store small blobs by value, moderate blobs by reference
(with data in the mob area), and maintains that hbase is not for large objects
[1].
bq. Yet, all that is possible to do with a client only solution and could be
abstracted there.
bq. I'll also admit that our blob storage tool is not finished, yet, and that
for its use case we don't need replication or backup as it itself will be the
backup solution for another very large data store.
bq. Are you guys absolutely... 100%... positive that this cannot be done in any
other way and has to be done this way? That we cannot store files up to a
certain size as values in HBase and larger files in HDFS? And there is not good
threshold value for this?
I don't think "is this the only way it could be done" is the right thing to
ask. There are always many ways to get a piece of functionality -- we've
presented a few other potential solutions, and have chosen and are justifying a
design considering many of the tradeoffs. This proposal presented a need, a
design, an early implementation, and evidence of a deployment and other
potential use cases.
The hybrid hdfs+hbase approach is one of the alternatives. I believe we agree
that there will be some complexity introduced with that approach around
atomicity, bulk load, security, backup, replication and potentially tuning. We
have enough detail from the discussion to handle atomicity; there are open
questions with the others. It is hard to claim a feature is production-ready
if we don't have a relatively simple mechanism for backups and disaster
recovery. In some future, when the hybrid hdfs+hbase system gets open sourced
along with tools that internalize the operational complexities, I think it
would be a fine addition to hbase.
Rough thresholds would be 0-100k hbase by value, 100k-10MB hbase by mob, 10MB+
hbase by ref to hdfs.
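Purely to illustrate that tiering (it is not part of the MOB design or any
existing API), a hypothetical client-side routing helper might look like the
sketch below; the column families, cutoffs and the HDFS-reference scheme are
all invented for the example.
{code:java}
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BlobTiering {
  static final long VALUE_MAX = 100 * 1024;         // < 100k: plain column value
  static final long MOB_MAX   = 10L * 1024 * 1024;  // < ~10MB: MOB-enabled family

  static final byte[] VAL_CF = Bytes.toBytes("v");  // ordinary family
  static final byte[] MOB_CF = Bytes.toBytes("m");  // family with MOB enabled
  static final byte[] REF_CF = Bytes.toBytes("r");  // holds an HDFS path reference
  static final byte[] QUAL   = Bytes.toBytes("blob");

  /**
   * Build the Put for a blob. Blobs >= ~10MB are assumed to have been written
   * to HDFS elsewhere, and only their path is stored here (the hybrid
   * hdfs+hbase approach).
   */
  static Put routeBlob(byte[] rowKey, byte[] blob, String hdfsPathIfLarge) {
    Put put = new Put(rowKey);
    if (blob.length < VALUE_MAX) {
      put.addColumn(VAL_CF, QUAL, blob);
    } else if (blob.length < MOB_MAX) {
      // A normal put; the server-side MOB machinery moves the value to the
      // MOB area on flush/compaction.
      put.addColumn(MOB_CF, QUAL, blob);
    } else {
      put.addColumn(REF_CF, QUAL, Bytes.toBytes(hdfsPathIfLarge));
    }
    return put;
  }
}
{code}
With MOB, the middle branch is just an ordinary write from the client's point
of view, which is the point of the feature.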
[1] Today the default Cell size max is ~10MB.
https://github.com/apache/hbase/blob/master/hbase-common/src/main/resources/hbase-default.xml#L530
> HBase MOB
> ---------
>
> Key: HBASE-11339
> URL: https://issues.apache.org/jira/browse/HBASE-11339
> Project: HBase
> Issue Type: Umbrella
> Components: regionserver, Scanners
> Reporter: Jingcheng Du
> Assignee: Jingcheng Du
> Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase
> MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide.docx, MOB user
> guide_v2.docx, hbase-11339-in-dev.patch
>
>
> It's quite useful to save medium binary data like images and documents
> into Apache HBase. Unfortunately, directly saving binary MOB (medium
> object) data to HBase leads to worse performance because of the frequent
> splits and compactions.
> In this design, the MOB data are stored in a more efficient way, which
> keeps high write/read performance and guarantees data consistency in
> Apache HBase.