[
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032246#comment-14032246
]
Jingcheng Du commented on HBASE-11339:
--------------------------------------
Thanks [~jmhsieh] for the comments.
>Does the proposed design write out LOBs to both the HLog and then later LOB
>files?
Yes, the Lobs are written to both the HLog and the Lob files.
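To make the double write concrete, here is a toy model of that path. It is not HBase code: ToyRegion and its fields are hypothetical names, and it only counts bytes to show that each Lob is written once to the log before the ack and once more on flush.

```python
# Toy model only: these classes are hypothetical and are not HBase APIs.

class ToyRegion:
    def __init__(self):
        self.hlog = []          # stands in for the write-ahead log (HLog)
        self.memstore = {}      # in-memory buffer of unflushed cells
        self.lob_files = []     # each flush produces one Lob file
        self.bytes_written = 0  # total bytes persisted by this region

    def put(self, row, value):
        # First write: append to the log before acknowledging the client.
        self.hlog.append((row, value))
        self.bytes_written += len(value)
        self.memstore[row] = value
        return "ack"

    def flush(self):
        # Second write: persist the buffered cells as a Lob file.
        snapshot = dict(self.memstore)
        self.lob_files.append(snapshot)
        self.bytes_written += sum(len(v) for v in snapshot.values())
        self.memstore.clear()


region = ToyRegion()
region.put(b"row1", b"x" * 1024)
region.flush()

# Bytes persisted divided by bytes the client sent.
write_amplification = region.bytes_written / 1024
```

In this model the ratio comes out at two, which is the "at least twice" cost raised in the next question.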
>in the best case, the data is written at least twice – once before the ack is
>sent to the client and then again on flush. Can we limit this to once?
>We could avoid extra writes by just writing to a separate LOB log/file. Was
>this considered?
It was considered, but we did not find a good solution for it.
>Is there any consideration of locality and performance?
Locality is retained only after the Lobs are flushed from the MemStore. It is
not guaranteed once the Sweep Tool runs (Lob compaction) or regions move to
other regionservers.
The write/read performance of HBase is not expected to be impacted much. I
will provide the details as soon as the performance testing is done.
>5MB cells are large but aren't really that big. Maybe this should just be
>"blobs" (binary large objects) or "mobs" (medium objects)? the objects being
>immutable is important too
Actually, the Lobs can be mutable. Lobs that are no longer used will be
handled by the Sweep Tool.
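How the Sweep Tool decides that a Lob is unused is not described in this comment. As a rough illustration only, assuming an overwritten or deleted Lob simply loses its reference from the table, the cleanup could be sketched as below. The function name, the data layout, and the reference-based rule are all assumptions, not the design in the attached document.

```python
# Hypothetical sketch: the names and the reference-based rule are assumptions.

def sweep(lob_files, live_refs):
    """Keep only the Lobs that a live cell still references.

    lob_files: dict mapping file name -> dict of lob_id -> bytes
    live_refs: set of lob_ids still referenced from the table
    Returns (kept_files, dropped_lob_ids).
    """
    kept = {}
    dropped = []
    for name, lobs in lob_files.items():
        survivors = {k: v for k, v in lobs.items() if k in live_refs}
        dropped.extend(k for k in lobs if k not in live_refs)
        if survivors:
            # A file with no surviving Lobs is omitted entirely.
            kept[name] = survivors
    return kept, dropped


files = {
    "lobfile-1": {"a": b"old", "b": b"img"},
    "lobfile-2": {"c": b"stale"},
}
kept, dropped = sweep(files, live_refs={"b"})
```

Rewriting files this way is also why locality is not guaranteed after a sweep: the surviving Lobs may end up written somewhere other than where they were flushed.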
> HBase LOB
> ---------
>
> Key: HBASE-11339
> URL: https://issues.apache.org/jira/browse/HBASE-11339
> Project: HBase
> Issue Type: New Feature
> Components: regionserver, Scanners
> Reporter: Jingcheng Du
> Assignee: Jingcheng Du
> Attachments: HBase LOB Design.pdf
>
>
> It's quite useful to save massive binary data such as images and documents
> in Apache HBase. Unfortunately, saving binary LOBs (large objects) directly
> to HBase degrades performance because of frequent splits and compactions.
> In this design, the LOB data is stored in a more efficient way, which
> keeps write/read performance high and guarantees data consistency in
> Apache HBase.