[
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037864#comment-14037864
]
Jonathan Hsieh commented on HBASE-11339:
----------------------------------------
Thanks for following up with good questions!
You haven't called it out directly but your questions are leading towards
trouble spots in a loblog design. One has to do with atomicity and the other
has to do with reading recent values. I think the latter effectively
disqualifies the loblog idea. Here's a writeup.
bq. In this way, we save the Lob files as SequenceFiles, and save the offset
and file name back into the Put before putting the KV into the MemStore, right?
Essentially yes. They aren't necessarily sequence files -- they would be
synced to complete writing the lob, just as the current hlog files are synced
after edits are appended.
bq. 1. If so, we don't use the MemStore to save the Lob data, right? Then how
to read the Lob data that are not sync yet(which are still in the writer
buffer)?
If the loblog write and the locator write into the hlog both succeed, we'd use
the same design/mechanism you currently have for reading lobs that are no
longer present in the memstore because they have been flushed.
The difference is that the loblogs are still being written. In HDFS you can
read files that are currently being written; however, you aren't guaranteed to
read to the most recent end of the file (since we have no built-in tail in
hdfs yet). Hm.. so we have a problem getting the latest data.
So for the lob log design to be correct, it would need work on hdfs to provide
guarantees or a tail operation. While not out of the question, that would be a
ways out from now and disqualifies the lob log for the short term.
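To make the read problem concrete, here is a toy model in Java. It does not use any HDFS API: GrowingFile and SnapshotReader are hypothetical names, and the assumption that a reader only sees the length it observed when it was opened is a simplification of the "no guaranteed tail" behavior described above.

```java
import java.util.Arrays;

public class TailVisibility {
    /** Toy stand-in for a loblog file that is still being written. */
    static final class GrowingFile {
        private byte[] data = new byte[0];

        synchronized void append(byte[] bytes) {
            int oldLength = data.length;
            data = Arrays.copyOf(data, oldLength + bytes.length);
            System.arraycopy(bytes, 0, data, oldLength, bytes.length);
        }

        synchronized int length() {
            return data.length;
        }

        synchronized byte[] read(int offset, int len) {
            return Arrays.copyOfRange(data, offset, offset + len);
        }
    }

    /** Reader that only knows the file length it saw when it was opened (assumption). */
    static final class SnapshotReader {
        private final GrowingFile file;
        private final int visibleLength;

        SnapshotReader(GrowingFile file) {
            this.file = file;
            this.visibleLength = file.length();
        }

        /** Returns the bytes, or null when the range lies past what this reader can see. */
        byte[] read(int offset, int len) {
            if (offset + len > visibleLength) {
                return null;
            }
            return file.read(offset, len);
        }
    }
}
```

In this model a locator that points at a recently appended lob cannot be resolved by a reader opened before the append, which is the "latest data" gap.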
bq. 2. We need add a preSync and preAppend to the HLog so that we could sync
the Lob files before the HLogs are sync.
Could you explain why you need preSync and preAppend?
I think this is getting at a problem where we are essentially trying to sync
writes to two logs atomically. Could we just not issue the locator put until
the lob has been synced? (An orphaned lob that no locator points to won't hurt
anything, but a locator that points to a missing lob would.) Both the lob and
the locator would have the same ts/mvcc/seqno.
In the PDF's design, this shouldn't be a problem because it would use the
normal write path for atomicity guarantees. Currently hbase guarantees
atomicity across CFs at flush time, and by adding all of a write's cf:c's to
the hlog and memstore atomically.
bq. In order to get the correct offset, we have to synchronize the prePut in
the coprocessor, or we could use different Lob files for each thread?
Why not just write+sync the lob and then write the locator put? For lobs we'd
use the same mechanism to sync (one loblog for all threads, queued using the
disruptor work).
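A minimal sketch of that ordering, in Java. Everything here is hypothetical and in-memory: LobLog, Locator, writeLob and the "loblog-0" file name are illustration only, not HBase APIs, and the sync is done per write rather than batched through a disruptor queue.

```java
import java.util.ArrayList;
import java.util.List;

public class LobThenLocator {
    /** Hypothetical stand-in for a single loblog shared by all writer threads. */
    static final class LobLog {
        private long length = 0;       // bytes appended so far
        private long syncedLength = 0; // bytes known to be durable

        /** Appends a lob and returns its offset; synchronized so offsets never overlap. */
        synchronized long append(byte[] lob) {
            long offset = length;
            length += lob.length;
            return offset;
        }

        synchronized void sync() {
            syncedLength = length;
        }

        synchronized boolean isDurable(long offset, int len) {
            return offset + len <= syncedLength;
        }
    }

    /** Hypothetical locator that would be stored in the Put instead of the lob bytes. */
    static final class Locator {
        final String file;
        final long offset;
        final int length;

        Locator(String file, long offset, int length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Writes and syncs the lob first; only then records the locator in the (toy) hlog. */
    static Locator writeLob(LobLog lobLog, List<Locator> hlog, byte[] lob) {
        long offset = lobLog.append(lob);
        lobLog.sync();
        if (!lobLog.isDurable(offset, lob.length)) {
            // No locator is written, so at worst an orphaned lob is left behind.
            throw new IllegalStateException("lob not synced; locator not written");
        }
        Locator locator = new Locator("loblog-0", offset, lob.length);
        synchronized (hlog) {
            hlog.add(locator);
        }
        return locator;
    }

    public static void main(String[] args) {
        LobLog lobLog = new LobLog();
        List<Locator> hlog = new ArrayList<>();
        Locator first = writeLob(lobLog, hlog, new byte[10]);
        Locator second = writeLob(lobLog, hlog, new byte[5]);
        System.out.println(first.offset + " " + second.offset);
    }
}
```

Because the offset is assigned inside the loblog's own append, the prePut hook would not need extra synchronization or per-thread lob files in this sketch.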
> HBase MOB
> ---------
>
> Key: HBASE-11339
> URL: https://issues.apache.org/jira/browse/HBASE-11339
> Project: HBase
> Issue Type: New Feature
> Components: regionserver, Scanners
> Reporter: Jingcheng Du
> Assignee: Jingcheng Du
> Attachments: HBase LOB Design.pdf
>
>
> It's quite useful to save medium-sized binary data like images and documents
> into Apache HBase. Unfortunately, directly saving the binary MOB (medium
> object) to HBase leads to worse performance because of the frequent splits
> and compactions.
> In this design, the MOB data are stored in a more efficient way, which
> keeps a high write/read performance and guarantees the data consistency in
> Apache HBase.