[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037864#comment-14037864
 ] 

Jonathan Hsieh commented on HBASE-11339:
----------------------------------------

Thanks for following up with good questions! 

You haven't called it out directly but your questions are leading towards 
trouble spots in a loblog design.  One has to do with atomicity and the other 
has to do with reading recent values.  I think the latter effectively 
disqualifies the loblog idea.  Here's a writeup.

bq. In this way, we save the Lob files as SequenceFiles, and save the offset 
and file name back into the Put before putting the KV into the MemStore, right?

Essentially yes.  They aren't necessarily sequence files, but each lob write 
would be synced before it counts as complete, just as the current hlog does 
with edits. 

bq. 1. If so, we don't use the MemStore to save the Lob data, right? Then how 
to read the Lob data that are not sync yet(which are still in the writer 
buffer)?

If the loblog write and the locator write into the hlog both succeed, we'd use 
the same design/mechanism you currently have for reading lobs that are no 
longer in the memstore because they have been flushed.  

The difference is that the loblogs are still being written. In HDFS you can 
read files that are currently being written; however, you aren't guaranteed to 
read up to the most recent end of the file, since there is no built-in tail in 
hdfs yet.  Hm.. so we have a problem getting the latest data.

So for the lob log design to be correct, hdfs would need work to provide 
read-visibility guarantees or a tail operation.  While not out of the question, 
that work is some way off, and it disqualifies the lob log for the short term.
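
To make the read problem concrete, here is a toy in-memory model. It is not 
the HDFS API: GrowingFile and StaleReader are hypothetical names, and the 
assumption that a reader's visible length is fixed when it opens the file is 
only one way the "no guarantee to see the latest end" behavior could show up.

```java
public class StaleReaderSketch {

    /** Hypothetical stand-in for a loblog file that is still being appended to. */
    static final class GrowingFile {
        private final StringBuilder data = new StringBuilder();

        /** Appends a lob and returns the offset it was written at. */
        int append(String lob) {
            int offset = data.length();
            data.append(lob);
            return offset;
        }

        StaleReader open() {
            // Assumption: the reader only ever sees the length known at open time.
            return new StaleReader(this, data.length());
        }

        String slice(int from, int to) {
            return data.substring(from, to);
        }
    }

    /** Hypothetical reader whose view of the file does not grow after open. */
    static final class StaleReader {
        private final GrowingFile file;
        private final int visibleLength;

        StaleReader(GrowingFile file, int visibleLength) {
            this.file = file;
            this.visibleLength = visibleLength;
        }

        /** Returns the lob, or null if it lies beyond what this reader can see. */
        String read(int offset, int length) {
            if (offset + length > visibleLength) {
                return null;
            }
            return file.slice(offset, offset + length);
        }
    }

    public static void main(String[] args) {
        GrowingFile lobLog = new GrowingFile();
        int first = lobLog.append("old-lob");
        StaleReader reader = lobLog.open();
        int second = lobLog.append("new-lob"); // written after the reader opened

        System.out.println(reader.read(first, 7));  // visible to the reader
        System.out.println(reader.read(second, 7)); // beyond the reader's view
    }
}
```

In this model the locator for the second lob is valid, yet a reader that 
opened the file earlier cannot resolve it, which is the "latest data" gap 
described above.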

bq. 2. We need add a preSync and preAppend to the HLog so that we could sync 
the Lob files before the HLogs are sync.

Could you explain why you need preSync and preAppend? 

I think this is getting at a problem where we are essentially trying to sync 
writes to two logs atomically. Could we just hold off on issuing the locator 
put until the lob has been synced?  (An orphaned lob that no locator points to 
won't hurt anything, but a locator that points to a missing lob would.)  Both 
the lob and the locator would have the same ts/mvcc/seqno.

In the PDF's design, this shouldn't be a problem because it would use the 
normal write path for atomicity guarantees.  Currently hbase guarantees 
atomicity across CFs at flush time, and by having all cf:c's added to the hlog 
and memstore atomically.  

bq. In order to get the correct offset, we have to synchronize the prePut in 
the coprocessor, or we could use different Lob files for each thread?

Why not just write+sync the lob and then write the locator put?  For lobs we'd 
use the same mechanism to sync (one loblog for all threads, queued using the 
disruptor work).  
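
A minimal sketch of that ordering, using in-memory stand-ins. LobLog, 
Locator and Store are hypothetical names, not HBase classes, and a 
synchronized method stands in for the single disruptor-backed queue mentioned 
above. The only point is the order of operations: append the lob, sync it, and 
only then record the locator.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class LobThenLocatorSketch {

    /** Hypothetical locator: where the lob lives, plus the shared sequence number. */
    static final class Locator {
        final String row;
        final long offset;
        final int length;
        final long seqNo;

        Locator(String row, long offset, int length, long seqNo) {
            this.row = row;
            this.offset = offset;
            this.length = length;
            this.seqNo = seqNo;
        }
    }

    /** Hypothetical loblog: an append-only buffer with an explicit sync point. */
    static final class LobLog {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private long syncedLength = 0;

        long append(byte[] lob) {
            long offset = buffer.size();
            buffer.write(lob, 0, lob.length);
            return offset;
        }

        void sync() {
            syncedLength = buffer.size(); // everything appended so far is now durable
        }

        long syncedLength() {
            return syncedLength;
        }
    }

    /** One loblog shared by all writers; the list stands in for the hlog. */
    static final class Store {
        final LobLog lobLog = new LobLog();
        final List<Locator> hlog = new ArrayList<>();
        private long nextSeqNo = 1;

        // synchronized is a simplification of "one loblog, writers queued".
        synchronized Locator put(String row, byte[] lob) {
            long seqNo = nextSeqNo++;
            long offset = lobLog.append(lob); // 1. write the lob
            lobLog.sync();                    // 2. sync it
            Locator locator = new Locator(row, offset, lob.length, seqNo);
            hlog.add(locator);                // 3. only now write the locator
            return locator;
        }

        /** True if every locator points at bytes that have already been synced. */
        boolean allLocatorsResolvable() {
            for (Locator l : hlog) {
                if (l.offset + l.length > lobLog.syncedLength()) {
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.put("row1", new byte[] {1, 2, 3});
        Locator second = store.put("row2", new byte[] {4, 5});
        System.out.println(second.offset + " " + store.allLocatorsResolvable());
    }
}
```

Because the offset is known as soon as the lob is appended and the locator is 
written last, a crash between steps 2 and 3 leaves at worst an unreferenced 
lob, never a locator that points at unsynced data.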


> HBase MOB
> ---------
>
>                 Key: HBASE-11339
>                 URL: https://issues.apache.org/jira/browse/HBASE-11339
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Scanners
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBase LOB Design.pdf
>
>
>   It's quite useful to save medium-sized binary data such as images and 
> documents into Apache HBase. Unfortunately, directly saving the binary 
> MOB (medium object) to HBase leads to worse performance because of the 
> frequent splits and compactions.
>   In this design, the MOB data are stored in a more efficient way, which 
> keeps high write/read performance and guarantees data consistency in 
> Apache HBase.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
