[jira] [Commented] (HBASE-17018) Spooling BufferedMutator

Joep Rottinghuis (JIRA) Fri, 04 Nov 2016 22:58:37 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15638727#comment-15638727
 ]


Joep Rottinghuis commented on HBASE-17018:
------------------------------------------

Thanks for the comments.

My suggestion to use MR was driven by ease of implementation and stemmed 
from my use case, where Yarn is present and MR is therefore trivially available. 
It is a fair point that, as a standalone feature in HBase, this doesn't have to 
be true. Using MR isn't a requirement; it was merely a (naive) suggestion.

I don't think that atomicity is a requirement, nor are we asking for 
"guarantees".
If you want to be guaranteed to write something to HBase you probably shouldn't 
use a BufferedMutator in the first place.

Please see attached PDF where I try to sketch out our use case and what 
behavior we're hoping to see.



> Spooling BufferedMutator
> ------------------------
>
>                 Key: HBASE-17018
>                 URL: https://issues.apache.org/jira/browse/HBASE-17018
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Joep Rottinghuis
>         Attachments: YARN-4061 HBase requirements for fault tolerant 
> writer.pdf
>
>
> For Yarn Timeline Service v2 we use HBase as a backing store.
> A big concern we would like to address is what to do if HBase is 
> (temporarily) down, for example in case of an HBase upgrade.
> Most of the high-volume writes will be on a best-effort basis, but 
> occasionally we do a flush. Mainly during application lifecycle events, 
> clients will call a flush on the timeline service API. In order to handle the 
> volume of writes we use a BufferedMutator. When flush gets called on our API, 
> we in turn call flush on the BufferedMutator.
> We would like our interface to HBase to be able to spool the mutations to a 
> filesystem in case of HBase errors. If we use the Hadoop filesystem 
> interface, this can then be HDFS, gcs, s3, or any other distributed storage. 
> The mutations can then later be replayed, for example through a MapReduce 
> job.
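
To make the requested behavior concrete, here is a minimal sketch of the spool-on-failure idea described in the issue. It is not HBase code and not a proposed implementation. Everything in it is an assumption made for illustration: the `Sink` interface stands in for the HBase write path, `SpoolingBuffer` and `replay` are hypothetical names, mutations are modeled as single-line strings, and a local file via `java.nio.file` stands in for the Hadoop filesystem interface. As the comment above says, it gives no atomicity or delivery guarantees.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the HBase write path; not an HBase API. */
interface Sink {
    void write(List<String> mutations) throws IOException;
}

/** Buffers mutations in memory; if a flush fails, appends them to a spool file. */
class SpoolingBuffer {
    private final Sink sink;
    private final Path spoolFile;
    private final List<String> buffer = new ArrayList<>();

    SpoolingBuffer(Sink sink, Path spoolFile) {
        this.sink = sink;
        this.spoolFile = spoolFile;
    }

    /** Assumption: each mutation is serialized as one line without newlines. */
    void mutate(String mutation) {
        buffer.add(mutation);
    }

    /** Returns true if the batch reached the sink, false if it was spooled. */
    boolean flush() throws IOException {
        if (buffer.isEmpty()) {
            return true;
        }
        List<String> batch = new ArrayList<>(buffer);
        buffer.clear();
        try {
            sink.write(batch);
            return true;
        } catch (IOException e) {
            // Sink is unavailable: append the batch to the spool file instead.
            Files.write(spoolFile, batch, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return false;
        }
    }

    /** Replays spooled mutations into the sink; deletes the spool file on success. */
    static int replay(Path spoolFile, Sink sink) throws IOException {
        if (!Files.exists(spoolFile)) {
            return 0;
        }
        List<String> spooled = Files.readAllLines(spoolFile, StandardCharsets.UTF_8);
        sink.write(spooled);
        Files.delete(spoolFile);
        return spooled.size();
    }
}
```

In this sketch a failed `flush()` returns `false` rather than throwing, so the caller can tell that the data was spooled rather than written. The replay step is a plain method here; the issue suggests a MapReduce job as one possible way to run it.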



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
