[jira] [Commented] (HBASE-17018) Spooling BufferedMutator

Joep Rottinghuis (JIRA) Tue, 13 Dec 2016 22:31:31 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15747443#comment-15747443
 ]


Joep Rottinghuis commented on HBASE-17018:
------------------------------------------

Filed HBASE-17313 with patch. If HBASE-17313 goes in first, then HBASE-17277 
should be updated to add the new field in the clone method (and in the unit 
test comparison), or vice versa.

bq.  Is there anything point in AP that could be exposed that might help 
simplify the implementation at all?
Good question. Not quite sure. I was thinking of adding two more things: a) an 
exception listener that can be used to capture exceptions from the 
BufferedMutatorImpl and pass them to the coordinator. I'll have to see whether it is 
cleaner for the submission to catch these and shove all of this info into the 
Future to pass back, or if I want to have this all appear asynchronously in the 
coordinator. I think the former might be cleaner.
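
To illustrate option a), here is a rough sketch of a listener that routes a flush failure into the Future handed back for that submission. It does not use the HBase API: FakeMutator, Submission, submitFlush and simulateOutage are all hypothetical names made up for this example, and it assumes a single processor thread handles one submission at a time.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

class ListenerToFutureSketch {

    /** Hypothetical stand-in for the wrapped mutator: reports failures to a listener. */
    static final class FakeMutator {
        private final Consumer<Exception> listener;
        private boolean failNextFlush;

        FakeMutator(Consumer<Exception> listener) {
            this.listener = listener;
        }

        void flush() {
            if (failNextFlush) {
                failNextFlush = false;
                listener.accept(new IOException("simulated HBase outage"));
            }
        }
    }

    /** Hypothetical submission: one flush request plus the Future returned to the caller. */
    static final class Submission {
        final CompletableFuture<Void> result = new CompletableFuture<>();
    }

    private final FakeMutator mutator = new FakeMutator(this::onException);
    private Submission current; // assumption: only the processor thread touches this

    /** Listener callback: attach the failure to the submission being processed. */
    private void onException(Exception e) {
        if (current != null) {
            current.result.completeExceptionally(e);
        }
    }

    /** Test hook for the sketch: make the next flush fail. */
    void simulateOutage() {
        mutator.failNextFlush = true;
    }

    /** Runs one flush and returns a Future that carries any exception the listener saw. */
    CompletableFuture<Void> submitFlush() {
        Submission submission = new Submission();
        current = submission;
        mutator.flush();
        submission.result.complete(null); // no effect if the listener already failed it
        current = null;
        return submission.result;
    }

    public static void main(String[] args) {
        ListenerToFutureSketch sketch = new ListenerToFutureSketch();
        System.out.println("healthy flush failed: "
            + sketch.submitFlush().isCompletedExceptionally());
        sketch.simulateOutage();
        System.out.println("outage flush failed: "
            + sketch.submitFlush().isCompletedExceptionally());
    }
}
```

With this shape the coordinator only has to look at the Future of each submission, rather than correlate asynchronous exception callbacks with the submissions that caused them.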

b) If the outbound queue reaches a certain size it should trigger a flush. As 
[~sjlee0] pointed out, the current design would allow for the outbound queue to 
grow very large if the user keeps sending mutations without calling flush. The 
BufferedMutatorImpl could have flushed, but we don't know that. I think the 
cleanest solution would be to call flush, but that cannot be a blocking call 
from the processor, otherwise we'll have a deadlock on our hands. I probably 
have to make flush on the SpoolingBufferedMutatorImpl have a boolean argument 
to block or not block.
Perhaps this is a spot where modifying the AP or the BMI to indicate that a 
size-based flush happened might be a good thing. On the other hand, to treat 
the BMI as a total black-box has a certain elegance...
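
A rough sketch of b), to show why the size-based flush has to be non-blocking when it comes from the processor. This is not the patch: the class, the FlushMarker type, the String "mutations" and the counters are hypothetical, and a flush is modelled as a marker that travels through the same queue the processor reads from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class SizeTriggeredFlushSketch {

    /** Queue item asking the processor to flush; completed once the flush is done. */
    private static final class FlushMarker {
        final CompletableFuture<Void> done = new CompletableFuture<>();
    }

    private final BlockingQueue<Object> inbound = new LinkedBlockingQueue<>();
    private final List<String> outbound = new ArrayList<>(); // processor thread only
    private final int maxOutbound;
    private final AtomicInteger sent = new AtomicInteger();
    private final AtomicInteger sizeTriggered = new AtomicInteger();
    private boolean sizeFlushQueued; // processor thread only

    SizeTriggeredFlushSketch(int maxOutbound) {
        this.maxOutbound = maxOutbound;
    }

    void start() {
        Thread processor = new Thread(this::run, "processor");
        processor.setDaemon(true); // let the JVM exit without an explicit stop
        processor.start();
    }

    void mutate(String mutation) {
        inbound.add(mutation);
    }

    /** Queues a flush; with block=true the caller waits until the processor has handled it. */
    CompletableFuture<Void> flush(boolean block) {
        FlushMarker marker = new FlushMarker();
        inbound.add(marker);
        if (block) {
            marker.done.join(); // must never run on the processor thread
        }
        return marker.done;
    }

    int sentCount() {
        return sent.get();
    }

    int sizeTriggeredCount() {
        return sizeTriggered.get();
    }

    private void run() {
        try {
            while (true) {
                Object item = inbound.take();
                if (item instanceof FlushMarker) {
                    sent.addAndGet(outbound.size()); // stand-in for the real flush
                    outbound.clear();
                    sizeFlushQueued = false;
                    ((FlushMarker) item).done.complete(null);
                } else {
                    outbound.add((String) item);
                    if (outbound.size() >= maxOutbound && !sizeFlushQueued) {
                        sizeFlushQueued = true;
                        sizeTriggered.incrementAndGet();
                        // Blocking here would wait on a marker that only this
                        // thread can process, which is the deadlock.
                        flush(false);
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit the processor loop
        }
    }

    public static void main(String[] args) {
        SizeTriggeredFlushSketch sketch = new SizeTriggeredFlushSketch(2);
        sketch.start();
        sketch.mutate("a");
        sketch.mutate("b");
        sketch.mutate("c");
        sketch.flush(true); // a caller thread may block safely
        System.out.println("sent=" + sketch.sentCount()
            + " sizeTriggered=" + sketch.sizeTriggeredCount());
    }
}
```

Note that this sketch still treats the wrapped mutator as a black box: it only counts what it has handed over, and it cannot tell whether the mutator already did a size-based flush of its own.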

> Spooling BufferedMutator
> ------------------------
>
>                 Key: HBASE-17018
>                 URL: https://issues.apache.org/jira/browse/HBASE-17018
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Joep Rottinghuis
>         Attachments: HBASE-17018.master.001.patch, 
> HBASE-17018SpoolingBufferedMutatorDesign-v1.pdf, YARN-4061 HBase requirements 
> for fault tolerant writer.pdf
>
>
> For Yarn Timeline Service v2 we use HBase as a backing store.
> A big concern we would like to address is what to do if HBase is 
> (temporarily) down, for example in case of an HBase upgrade.
> Most of the high-volume writes will be on a best-effort basis, but 
> occasionally we do a flush. Mainly during application lifecycle events, 
> clients will call a flush on the timeline service API. In order to handle the 
> volume of writes we use a BufferedMutator. When flush gets called on our API, 
> we in turn call flush on the BufferedMutator.
> We would like our interface to HBase to be able to spool the mutations to a 
> filesystem in case of HBase errors. If we use the Hadoop filesystem 
> interface, this can then be HDFS, gcs, s3, or any other distributed storage. 
> The mutations can then later be re-played, for example through a MapReduce 
> job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
