[
https://issues.apache.org/jira/browse/HBASE-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15762788#comment-15762788
]
Sangjin Lee commented on HBASE-17018:
-------------------------------------
I am supportive of the design (obviously) as I consulted with Joep and provided
feedback. I like the overall approach and the proposed WIP patch.
I went over the patch at a high level and I do have several comments and
questions.
(1)
What's not entirely clear to me from the patch is exactly how the state
transition will occur, especially how it will transition out of BAD into
TRANSITIONING or GOOD. Maybe that is still TODO? Did I read that right? We
talked offline about attempting flushing periodically if it is in a bad state
to probe the state of the HBase cluster. One idea is to use the
{{ExceptionListener}}/{{RetriesExhaustedWithDetailsException}}/{{mayHaveClusterIssues()}} combination.
Also, we want to think about under what condition we transition from GOOD to
BAD. I still think the exception listener has a lot of value as it can tell us
(more) about the cluster status. We should see if we can utilize the exception
listener in determining whether to transition. We should also tune the timeout
to match how long we're willing to wait for these operations, so that in most
cases we detect the timeout as an exception coming from the underlying
{{BufferedMutatorImpl}} with the proper exception listener.
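As a rough illustration of what a listener-driven transition could look like, here is a minimal standalone sketch; {{WriterState}} and {{StateCoordinator}} are hypothetical names, and the boolean argument stands in for the result of {{RetriesExhaustedWithDetailsException.mayHaveClusterIssues()}} so the sketch does not depend on the HBase client jars:

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of a listener-driven state machine. */
enum WriterState { GOOD, TRANSITIONING, BAD }

class StateCoordinator {
    private final AtomicReference<WriterState> state =
        new AtomicReference<>(WriterState.GOOD);

    WriterState getState() { return state.get(); }

    // Called from the exception listener; in the real patch the boolean
    // would come from RetriesExhaustedWithDetailsException.mayHaveClusterIssues().
    void onWriteFailure(boolean mayHaveClusterIssues) {
        if (mayHaveClusterIssues) {
            state.set(WriterState.BAD);   // cluster looks down: start spooling
        } else {
            state.compareAndSet(WriterState.GOOD, WriterState.TRANSITIONING);
        }
    }

    // Called when a periodic probe flush succeeds while BAD/TRANSITIONING.
    void onProbeSuccess() {
        state.set(WriterState.GOOD);
    }
}
```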
(2)
In {{SpoolingBufferedMutatorProcessor.call()}}, I'm not quite sure how useful
the two-tiered timeout setup is. It appears that the overall timeout is
something like 100 seconds, but we divide it with finer-grained 1-second timed
waits and keep looping until we exhaust the overall timeout. Is it truly
necessary? Do we gain value by having the two-tiered mechanism? Since this is
all done in the same single thread, the thread does nothing but loop right
back onto {{future.get()}} on the same submission.
IMO, this seems to be adding more complexity than is needed with not much
payoff. Why not simply have a single timeout with the desired overall timeout?
That would make this much simpler without losing any flexibility.
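For comparison, the two approaches look roughly like this. This is an illustrative standalone sketch, not code from the patch; the 1-second slice and ~100-second overall timeout mirror the numbers above, and the method names are made up:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch contrasting the two timeout styles; names are illustrative. */
class TimeoutSketch {
    // Two-tiered: loop in 1-second slices until the overall budget is spent.
    static <T> T loopedGet(Future<T> f, long totalMillis)
            throws ExecutionException, InterruptedException {
        long deadline = System.currentTimeMillis() + totalMillis;
        while (true) {
            try {
                return f.get(1000, TimeUnit.MILLISECONDS); // fine-grained wait
            } catch (TimeoutException e) {
                if (System.currentTimeMillis() >= deadline) {
                    return null; // overall timeout exhausted
                }
                // nothing useful happens here; we loop right back onto get()
            }
        }
    }

    // Single timeout: same observable behavior, far simpler.
    static <T> T singleGet(Future<T> f, long totalMillis)
            throws ExecutionException, InterruptedException {
        try {
            return f.get(totalMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null;
        }
    }
}
```

Since no other work is interleaved between the slices, both methods return the same result at (nearly) the same time, which is the argument for the simpler form.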
(3)
Somewhat related to the above, I'd like to see the submission state used more
explicitly in that method. Currently a bad state and a subsequent behavior
difference are implied by {{timeout == 0}}. Instead, it might be great if we
explicitly use the state from the coordinator to do different things (e.g.
{{coordinator.getState()}} instead of {{coordinator.getTimeout()}}). We can add
things like the timeout value into the state enum so the state becomes more
useful.
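For example, something along these lines (a hypothetical sketch; only the ~100-second overall timeout comes from the discussion above, the other states' values are invented for illustration):

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: fold the timeout into the state enum so callers
 *  branch on coordinator.getState() instead of inferring BAD from
 *  timeout == 0. */
enum SubmissionState {
    GOOD(100, TimeUnit.SECONDS),          // normal writes: wait the full timeout
    TRANSITIONING(10, TimeUnit.SECONDS),  // draining the spool: shorter wait
    BAD(0, TimeUnit.SECONDS);             // spooling: don't wait on HBase at all

    private final long timeoutMillis;

    SubmissionState(long timeout, TimeUnit unit) {
        this.timeoutMillis = unit.toMillis(timeout);
    }

    long timeoutMillis() { return timeoutMillis; }
}
```

The caller would then switch on the state and read {{timeoutMillis()}} from it, so the "what state am I in" question and the "how long do I wait" question stay in one place.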
(4)
Regarding the out-of-order mutate calls with respect to flushes, one idea might
be to extend the {{BlockingQueue}} (specifically override {{put()}}) so that
each {{put()}} call can handle the flush count synchronously and internally as
part of the call. Then we may be able to eliminate the need for handling
out-of-order mutate calls and thus simplify further.
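A rough sketch of that idea ({{FlushAwareQueue}} and its methods are hypothetical names; the point is only that the flush count is read and bumped under the same lock that {{put()}} takes, so a mutate can never slip across a flush boundary):

```java
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a BlockingQueue subclass whose put() handles the
 *  flush count synchronously, inside the call. */
class FlushAwareQueue<E> extends LinkedBlockingQueue<E> {
    private final Object flushLock = new Object();
    private long flushCount;

    @Override
    public void put(E e) throws InterruptedException {
        synchronized (flushLock) {
            // In the real implementation, e would be associated with the
            // current flushCount here, inside the lock.
            super.put(e); // unbounded queue, so this does not block under the lock
        }
    }

    // A flush bumps the count; no put() can interleave with the bump.
    public long flush() {
        synchronized (flushLock) {
            return ++flushCount;
        }
    }
}
```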
> Spooling BufferedMutator
> ------------------------
>
> Key: HBASE-17018
> URL: https://issues.apache.org/jira/browse/HBASE-17018
> Project: HBase
> Issue Type: New Feature
> Reporter: Joep Rottinghuis
> Attachments: HBASE-17018.master.001.patch,
> HBASE-17018.master.002.patch, HBASE-17018.master.003.patch,
> HBASE-17018SpoolingBufferedMutatorDesign-v1.pdf, YARN-4061 HBase requirements
> for fault tolerant writer.pdf
>
>
> For Yarn Timeline Service v2 we use HBase as a backing store.
> A big concern we would like to address is what to do if HBase is
> (temporarily) down, for example in case of an HBase upgrade.
> Most of the high-volume writes will be on a best-effort basis, but
> occasionally we do a flush. Mainly during application lifecycle events,
> clients will call a flush on the timeline service API. In order to handle the
> volume of writes we use a BufferedMutator. When flush gets called on our API,
> we in turn call flush on the BufferedMutator.
> We would like our interface to HBase to be able to spool the mutations to a
> filesystem in case of HBase errors. If we use the Hadoop filesystem
> interface, this can then be HDFS, GCS, S3, or any other distributed storage.
> The mutations can then later be re-played, for example through a MapReduce
> job.