[ https://issues.apache.org/jira/browse/HBASE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830485#comment-13830485 ]
stack commented on HBASE-9736:
------------------------------
Hmm.. My comments got lost... Let me redo. First, I'm trying it. Will report
back.
+ Random r = new Random();
+ int sleepTime = r.nextInt(500) + 500;
+ Thread.sleep(sleepTime);
+ } catch (InterruptedException e) {
+ LOG.warn("Interrupted while yielding for other region servers", e);
+ Thread.currentThread().interrupt();
Random is expensive to make. Keep an instance around? Seed it too, else all
the Randoms march in lock step?
FYI, there is a sleep in Threads that does the above if you want to use that
instead.
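Roughly what I have in mind, as a sketch only and not the patch's code: the
class and method names (JitteredYield, nextSleepTime, yieldToOthers) are made
up for illustration, and the 500 to 1000 ms range is taken from the snippet
above.

```java
import java.util.Random;

final class JitteredYield {
  // One shared instance instead of a new Random per call. java.util.Random is
  // safe to share across threads. The no-arg constructor picks its own seed;
  // pass an explicit seed here if that is not good enough.
  private static final Random RANDOM = new Random();

  // Values copied from the quoted snippet: sleep between 500 and 999 ms.
  static final int BASE_SLEEP_MS = 500;
  static final int JITTER_MS = 500;

  /** Returns a sleep time in the range [BASE_SLEEP_MS, BASE_SLEEP_MS + JITTER_MS). */
  static int nextSleepTime() {
    return RANDOM.nextInt(JITTER_MS) + BASE_SLEEP_MS;
  }

  /**
   * Sleeps for a jittered interval so other region servers get a chance.
   * Returns false if the sleep was interrupted; the interrupt flag is restored.
   */
  static boolean yieldToOthers() {
    try {
      Thread.sleep(nextSleepTime());
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A caller would check the return value of yieldToOthers() and bail out of its
loop when it comes back false.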
Do a define for this:
+ return (-1);
... since you repeat it in a few places?
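Something like the below, as a sketch under assumptions: the patch only shows
the bare return (-1), so the constant name (NOT_AVAILABLE), the class
(SplitterSlots) and the method around it are hypothetical stand-ins, not what
the patch does.

```java
final class SplitterSlots {
  /** Hypothetical name for the repeated -1 sentinel. */
  static final int NOT_AVAILABLE = -1;

  /**
   * Illustrative method only: returns the index of the first free slot,
   * or NOT_AVAILABLE when every slot is in use.
   */
  static int firstFreeSlot(boolean[] inUse) {
    for (int i = 0; i < inUse.length; i++) {
      if (!inUse[i]) {
        return i;
      }
    }
    return NOT_AVAILABLE;
  }
}
```

Callers then compare against NOT_AVAILABLE rather than a literal -1, and the
meaning lives in one place.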
Patch looks great.
> Allow more than one log splitter per RS
> ---------------------------------------
>
> Key: HBASE-9736
> URL: https://issues.apache.org/jira/browse/HBASE-9736
> Project: HBase
> Issue Type: Improvement
> Components: MTTR
> Reporter: stack
> Assignee: Jeffrey Zhong
> Priority: Critical
> Attachments: hbase-9736.patch
>
>
> IIRC, this is an idea that came from the lads at Xiaomi.
> I have a small cluster of 6 RSs and one went down. It had a few WALs. I see
> this in logs:
> 2013-10-09 05:47:27,890 DEBUG org.apache.hadoop.hbase.master.SplitLogManager:
> total tasks = 25 unassigned = 21
> WAL splitting is held up for want of slots out on the cluster to split WALs.
> We need to be careful we don't overwhelm the foreground regionservers but
> more splitters should help get all back online faster.