[
https://issues.apache.org/jira/browse/HBASE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291174#comment-15291174
]
Yu Li commented on HBASE-14623:
-------------------------------
We recently hit a problem where recovery of the namespace table was blocked by
WAL splitting for the RS that previously held it. The sequence was:
1. During a *rolling upgrade*, many RSs crashed, not just the single RS holding
the namespace region, because of a temporary network problem (which made all
datanodes of the pipeline go bad)
2. The master restarted before DLS had finished for the RS that previously held
the namespace table's region. It got stuck and finally aborted because the
namespace region did not come online within the timeout
({{hbase.master.namespace.init.timeout}}, default 5 min); see
{{TableNamespaceManager#start}}
I think that if we added a mechanism to split and recover the namespace table's
WAL early, similar to what is already done for the meta table, we could avoid
this problem:
{code:title=SplitLogWorker#taskLoop|borderStyle=solid}
// pick meta wal firstly
int offset = (int) (Math.random() * paths.size());
for (int i = 0; i < paths.size(); i++) {
  if (DefaultWALProvider.isMetaFile(paths.get(i))) {
    offset = i;
    break;
  }
}
{code}
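To make the idea concrete, here is a rough, standalone sketch of how the pick could be extended so that a system-table WAL is taken right after the meta WAL. It is not HBase code: the class {{WalPickSketch}}, the helper {{priority}}, and the {{.system}} suffix are all hypothetical, and I am assuming that meta WAL file names end in {{.meta}}. How a system-table WAL would really be identified depends on how this JIRA names the files.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch only; none of these names exist in HBase.
final class WalPickSketch {
  // Assumption: meta WAL file names end with ".meta".
  static final String META_SUFFIX = ".meta";
  // Hypothetical suffix for a dedicated system-table WAL.
  static final String SYSTEM_SUFFIX = ".system";

  // Lower value means the WAL should be split earlier.
  static int priority(String path) {
    if (path.endsWith(META_SUFFIX)) {
      return 0;
    }
    if (path.endsWith(SYSTEM_SUFFIX)) {
      return 1;
    }
    return 2;
  }

  // Returns the index of the first meta WAL if there is one, else the first
  // system-table WAL, else a random index (same fallback as the snippet above).
  static int pickOffset(List<String> paths, Random random) {
    if (paths.isEmpty()) {
      return -1; // nothing to pick
    }
    int offset = random.nextInt(paths.size());
    int best = 2;
    for (int i = 0; i < paths.size(); i++) {
      int p = priority(paths.get(i));
      if (p < best) {
        best = p;
        offset = i;
        if (p == 0) {
          break; // a meta WAL always wins, stop scanning
        }
      }
    }
    return offset;
  }
}
```

With something like this, a worker scanning a list that contains both a namespace WAL and ordinary user-table WALs would grab the namespace WAL first, so the namespace region could come online without waiting behind user-table splits.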
So maybe this is a good reason for this JIRA to go in? Thanks.
> Implement dedicated WAL for system tables
> -----------------------------------------
>
> Key: HBASE-14623
> URL: https://issues.apache.org/jira/browse/HBASE-14623
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Ted Yu
> Assignee: Ted Yu
> Labels: wal
> Fix For: 2.0.0
>
> Attachments: 14623-v1.txt, 14623-v2.txt, 14623-v2.txt, 14623-v2.txt,
> 14623-v2.txt, 14623-v3.txt, 14623-v4.txt
>
>
> As Stephen suggested in the parent JIRA, dedicating a separate WAL to system
> tables (other than hbase:meta) should be done in a new JIRA.
> This task is to implement that system WAL separation.
> Below is a summary of the discussion:
> If system tables have their own WAL, they can be recovered faster
> (fast log split, fast log replay). This would probably benefit
> AssignmentManager when assigning system table regions. At this time, the new
> AssignmentManager is not planned to change the WAL, so this JIRA
> is good for the overall system, not specific to AssignmentManager.
> There are 3 strategies for implementing system table WAL:
> 1. one WAL for all non-meta system tables
> 2. one WAL for each non-meta system table
> 3. one WAL for each region of non-meta system table
> Currently most system tables have a single region (only the ACL table may
> become big), so choices 2 and 3 are basically the same.
> From an implementation point of view, choices 2 and 3 are cleaner than choice 1
> (we already have one WAL for the META table and can reuse that logic).
> With choice 2 or 3, assignment manager performance should not be impacted, and
> it would be easier for the assignment manager to assign system table regions
> (e.g. without waiting for user table log split to complete before assigning a
> system table region).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)