[
https://issues.apache.org/jira/browse/HBASE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206670#comment-15206670
]
Ted Yu commented on HBASE-14623:
--------------------------------
bq. Why would log split and replay be 'faster'?
For log split, WAL edits for system tables are no longer mixed with edits for
user tables. This greatly reduces the amount of data that has to be split to
recover the system tables.
For WAL replay, the benefit comes from replaying edits for system tables ahead
of edits for user tables.
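
To illustrate the separation described above, here is a minimal sketch in plain
Java. It is not code from the patch: WalRoutingSketch, WalKind, classify and
route are hypothetical names, and recognizing system tables by the "hbase:"
namespace prefix is an assumption made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; these names are not HBase classes.
final class WalRoutingSketch {
    enum WalKind { META, SYSTEM, USER }

    // Assumption: system tables are recognized by the "hbase:" namespace prefix.
    static WalKind classify(String tableName) {
        if (tableName.equals("hbase:meta")) {
            return WalKind.META;    // hbase:meta already has its own WAL
        }
        if (tableName.startsWith("hbase:")) {
            return WalKind.SYSTEM;  // e.g. hbase:namespace
        }
        return WalKind.USER;
    }

    // Each edit is represented only by the name of the table it belongs to.
    static Map<WalKind, List<String>> route(List<String> editTableNames) {
        Map<WalKind, List<String>> byWal = new EnumMap<>(WalKind.class);
        for (WalKind kind : WalKind.values()) {
            byWal.put(kind, new ArrayList<>());
        }
        for (String table : editTableNames) {
            byWal.get(classify(table)).add(table);
        }
        return byWal;
    }

    public static void main(String[] args) {
        List<String> edits = Arrays.asList(
            "hbase:namespace", "usertable", "usertable", "hbase:meta", "usertable");
        Map<WalKind, List<String>> byWal = route(edits);
        for (WalKind kind : WalKind.values()) {
            System.out.println(kind + " WAL edits to split: " + byWal.get(kind).size());
        }
    }
}
```

In this toy model, recovering the system tables only requires splitting the
SYSTEM group, regardless of how many user edits were written.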
bq. Why would we recover the system tables 'faster'?
Because WAL replay for system tables finishes before edits for user tables are
replayed, system table assignment can take place at an earlier stage of
cluster recovery.
bq. It would probably benefit AssignmentManager on system table region
assignment.
Stephen has more details backing the above statement.
bq. Should we compound all system tables so assignment is easier rather than
have a small system table per domain
Compounding system tables requires understanding the needs of each of the
system tables. I am not sure this is within the scope of this JIRA.
bq. In particular, how will this not slow down assign (if each system table has
to wait on its own log to finish split
Currently, with distributed log splitting, not only does log splitting for
system tables (such as hbase:namespace) have to finish, but log splitting
for user tables also has to complete before tables are assigned.
In this regard, there is no slowdown in assignment.
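
The gating argument above can be written down as a small predicate. This is a
sketch under stated assumptions, not HBase code: AssignGateSketch and
systemTablesAssignable are hypothetical names, and the model reduces log
splitting to two yes/no flags.

```java
// Hypothetical illustration; this is not an HBase class.
final class AssignGateSketch {

    // Returns true when system table regions may be assigned in this toy model.
    static boolean systemTablesAssignable(boolean dedicatedSystemWal,
                                          boolean systemSplitDone,
                                          boolean userSplitDone) {
        if (dedicatedSystemWal) {
            // With a dedicated WAL, only the system WAL split gates assignment.
            return systemSplitDone;
        }
        // Without it, splitting for user tables has to complete as well.
        return systemSplitDone && userSplitDone;
    }

    public static void main(String[] args) {
        // User log split is still running in both cases.
        System.out.println("shared WAL:    "
            + systemTablesAssignable(false, true, false));
        System.out.println("dedicated WAL: "
            + systemTablesAssignable(true, true, false));
    }
}
```

In neither branch does the dedicated WAL add a condition that was not already
required before, which is the sense in which assignment is not slowed down.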
bq. Any testing done?
I need to load a tarball built with the patch onto a cluster and see the effect.
bq. meta WAL handling goes untouched?
That's right.
bq. We in effect copy/paste the hbase:meta handling?
I need to go over the patch in detail (it has been almost 4 months). I started
by referencing the hbase:meta handling. Later on, I addressed WAL provider
integration, where a system table is allowed to have more than one region.
bq. There is a logroller for user-space WALs, one for meta and then another for
system tables?
This is an interesting observation. Considering the potential for more system
tables to be added (e.g. hbase:backup), I think it makes sense to have another
log roller for the system tables, since the edits for system tables can be
quite large.
bq. We have enough threads running in the system already.
Yes. And more are being added along with new features. If you feel strongly
about this, I can merge the system WAL roller into the one for hbase:meta.
bq. Does that mean .meta is not a sys table?
No, that is not the case. Keeping the .meta WAL is mostly for backward
compatibility.
bq. No tests?
Let me try to add some test(s) in the next patch.
> Implement dedicated WAL for system tables
> -----------------------------------------
>
> Key: HBASE-14623
> URL: https://issues.apache.org/jira/browse/HBASE-14623
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Ted Yu
> Assignee: Ted Yu
> Labels: wal
> Fix For: 2.0.0
>
> Attachments: 14623-v1.txt, 14623-v2.txt, 14623-v2.txt, 14623-v2.txt,
> 14623-v2.txt, 14623-v3.txt, 14623-v4.txt
>
>
> As Stephen suggested in the parent JIRA, dedicating a separate WAL for system
> tables (other than hbase:meta) should be done in a new JIRA.
> This task is to fulfill the system WAL separation.
> Below is a summary of the discussion:
> If system tables have their own WAL, they can be recovered faster
> (fast log split, fast log replay). This would probably benefit
> AssignmentManager on system table region assignment. At this time, the new
> AssignmentManager is not planned to change the WAL. So this JIRA
> is good for the overall system, not specific to AssignmentManager.
> There are 3 strategies for implementing system table WAL:
> 1. one WAL for all non-meta system tables
> 2. one WAL for each non-meta system table
> 3. one WAL for each region of non-meta system table
> Currently most system tables are single-region tables (only the ACL table may
> become big), so choices 2 and 3 are basically the same.
> From an implementation point of view, choices 2 and 3 are cleaner than choice 1
> (we already have one WAL for the META table and can reuse that logic).
> With choice 2 or 3, assignment manager performance should not be impacted, and
> it would be easier for the assignment manager to assign system table regions
> (e.g. without waiting for user table log split to complete before assigning
> system table regions).
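
The three strategies listed in the description differ only in how a region is
mapped to a WAL. The following is a minimal sketch of that mapping, not code
from the patch: SystemWalStrategySketch, Strategy, Region, walKey and walCount
are hypothetical names, and the region names are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; these names are not HBase classes.
final class SystemWalStrategySketch {
    enum Strategy { ONE_FOR_ALL, PER_TABLE, PER_REGION }

    static final class Region {
        final String table;
        final String regionName;

        Region(String table, String regionName) {
            this.table = table;
            this.regionName = regionName;
        }
    }

    // Key identifying the WAL that a non-meta system table region writes to.
    static String walKey(Strategy strategy, Region region) {
        switch (strategy) {
            case ONE_FOR_ALL:
                return "system";                                // strategy 1
            case PER_TABLE:
                return region.table;                            // strategy 2
            case PER_REGION:
                return region.table + "/" + region.regionName;  // strategy 3
            default:
                throw new IllegalArgumentException("unknown strategy: " + strategy);
        }
    }

    // Number of distinct WALs needed for the given regions.
    static int walCount(Strategy strategy, List<Region> regions) {
        Set<String> keys = new HashSet<>();
        for (Region region : regions) {
            keys.add(walKey(strategy, region));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("hbase:namespace", "r1"),
            new Region("hbase:acl", "r1"));
        for (Strategy strategy : Strategy.values()) {
            System.out.println(strategy + " -> " + walCount(strategy, regions) + " WAL(s)");
        }
    }
}
```

As long as every system table has a single region, PER_TABLE and PER_REGION
yield the same number of WALs; they diverge only once a table such as the ACL
table grows to more than one region.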