[ https://issues.apache.org/jira/browse/HBASE-11094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeffrey Zhong updated HBASE-11094:
----------------------------------
Attachment: hbase-11094-v4.patch
Thanks [~enis] for the review! The v4 patch addresses Enis's comments. Basically, we
only keep the log splitting/replay config setting in the master (SplitLogManager), and
region servers use whatever mode the master tells them.
When the log recovery mode config setting is changed and the master is restarted, the
master delays its recovery mode update until all outstanding log split tasks have
drained, so that people do not have to check the drain status manually
(so basically no release note is needed for this JIRA).
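For illustration only, here is a rough sketch of the idea, not the code in the
patch. All class and method names below (RecoveryModeTracker, RegionOpener,
modeForNewWork, taskFinished) are made up for this example and are not HBase
APIs. The sketch assumes the master knows which mode the outstanding tasks
were created under and how many of them remain.

```java
// Hypothetical sketch; names are illustrative and not taken from HBase or the patch.
enum RecoveryMode { LOG_SPLITTING, LOG_REPLAY }

/** Master side: the only place where the recovery mode setting is kept. */
class RecoveryModeTracker {
    private RecoveryMode activeMode;           // mode currently handed out to region servers
    private final RecoveryMode configuredMode; // mode read from the master's config at startup
    private int outstandingTasks;              // log split tasks left over from before the restart

    RecoveryModeTracker(RecoveryMode previousMode, RecoveryMode configuredMode,
                        int outstandingTasks) {
        this.activeMode = previousMode;
        this.configuredMode = configuredMode;
        this.outstandingTasks = outstandingTasks;
        maybeSwitch();
    }

    /** Called when one outstanding log split task completes. */
    synchronized void taskFinished() {
        if (outstandingTasks > 0) {
            outstandingTasks--;
        }
        maybeSwitch();
    }

    /** Mode the master passes along with new recovery work. */
    synchronized RecoveryMode modeForNewWork() {
        return activeMode;
    }

    // The configured mode only takes effect once the old tasks have drained.
    private void maybeSwitch() {
        if (outstandingTasks == 0) {
            activeMode = configuredMode;
        }
    }
}

/** Region server side: no local config lookup, it uses what the master sent. */
class RegionOpener {
    static String open(String regionName, RecoveryMode modeFromMaster) {
        return modeFromMaster == RecoveryMode.LOG_REPLAY
            ? regionName + ": opened in replay mode"
            : regionName + ": opened in normal mode";
    }
}
```

With a tracker created as new RecoveryModeTracker(LOG_SPLITTING, LOG_REPLAY, 2),
regions keep opening in normal mode until taskFinished() has been called twice,
and only then does the master start handing out LOG_REPLAY.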
Thanks
> Distributed log replay is incompatible for rolling restarts
> -----------------------------------------------------------
>
> Key: HBASE-11094
> URL: https://issues.apache.org/jira/browse/HBASE-11094
> Project: HBase
> Issue Type: Sub-task
> Reporter: Enis Soztutar
> Assignee: Jeffrey Zhong
> Priority: Blocker
> Fix For: 0.99.0
>
> Attachments: hbase-11094-v2.patch, hbase-11094-v3.patch,
> hbase-11094-v4.patch, hbase-11094.patch
>
>
> 0.99.0 comes with dist log replay by default (HBASE-10888). However, reading
> the code and discussing this with Jeffrey, we realized that the dist log
> replay code is not compatible with rolling upgrades from 0.98.0 and 1.0.0.
> The issue is that the region server looks at its own configuration to decide
> whether the region should be opened in replay mode or not. The open region
> RPC does not contain that info. So if dist log replay is enabled on master,
> the master will assign the region and schedule replay tasks. If the region is
> opened on an RS that does not have this conf enabled, then it will happily
> open the region in normal mode (not replay mode) causing possible (transient)
> data loss.