[
https://issues.apache.org/jira/browse/HBASE-22628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872685#comment-16872685
]
Zach York commented on HBASE-22628:
-----------------------------------
When the custom WAL directory was added, it was assumed to be a
backwards-incompatible change that required a clean shutdown beforehand.
However, maybe it is time to add some backwards compatibility?
We are never going to be able to handle every case here, since that would
require knowing what the WAL dir had been set to previously, but maybe it would
be enough to add a check of the default location.
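A minimal sketch of what such a check of the default location could look like. This is not the actual HBase code: the class and method names are hypothetical, java.nio.file.Path stands in for the Hadoop Path type, and the "WALs" directory name is assumed to match HConstants.HREGION_LOGDIR_NAME. It only computes which log directories the master would need to scan for leftover server folders; checking existence and listing the folders is left to the caller.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase.
final class WalDirCandidates {
    // Assumed to match HConstants.HREGION_LOGDIR_NAME.
    static final String LOG_DIR_NAME = "WALs";

    /**
     * Returns the log directories to scan for dead-server WAL folders:
     * the configured WAL root first, then the default location under the
     * HBase root dir if it is a different path.
     */
    static List<Path> candidateLogDirs(Path rootDir, Path walRootDir) {
        List<Path> dirs = new ArrayList<>();
        dirs.add(walRootDir.resolve(LOG_DIR_NAME));
        Path defaultLogDir = rootDir.resolve(LOG_DIR_NAME);
        if (!dirs.contains(defaultLogDir)) {
            dirs.add(defaultLogDir);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // Custom WAL dir outside the root dir: both locations are returned.
        System.out.println(candidateLogDirs(Paths.get("/hbase"), Paths.get("/hbaseWAL")));
        // WAL dir equal to the root dir (the default): only one location.
        System.out.println(candidateLogDirs(Paths.get("/hbase"), Paths.get("/hbase")));
    }
}
```

This only covers a move away from the default location. A cluster that moves from one custom WAL dir to another would still need the previous value from somewhere.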
The other option is to create a separate migration tool.
> Data loss while migrating to custom WAL directory (hbase.wal.dir)
> -----------------------------------------------------------------
>
> Key: HBASE-22628
> URL: https://issues.apache.org/jira/browse/HBASE-22628
> Project: HBase
> Issue Type: Bug
> Components: Recovery, wal
> Reporter: Pankaj Kumar
> Assignee: Pankaj Kumar
> Priority: Blocker
>
> There is a data loss scenario when migrating to a custom WAL directory.
> Steps to reproduce:
> # Set up an HBase cluster with the default settings (all WAL files are under
> the root directory, i.e. /hbase/WALs).
> # Create table 't1' and insert a few records.
> # Flush the meta table (so that the table region entries persist in the FS).
> # Forcibly kill the HBase processes (HM & RS).
> # Set hbase.wal.dir to a location outside the root dir (say /hbaseWAL).
> # Start the HBase servers.
> # Scan 't1'.
> Ideally, HMaster should submit split tasks for the old RS(s) WAL files
> (created under /hbase/WALs) so that the old data is replayed.
> But currently, during HM startup we populate the previous dead servers only
> from the current WAL dir (hbase.wal.dir -> /hbaseWAL).
> In MasterFileSystem.getFailedServersFromLogFolders(),
> {code:java}
> Set<ServerName> getFailedServersFromLogFolders() {
>   boolean retrySplitting = !conf.getBoolean("hbase.hlog.split.skip.errors",
>       WALSplitter.SPLIT_SKIP_ERRORS_DEFAULT);
>   Set<ServerName> serverNames = new HashSet<ServerName>();
>   Path logsDirPath = new Path(this.walRootDir, HConstants.HREGION_LOGDIR_NAME);
>   do {
>     if (master.isStopped()) {
>       LOG.warn("Master stopped while trying to get failed servers.");
>       break;
>     }
>     try {
>       if (!this.walFs.exists(logsDirPath)) return serverNames;
>       FileStatus[] logFolders = FSUtils.listStatus(this.walFs, logsDirPath, null);
> {code}
> For backward compatibility, we should also consider the default WAL
> directory path.