[ https://issues.apache.org/jira/browse/HBASE-22628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pankaj Kumar updated HBASE-22628:
---------------------------------
Description:
Custom WAL directory usage must be documented; otherwise, migrating to a new
WAL directory path may lead to inconsistent data.
Consider the following scenario when migrating to a custom WAL directory.
# Set up an HBase cluster with the default settings (all WAL files are under
the root directory, i.e. /hbase/WALs).
# Create table 't1' and insert a few records.
# Flush the meta table (so that the table's region entries persist in the FS).
# Forcibly kill the HBase processes (HM & RS).
# Configure hbase.wal.dir to point outside the root dir (say /hbaseWAL).
# Start the HBase servers.
# Scan 't1'.
Ideally, HMaster should submit split tasks for the old RS WAL files (created
under /hbase/WALs) so that the old data is replayed. Currently, however, during
HM startup we populate the previous dead servers from the current WAL dir
(hbase.wal.dir -> /hbaseWAL), so the old WAL files are never picked up.
Because the WAL dir path is new, you need to copy the RegionServer WAL
directories manually to the new path.
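The manual copy can be sketched as follows. This is a minimal illustration in
plain Java against a local filesystem, not an HBase utility: the class
WalDirMigration and the method copyServerWalDirs are hypothetical names, and on
HDFS the same copy would be done with the Hadoop filesystem tooling rather than
java.nio.file. It assumes the cluster is stopped and that each RegionServer has
its own directory directly under the WALs directory, as in /hbase/WALs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not part of HBase. */
class WalDirMigration {

    /**
     * Copies every per-server directory found under oldWalsDir into newWalsDir
     * and returns the names of the server directories that were copied.
     */
    static List<String> copyServerWalDirs(Path oldWalsDir, Path newWalsDir) throws IOException {
        List<String> copied = new ArrayList<>();
        if (!Files.isDirectory(oldWalsDir)) {
            return copied; // nothing to migrate
        }
        Files.createDirectories(newWalsDir);

        // Collect the per-server directories first so the listing stream is closed early.
        List<Path> serverDirs;
        try (Stream<Path> children = Files.list(oldWalsDir)) {
            serverDirs = children.filter(Files::isDirectory).sorted().collect(Collectors.toList());
        }
        for (Path serverDir : serverDirs) {
            String name = serverDir.getFileName().toString();
            copyTree(serverDir, newWalsDir.resolve(name));
            copied.add(name);
        }
        return copied;
    }

    /** Recursively copies source into target; Files.walk visits parents before children. */
    private static void copyTree(Path source, Path target) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(source)) {
            entries = walk.collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Path dest = target.resolve(source.relativize(entry).toString());
            if (Files.isDirectory(entry)) {
                Files.createDirectories(dest);
            } else {
                Files.copy(entry, dest, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

For the scenario above, the call would look like
copyServerWalDirs(Paths.get("/hbase/WALs"), Paths.get("/hbaseWAL/WALs")), run
before the HBase servers are started. The destination /hbaseWAL/WALs is an
assumption based on the WALs subdirectory being created under hbase.wal.dir.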
was:
There is a data loss scenario when migrating to a custom WAL directory.
Steps to reproduce:
# Set up an HBase cluster with the default settings (all WAL files are under
the root directory, i.e. /hbase/WALs).
# Create table 't1' and insert a few records.
# Flush the meta table (so that the table's region entries persist in the FS).
# Forcibly kill the HBase processes (HM & RS).
# Configure hbase.wal.dir to point outside the root dir (say /hbaseWAL).
# Start the HBase servers.
# Scan 't1'.
Ideally, HMaster should submit split tasks for the old RS WAL files (created
under /hbase/WALs) so that the old data is replayed.
Currently, however, during HM startup we populate the previous dead servers
from the current WAL dir (hbase.wal.dir -> /hbaseWAL).
In MasterFileSystem.getFailedServersFromLogFolders(),
{code:java}
Set<ServerName> getFailedServersFromLogFolders() {
  boolean retrySplitting = !conf.getBoolean("hbase.hlog.split.skip.errors",
      WALSplitter.SPLIT_SKIP_ERRORS_DEFAULT);
  Set<ServerName> serverNames = new HashSet<ServerName>();
  // Only the currently configured WAL root is consulted here.
  Path logsDirPath = new Path(this.walRootDir, HConstants.HREGION_LOGDIR_NAME);
  do {
    if (master.isStopped()) {
      LOG.warn("Master stopped while trying to get failed servers.");
      break;
    }
    try {
      if (!this.walFs.exists(logsDirPath)) return serverNames;
      FileStatus[] logFolders = FSUtils.listStatus(this.walFs, logsDirPath, null);
      // ... (the excerpt ends here)
{code}
For backward compatibility, we should also consider the default WAL directory
path.
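One way to picture that suggestion is a scan that looks under both the
configured WAL root and the default root. The sketch below is not HBase code:
it uses java.nio.file on a local filesystem instead of the Hadoop FileSystem
API, and the names FailedServerScan, serverDirNames and LOG_DIR_NAME are
hypothetical. It only shows the idea of taking the union of the per-server
directory names found under each root.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical illustration only; not part of HBase. */
class FailedServerScan {

    // Assumed name of the WAL subdirectory under each root (as in /hbase/WALs).
    static final String LOG_DIR_NAME = "WALs";

    /**
     * Returns the union of per-server directory names found under
     * <root>/WALs for every given root. Roots without a WALs directory are skipped.
     */
    static Set<String> serverDirNames(Path... roots) throws IOException {
        Set<String> names = new TreeSet<>();
        for (Path root : roots) {
            Path logsDir = root.resolve(LOG_DIR_NAME);
            if (!Files.isDirectory(logsDir)) {
                continue; // this root has no WALs directory yet
            }
            try (Stream<Path> children = Files.list(logsDir)) {
                children.filter(Files::isDirectory)
                        .forEach(p -> names.add(p.getFileName().toString()));
            }
        }
        return names;
    }
}
```

Passing both the new WAL root (/hbaseWAL in the scenario) and the default root
(/hbase) would surface server directories left behind in the old location,
whereas passing only the new root reproduces the behaviour described above.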
> Document the custom WAL directory (hbase.wal.dir) usage
> -------------------------------------------------------
>
> Key: HBASE-22628
> URL: https://issues.apache.org/jira/browse/HBASE-22628
> Project: HBase
> Issue Type: Bug
> Components: documentation, wal
> Reporter: Pankaj Kumar
> Assignee: Pankaj Kumar
> Priority: Critical
>
> Custom WAL directory usage must be documented; otherwise, migrating to a new
> WAL directory path may lead to inconsistent data.
>
> Consider the following scenario when migrating to a custom WAL directory.
> # Set up an HBase cluster with the default settings (all WAL files are under
> the root directory, i.e. /hbase/WALs).
> # Create table 't1' and insert a few records.
> # Flush the meta table (so that the table's region entries persist in the FS).
> # Forcibly kill the HBase processes (HM & RS).
> # Configure hbase.wal.dir to point outside the root dir (say /hbaseWAL).
> # Start the HBase servers.
> # Scan 't1'.
> Ideally, HMaster should submit split tasks for the old RS WAL files (created
> under /hbase/WALs) so that the old data is replayed. Currently, however,
> during HM startup we populate the previous dead servers from the current WAL
> dir (hbase.wal.dir -> /hbaseWAL), so the old WAL files are never picked up.
>
> Because the WAL dir path is new, you need to copy the RegionServer WAL
> directories manually to the new path.
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)