Yizhou Wang created HBASE-27447:
-----------------------------------

             Summary: hlog loss results in data loss when the regionserver 
network card goes down
                 Key: HBASE-27447
                 URL: https://issues.apache.org/jira/browse/HBASE-27447
             Project: HBase
          Issue Type: Bug
          Components: master, wal
    Affects Versions: 2.2.7
            Reporter: Yizhou Wang


While testing HBase replication, I found that data held in memory was lost. 
Reading the HBase source code, I found that after HBase finishes the log split 
for the meta table, it deletes the entire WAL directory. However, the hlog that 
holds the memstore data is still in that directory, which eventually causes 
data loss.

The specific process is as follows:
 #  Put a few rows into an HBase table.
 #  Bring down the network card on all regionserver nodes.
 #  Bring the network card back up on all regionserver nodes; the regionservers 
will be killed.
 #  Restart the HBase cluster and scan the table; it contains no data.

The hmaster log prints:

      master.MasterWalManager: Log dir for server xxx does not exist

 

The splitLogDistributed function in SplitLogManager.java causes this issue. 
The hmaster first calls splitLogDistributed to publish the split task for the 
meta table. After the meta split task completes, the WAL directory is deleted. 
The hmaster then tries to publish a second, normal split task to recover the 
remaining data, but the WAL directory can no longer be found.
{code:java}
    waitForSplittingCompletion(batch, status);
      ...
    for (Path logDir : logDirs) {
      status.setStatus("Cleaning up log directory...");
      final FileSystem fs = logDir.getFileSystem(conf);
      try {
        if (fs.exists(logDir) && !fs.delete(logDir, false)) {
          LOG.warn("Unable to delete log src dir. Ignoring. " + logDir);
        }
      } catch (IOException ioe) {
        FileStatus[] files = fs.listStatus(logDir);
        if (files != null && files.length > 0) {
          LOG.warn("Returning success without actually splitting and "
              + "deleting all the log files in path " + logDir + ": "
              + Arrays.toString(files), ioe);
        } else {
          LOG.warn("Unable to delete log src dir. Ignoring. " + logDir, ioe);
        }
      }
    }{code}
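
The ordering described above can be illustrated with a small stand-alone sketch. It is not HBase code: it uses java.nio on a local temporary directory instead of the Hadoop FileSystem, it models "splitting" a WAL as simply consuming the file, and the class name, method names, file names, and the ".meta" suffix are all hypothetical. It only shows why the second split pass depends on the directory surviving the cleanup that follows the meta pass.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WalDirCleanupSketch {

    // Assumption for this sketch only: meta WAL files end in ".meta".
    static boolean isMetaWal(Path p) {
        return p.getFileName().toString().endsWith(".meta");
    }

    // Returns the direct children of dir.
    static List<Path> list(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.collect(Collectors.toList());
        }
    }

    /**
     * Runs the meta pass, cleans up the log directory, and returns how many
     * non-meta WAL files the second (normal) pass can still find.
     */
    static int walsLeftForSecondPass(boolean recursiveCleanup) {
        try {
            Path logDir = Files.createTempDirectory("rs-wal-sketch");
            Path metaWal = Files.createFile(logDir.resolve("rs1.1.meta"));
            Files.createFile(logDir.resolve("rs1.2")); // WAL with user data

            // Pass 1: only the meta WAL is split (modelled as consumed).
            Files.delete(metaWal);

            // Cleanup after pass 1.
            List<Path> leftovers = list(logDir);
            if (recursiveCleanup) {
                // Removes everything, including WALs that were never split.
                for (Path p : leftovers) {
                    Files.delete(p);
                }
                Files.delete(logDir);
            } else if (leftovers.isEmpty()) {
                // Guarded cleanup: only remove the directory once it is empty.
                Files.delete(logDir);
            }

            // Pass 2: look for the remaining (non-meta) WALs.
            int found = 0;
            if (Files.exists(logDir)) {
                for (Path p : list(logDir)) {
                    if (!isMetaWal(p)) {
                        found++;
                    }
                    Files.delete(p); // tidy up the temporary file
                }
                Files.delete(logDir);
            }
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("recursive cleanup: " + walsLeftForSecondPass(true));
        System.out.println("guarded cleanup:   " + walsLeftForSecondPass(false));
    }
}
```

In this model, a cleanup that removes the whole directory after the meta pass leaves nothing for the second pass, while a cleanup that only removes an empty directory leaves the user-data WAL in place.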
   My English is poor, so if anything is unclear, please leave me a message.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
