Under the current hdfs Version, there's no related method to judge whether the 
namenode is in safemode.
Maybe we can handle the SafeModeException at the top layer where called the 
method of checkFileSystem(),  wait for a while and retry the operation.
Does that reasonable? I hope someone can give some advises.

Thanks!
Jieshan Bean

-----邮件原件-----
发件人: [email protected] [mailto:[email protected]] 代表 Stack
发送时间: 2011年4月24日 5:55
收件人: [email protected]
抄送: Chenjian
主题: Re: Splitlog() executed while the namenode was in safemode may cause 
data-loss

Sorry, what did you change?
Thanks,
St.Ack

On Fri, Apr 22, 2011 at 9:00 PM, bijieshan <[email protected]> wrote:
> Hi,
> I found this problem while the namenode went into safemode due to some 
> unclear reasons.
> There's one patch about this problem:
>
>   try {
>      HLogSplitter splitter = HLogSplitter.createLogSplitter(
>        conf, rootdir, logDir, oldLogDir, this.fs);
>      try {
>        splitter.splitLog();
>      } catch (OrphanHLogAfterSplitException e) {
>        LOG.warn("Retrying splitting because of:", e);
>        // An HLogSplitter instance can only be used once.  Get new instance.
>        splitter = HLogSplitter.createLogSplitter(conf, rootdir, logDir,
>          oldLogDir, this.fs);
>        splitter.splitLog();
>      }
>      splitTime = splitter.getTime();
>      splitLogSize = splitter.getSize();
>    } catch (IOException e) {
>      checkFileSystem();
>      LOG.error("Failed splitting " + logDir.toString(), e);
>      master.abort("Shutting down HBase cluster: Failed splitting hlog 
> files...", e);
>    } finally {
>      this.splitLogLock.unlock();
>    }
>
> And it was really give some useful help to some extent, while the namenode 
> process exited or been killed, but not considered the Namenode safemode 
> exception.
>   I think the root reason is the method of checkFileSystem().
>   It gives out an method to check whether the HDFS works normally(Read and 
> write could be success), and that maybe the original propose of this method. 
> This's how this method implements:
>
>    DistributedFileSystem dfs = (DistributedFileSystem) fs;
>    try {
>      if (dfs.exists(new Path("/"))) {
>        return;
>      }
>    } catch (IOException e) {
>      exception = RemoteExceptionHandler.checkIOException(e);
>    }
>
>   I have check the hdfs code, and learned that while the namenode was in 
> safemode ,the dfs.exists(new Path("/")) returned true. Because the file 
> system could provide read-only service. So this method just checks the dfs 
> whether could be read. I
> think it's not reasonable.
>
>
> Regards,
> Jieshan Bean
>
>
>
>
>
>
>
>
>
>
>
>
>

Reply via email to