NPE in FSNamesystem when in safe mode
-------------------------------------

                 Key: HDFS-2838
                 URL: https://issues.apache.org/jira/browse/HDFS-2838
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: HA branch (HDFS-1623)
            Reporter: Gregory Chanan


I'm seeing an NPE when running HBase 0.92 unit tests against the HA branch.
The test failure is: org.apache.hadoop.hbase.regionserver.wal.TestHLog.testAppendClose.

Here is the backtrace:
java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:179)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getActiveBlockCount(BlockManager.java:2465)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.doConsistencyCheck(FSNamesystem.java:3591)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3285)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$900(FSNamesystem.java:3196)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3670)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.isInSafeMode(NameNode.java:609)
        at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:1476)
        at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:1487)

Here is the relevant section of the test:

{code}
    try {
      DistributedFileSystem dfs = (DistributedFileSystem) cluster.getFileSystem();
      dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_ENTER);
      cluster.shutdown();
      try {
        // wal.writer.close() will throw an exception,
        // but still call this since it closes the LogSyncer thread first
        wal.close();
      } catch (IOException e) {
        LOG.info(e);
      }
      fs.close(); // closing FS last so DFSOutputStream can't call close
      LOG.info("STOPPED first instance of the cluster");
    } finally {
      // Restart the cluster
      while (cluster.isClusterUp()){
        LOG.error("Waiting for cluster to go down");
        Thread.sleep(1000);
      }
{code}

The fix looks trivial; I will include a patch shortly.
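
For illustration only, here is a minimal sketch of the kind of failure the backtrace suggests. It assumes the block map's backing storage is released (set to null) on shutdown and that a later size() call dereferences it. The class SketchBlocksMap and its methods are hypothetical stand-ins. They are not the real BlocksMap code and not the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for a block map; NOT the real HDFS BlocksMap. */
class SketchBlocksMap {
    // Assumption: the backing map is dropped (set to null) when the map is closed.
    private Map<Long, String> blocks = new HashMap<>();

    /** Records a block; silently ignored once the map has been closed. */
    void add(long blockId, String info) {
        if (blocks != null) {
            blocks.put(blockId, info);
        }
    }

    /** Simulates shutdown releasing the backing storage. */
    void close() {
        blocks = null;
    }

    /** Unguarded variant: throws NullPointerException if called after close(). */
    int unsafeSize() {
        return blocks.size();
    }

    /** Guarded variant: reports 0 once the backing storage has been released. */
    int size() {
        Map<Long, String> current = blocks;
        return current == null ? 0 : current.size();
    }
}
```

With a guard like this, a status check such as isClusterUp() that runs after shutdown() would see a block count of 0 instead of throwing. Whether returning 0 is the right behavior for the safe mode consistency check is a design question for the real patch.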


        
