ProcessServerShutdown throws NullPointerException for offline regions
---------------------------------------------------------------------

                 Key: HBASE-2497
                 URL: https://issues.apache.org/jira/browse/HBASE-2497
             Project: Hadoop HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.20.3
            Reporter: Miklos Kurucz

When a region server dies, the master can run into the following bug.

2010-04-27 17:20:37,303 DEBUG org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessServerShutdown of dell106.cluster,60020,1272377612991
2010-04-27 17:20:37,303 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server dell106.cluster,60020,1272377612991: logSplit: true, rootRescanned: true, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
2010-04-27 17:20:01,637 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete, meta reassignment and scanning:
2010-04-27 17:20:01,653 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanRootRegion: process server shutdown scanning root region on 10.1.3.124
2010-04-27 17:20:01,664 DEBUG org.apache.hadoop.hbase.master.RegionServerOperation: process server shutdown scanning root region on 10.1.3.124 finished master
2010-04-27 17:20:01,683 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions: process server shutdown scanning .META.,,1 on 10.1.3.104:60020
2010-04-27 17:20:18,087 DEBUG org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions: Exception in RetryableMetaOperation:
2010-04-27 17:20:18,118 WARN org.apache.hadoop.hbase.master.HMaster: Adding to delayed queue: ProcessServerShutdown of dell106.cluster,60020,1272377612991
java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.master.RetryableMetaOperation.doWithRetries(RetryableMetaOperation.java:100)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:345)
        at org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:509)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:448)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:487)
        at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:461)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown.scanMetaRegion(ProcessServerShutdown.java:147)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions.call(ProcessServerShutdown.java:264)
        at org.apache.hadoop.hbase.master.ProcessServerShutdown$ScanMetaRegions.call(ProcessServerShutdown.java:250)
        at org.apache.hadoop.hbase.master.RetryableMetaOperation.doWithRetries(RetryableMetaOperation.java:69)
        ... 3 more

The problem is in ProcessServerShutdown.java at lines 148-149:

146          String serverAddress =
147            Bytes.toString(values.getValue(CATALOG_FAMILY, SERVER_QUALIFIER));
148          long startCode =
149            Bytes.toLong(values.getValue(CATALOG_FAMILY, STARTCODE_QUALIFIER));
150          String serverName = null;
151          if (serverAddress != null && serverAddress.length() > 0) {
152            serverName = HServerInfo.getServerName(serverAddress, startCode);
153          }

It should be modified so that the start code is read only inside the null check (original lines 148-149 move after line 151):

146          String serverAddress =
147            Bytes.toString(values.getValue(CATALOG_FAMILY, SERVER_QUALIFIER));
150          String serverName = null;
151          if (serverAddress != null && serverAddress.length() > 0) {
148            long startCode =
149              Bytes.toLong(values.getValue(CATALOG_FAMILY, STARTCODE_QUALIFIER));
152            serverName = HServerInfo.getServerName(serverAddress, startCode);
153          }

This change is needed because Bytes.toLong cannot handle the null pointer that getValue returns when STARTCODE_QUALIFIER is missing, which is the case for offline regions in META.
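
The reordering proposed above can be tried outside HBase with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the class StartCodeGuard and its helpers toLong, toStringOrNull and formatServerName are hypothetical stand-ins for Bytes.toLong, Bytes.toString and HServerInfo.getServerName, and the name format is only modelled on the server names visible in the log. It is not the HBase code itself.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone illustration of the proposed reordering.
// None of these helpers are the real HBase classes.
class StartCodeGuard {

    // Stand-in for Bytes.toLong: dereferences its argument, so a null
    // cell value fails with NullPointerException, as in the stack trace.
    static long toLong(byte[] bytes) {
        if (bytes.length != Long.BYTES) { // throws NullPointerException when bytes == null
            throw new IllegalArgumentException("expected 8 bytes");
        }
        return ByteBuffer.wrap(bytes).getLong();
    }

    // Stand-in for Bytes.toString, assumed here to map null to null.
    static String toStringOrNull(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    // Stand-in for HServerInfo.getServerName; the "host,port,startcode"
    // format is an assumption based on the names shown in the log.
    static String formatServerName(String serverAddress, long startCode) {
        return serverAddress.replace(':', ',') + "," + startCode;
    }

    // Patched order: the start code is decoded only after the server
    // address is known to be present, so rows of offline regions
    // (both cells missing) never reach toLong.
    static String serverName(byte[] serverCell, byte[] startCodeCell) {
        String serverAddress = toStringOrNull(serverCell);
        String serverName = null;
        if (serverAddress != null && serverAddress.length() > 0) {
            long startCode = toLong(startCodeCell);
            serverName = formatServerName(serverAddress, startCode);
        }
        return serverName;
    }

    public static void main(String[] args) {
        // Offline region: neither cell is present in the row.
        System.out.println(serverName(null, null));

        // Online region: both cells are present.
        byte[] address = "10.1.3.104:60020".getBytes(StandardCharsets.UTF_8);
        byte[] startCode = ByteBuffer.allocate(Long.BYTES).putLong(1272377612991L).array();
        System.out.println(serverName(address, startCode));
    }
}
```

Note that the guard only covers rows where both cells are absent. A row with a server address but no start code would still fail in toLong; whether such rows can occur in META is not stated in this report.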