The below is hard to read but I think I follow you Gao.  Makes sense.
Do you have a patch?  (How did you manufacture the below?  Was it by
accident?  Its a good one).

St.Ack

On Fri, May 27, 2011 at 7:04 PM, Gaojinchao <[email protected]> wrote:
> issue:
> NullPointerException while hmaster starting.
>       java.lang.NullPointerException
>         at java.util.TreeMap.getEntry(TreeMap.java:324)
>         at java.util.TreeMap.get(TreeMap.java:255)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.addToServers(AssignmentManager.java:1512)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:606)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:214)
>         at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:402)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:283)
>
> // below is my analysis
> private void finishInitialization()
> throws IOException, InterruptedException, KeeperException
> {
>
>    /*
>        In the function, waitForRegionServers return when some regionservers 
> checkin.
>        The regionservers include some regions, but not root-region and 
> meta-region.
>        (the Registered servers are 157-5-111-13, 157-5-111-14, 157-5-111-12)
>    */
>    int regionCount = his.serverManager.waitForRegionServers();
>
>    /*
>        In the function, we can know 
> verifyRootRegionLocation/verifyMetaRegionLocation are succeed with the log of 
> assigned=0,
>        and the root-regionserver and meta-regionserver are running.
>        (The 157-5-111-11 server is running, but not Registered. and server 
> include root-region and meta-region.)
>    */
>    assignRootAndMeta();
>
>    if (regionCount == 0) {
>      LOG.info("Master startup proceeding: cluster startup");
>      this.assignmentManager.cleanoutUnassigned();
>      this.assignmentManager.assignAllUserRegions();
>    } else {
>      LOG.info("Master startup proceeding: master failover");
>
>      /*
>        HServerInfo hsi = 
> this.serverManager.getHServerInfo(this.catalogTracker.getMetaLocation());
>        In processFailover function, call getHServerInfo return NULL pointer, 
> hsi is NULL pointer.
>        Because the regionserver with meta-region has not Registered.
>      */
>      this.assignmentManager.processFailover();
>    }
> }
>
> the exception logs:
>
> 2011-05-21 14:44:47,973 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Waiting on regionserver(s) to checkin
> 2011-05-21 14:44:49,473 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Waiting on regionserver(s) to checkin
> 2011-05-21 14:44:50,974 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Waiting on regionserver(s) to checkin
> 2011-05-21 14:44:51,281 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Registering server=157-5-111-13,20020,1305877624933, regionCount=1543, 
> userLoad=true
> 2011-05-21 14:44:51,722 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Registering server=157-5-111-14,20020,1305877627727, regionCount=1507, 
> userLoad=true
> 2011-05-21 14:44:51,774 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Registering server=157-5-111-12,20020,1305877626108, regionCount=1521, 
> userLoad=true
> 2011-05-21 14:44:52,474 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Waiting on regionserver(s) count to settle; currently=3
> 2011-05-21 14:44:53,974 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Finished waiting for regionserver count to settle; count=3, sleptFor=18000
> 2011-05-21 14:44:53,974 INFO org.apache.hadoop.hbase.master.ServerManager: 
> Exiting wait on regionserver(s) to checkin; count=3, stopped=false, count of 
> regions out on cluster=4571
> 2011-05-21 14:44:53,978 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161 doesn't 
> belong to a known region server, splitting
> 2011-05-21 14:44:53,991 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting 2 hlog(s) in 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161
> 2011-05-21 14:44:53,992 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-0,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-2,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-3,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-1,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-6,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-5,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-4,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-8,5,main]: starting
> 2011-05-21 14:44:53,993 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-7,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-9,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-11,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-10,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-12,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-13,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-14,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread 
> Thread[WriterThread-15,5,main]: starting
> 2011-05-21 14:44:53,994 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 1 of 2: 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305875949857,
>  length=1503902
> 2011-05-21 14:44:53,996 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering 
> file 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305875949857
> 2011-05-21 14:44:54,998 INFO org.apache.hadoop.hbase.util.FSUtils: Finished 
> lease recover attempt for 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305875949857
> 2011-05-21 14:44:55,140 INFO 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs 
> -- HDFS-200
> 2011-05-21 14:44:55,141 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Creating writer 
> path=hdfs://157.5.111.10:9000/hbase/-ROOT-/70236052/recovered.edits/0000000000000021721
>  region=70236052
> 2011-05-21 14:44:55,177 INFO 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs 
> -- HDFS-200
> 2011-05-21 14:44:55,177 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Creating writer 
> path=hdfs://157.5.111.10:9000/hbase/.META./1028785192/recovered.edits/0000000000000021722
>  region=1028785192
> 2011-05-21 14:44:55,435 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Pushed=5400 entries 
> from 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305875949857
> 2011-05-21 14:44:55,436 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 2 of 2: 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305879556259,
>  length=0
> 2011-05-21 14:44:55,436 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering 
> file 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305879556259
> 2011-05-21 14:44:56,439 INFO org.apache.hadoop.hbase.util.FSUtils: Finished 
> lease recover attempt for 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305879556259
> 2011-05-21 14:44:56,439 WARN 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: File 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305879556259
>  might be still open, length is 0
> 2011-05-21 14:44:56,451 DEBUG 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Pushed=0 entries from 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305879556259
> 2011-05-21 14:44:56,489 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Archived processed log 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305875949857
>  to hdfs://157.5.111.10:9000/hbase/.oldlogs/157-5-111-11%3A20020.1305875949857
> 2011-05-21 14:44:56,506 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Archived processed log 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161/157-5-111-11%3A20020.1305879556259
>  to hdfs://157.5.111.10:9000/hbase/.oldlogs/157-5-111-11%3A20020.1305879556259
> 2011-05-21 14:44:56,523 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Waiting for split 
> writer threads to finish
> 2011-05-21 14:44:57,438 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Split writers finished
> 2011-05-21 14:44:57,455 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path 
> hdfs://157.5.111.10:9000/hbase/.META./1028785192/recovered.edits/0000000000000021722
>  (wrote 5399 edits in 322ms)
> 2011-05-21 14:44:57,505 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path 
> hdfs://157.5.111.10:9000/hbase/-ROOT-/70236052/recovered.edits/0000000000000021721
>  (wrote 1 edits in 106ms)
> 2011-05-21 14:44:57,505 INFO 
> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: hlog file splitting 
> completed in 3517 ms for 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-11,20020,1305875930161
> 2011-05-21 14:44:57,506 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-12,20020,1305877626108 belongs 
> to an existing region server
> 2011-05-21 14:44:57,506 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-13,20020,1305877624933 belongs 
> to an existing region server
> 2011-05-21 14:44:57,506 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://157.5.111.10:9000/hbase/.logs/157-5-111-14,20020,1305877627727 belongs 
> to an existing region server
> 2011-05-21 14:44:57,524 INFO org.apache.hadoop.hbase.master.HMaster: -ROOT- 
> assigned=0, rit=false, location=157-5-111-11:20020
> 2011-05-21 14:44:57,528 INFO org.apache.hadoop.hbase.master.HMaster: .META. 
> assigned=0, rit=false, location=157-5-111-11:20020
> 2011-05-21 14:44:57,528 INFO org.apache.hadoop.hbase.master.HMaster: Master 
> startup proceeding: master failover
> 2011-05-21 14:44:57,530 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> java.lang.NullPointerException
>         at java.util.TreeMap.getEntry(TreeMap.java:324)
>         at java.util.TreeMap.get(TreeMap.java:255)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.addToServers(AssignmentManager.java:1512)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:606)
>         at 
> org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:214)
>         at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:402)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:283)
> 2011-05-21 14:44:57,531 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2011-05-21 14:44:57,531 DEBUG org.apache.hadoop.hbase.master.HMaster: 
> Stopping service threads
>

Reply via email to