I see the master is waiting and I see the exceptions, but there's no context and the timestamps don't match (what happened in the region server at the time the second master took over?). Can you explain exactly what was done to get into that state? Also, could you please tell us which HBase version you're using? A couple of quick checks are sketched after the quoted logs below.
J-D

On Thu, Feb 24, 2011 at 11:04 PM, Gaojinchao <[email protected]> wrote:
> The ZooKeeper znode has been updated to the new HMaster node, but the region server still connects to the old one.
> Do I need to change any configuration for HBase?
>
> HMaster's log:
> 2011-02-25 14:14:42,513 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 60010
> 2011-02-25 14:14:42,513 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 60010 webServer.getConnectors()[0].getLocalPort() returned 60010
> 2011-02-25 14:14:42,513 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 60010
> 2011-02-25 14:14:42,514 INFO org.mortbay.log: jetty-6.1.26
> 2011-02-25 14:14:42,743 INFO org.mortbay.log: Started [email protected]:60010
> 2011-02-25 14:14:42,744 DEBUG org.apache.hadoop.hbase.master.HMaster: Started service threads
> 2011-02-25 14:14:44,244 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to checkin
> 2011-02-25 14:14:45,744 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to checkin
> 2011-02-25 14:14:47,245 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to checkin
> 2011-02-25 14:14:48,745 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting on regionserver(s) to checkin
>
> Region server's log:
>
> 2011-02-25 13:22:38,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://C4C1:9000/hbase/ufdr5/109de857846354947e0664b2edd5f544/value/6813060688172507178, isReference=false, isBulkLoadResult=false, seqid=60129691, majorCompaction=true
> 2011-02-25 13:22:38,138 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://C4C1:9000/hbase/ufdr5/1a0f0d2085977a1dbdf9e9899013deef/value/5446694695136514012.6231cf1c5312c87f5ff55b64b3109491; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify!
> java.io.IOException: Cannot open filename /hbase/ufdr5/6231cf1c5312c87f5ff55b64b3109491/value/5446694695136514012
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1493)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1484)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:380)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:398)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.<init>(HFile.java:748)
>         at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:899)
>         at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:375)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
>         at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:266)
>         at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:208)
>         at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1963)
>         at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:343)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2506)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2492)
>         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:262)
>         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:94)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> 2011-02-25 13:22:38,138 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined ufdr5,8613815537639#2927,1298400219634.109de857846354947e0664b2edd5f544.; next sequenceid=60129692
> 2011-02-25 13:22:38,139 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12e5a47a4e4000a Attempting to transition node 109de857846354947e0664b2edd5f544 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
> 2011-02-25 13:22:38,139 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://C4C1:9000/hbase/ufdr5/86ac13dbf1e0287a9d6f5dc6caf452c8/value/3876832662736680001.9fe3127004d2ce5351da083fa2b824b5; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify!
> java.io.IOException: Cannot open filename /hbase/ufdr5/9fe3127004d2ce5351da083fa2b824b5/value/3876832662736680001
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1493)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1484)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:380)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:398)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.<init>(HFile.java:748)
>         at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:899)
>         at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:375)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
>         at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:266)
>         at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:208)
>         at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1963)
>         at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:343)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2506)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2492)
>         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:262)
>         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:94)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> 2011-02-25 13:22:38,150 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://C4C1:9000/hbase/ufdr5/86ac13dbf1e0287a9d6f5dc6caf452c8/value/7758939538437314246.9fe3127004d2ce5351da083fa2b824b5; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify!
> java.io.IOException: Cannot open filename /hbase/ufdr5/9fe3127004d2ce5351da083fa2b824b5/value/7758939538437314246
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1493)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1484)
>         at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:380)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:398)
>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.<init>(HFile.java:748)
>         at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:899)
>         at org.apache.hadoop.hbase.io.HalfStoreFileReader.<init>(HalfStoreFileReader.java:65)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:375)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:438)
>         at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:266)
>         at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:208)
>         at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1963)
>         at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:343)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2506)
>         at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2492)
>         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:262)
>         at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:94)
>         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
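One quick way to confirm which master the cluster currently considers active is to read the master znode directly. A minimal sketch, assuming the default zookeeper.znode.parent of /hbase and the stock ZooKeeper CLI (substitute your own quorum host and port):

    # connect to one ZooKeeper server and dump the active-master znode
    $ZOOKEEPER_HOME/bin/zkCli.sh -server <zk-host>:2181
    [zk: <zk-host>:2181(CONNECTED) 0] get /hbase/master

The data in that znode should name the new active master. If it does, but the region servers keep reporting to the old address, their logs from around the time of the failover (not just 13:22) would help show why they never re-registered with the new one.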

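For the "Cannot open filename" warnings in the region server log, one way to act on the "Verify!" hint is to check whether the referenced parent-region store files still exist in HDFS: the failing readers are HalfStoreFileReaders, i.e. reference files left over from a split that point at HFiles in the parent region. A rough example using a path taken from the log above (assumes the hadoop client on that box points at the same namenode, hdfs://C4C1:9000):

    # list the parent region's store files named in the exceptions
    hadoop fs -ls /hbase/ufdr5/9fe3127004d2ce5351da083fa2b824b5/value/

If the files named in the exceptions are missing from that directory, the reference files cannot be opened and those stores will keep failing to load until the references are resolved.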