For sure you are removing the hbase dir in hdfs? Try major compaction of your .META. table?
hbase> major_compact ".META." You seem to be suffering HBASE-1880 but if you are removing the hbase dir, you shouldn't be running into this. St.Ack On Tue, Jun 29, 2010 at 9:26 AM, Stanislaw Kogut <[email protected]> wrote: > Yes, I doing hadoop fs -rmr /hbase each time. > > So, here is some messages from master logs: > > 2010-06-29 19:15:11,309 INFO org.apache.hadoop.hbase.master.ServerManager: 5 > region servers, 0 dead, average load 0.8 > 2010-06-29 19:15:12,146 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.rootScanner scanning meta region {server: > 192.168.242.144:60020, regionname: -ROOT-,,0, startKey: <>} > 2010-06-29 19:15:12,151 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > Current assignment of .META.,,1 is not valid; serverAddress= > 192.168.242.146:60020, startCode=1277827873580 unknown. > 2010-06-29 19:15:12,152 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.rootScanner scan of 1 row(s) of meta region {server: > 192.168.242.144:60020, regionname: -ROOT-,,0, startKey: <>} complete > 2010-06-29 19:15:12,304 DEBUG org.apache.hadoop.hbase.master.RegionManager: > Assigning for address: 192.168.242.144:60020, startcode: 1277827873598, > load: (requests=5, regions=1, usedHeap=33, maxHeap=1996): total nregions to > assign=0, regions to give other servers than this=1, isMetaAssign=true > 2010-06-29 19:15:12,304 DEBUG org.apache.hadoop.hbase.master.RegionManager: > Assigning address: 192.168.242.144:60020, startcode: 1277827873598, load: > (requests=5, regions=1, usedHeap=33, maxHeap=1996) 0 regions > 2010-06-29 19:15:12,304 INFO org.apache.hadoop.hbase.master.RegionManager: > Assigning region .META.,,1 to uasstse004.ua.sistyma.com.,60020,1277827873598 > 2010-06-29 19:15:12,329 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_OPEN: .META.,,1 from > uasstse004.ua.sistyma.com.,60020,1277827873598; > 1 of 1 > 2010-06-29 19:15:12,329 DEBUG org.apache.hadoop.hbase.master.HMaster: > Processing todo: PendingOpenOperation from uasstse004.ua.sistyma.com > .,60020,1277827873598 > 2010-06-29 19:15:12,329 INFO > org.apache.hadoop.hbase.master.RegionServerOperation: .META.,,1 open on > 192.168.242.144:60020 > 2010-06-29 19:15:12,331 INFO > org.apache.hadoop.hbase.master.RegionServerOperation: Updated row .META.,,1 > in region -ROOT-,,0 with startcode=1277827873598, server= > 192.168.242.144:60020 > 2010-06-29 19:15:12,331 DEBUG > org.apache.hadoop.hbase.master.RegionServerOperation: Adding to > onlineMetaRegions: {server: 192.168.242.144:60020, regionname: .META.,,1, > startKey: <>} > 2010-06-29 19:15:12,331 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scanning meta region {server: > 192.168.242.144:60020, regionname: .META.,,1, startKey: <>} > 2010-06-29 19:15:12,333 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scan of 0 row(s) of meta region {server: > 192.168.242.144:60020, regionname: .META.,,1, startKey: <>} complete > 2010-06-29 19:15:12,333 INFO org.apache.hadoop.hbase.master.BaseScanner: All > 1 .META. region(s) scanned > > then, after some time of inactivity (no regionserver failures, no > stop/starts, nothing) > > 2010-06-29 19:16:11,317 INFO org.apache.hadoop.hbase.master.ServerManager: 5 > region servers, 0 dead, average load 1.0 > 2010-06-29 19:16:12,153 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.rootScanner scanning meta region {server: > 192.168.242.144:60020, regionname: -ROOT-,,0, startKey: <>} > 2010-06-29 19:16:12,162 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > Current assignment of .META.,,1 is not valid; serverAddress= > 192.168.242.144:60020, startCode=1277827873598 unknown. > 2010-06-29 19:16:12,164 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.rootScanner scan of 1 row(s) of meta region {server: > 192.168.242.144:60020, regionname: -ROOT-,,0, startKey: <>} complete > 2010-06-29 19:16:12,303 DEBUG org.apache.hadoop.hbase.master.RegionManager: > Assigning for address: 192.168.242.142:60020, startcode: 1277827873624, > load: (requests=0, regions=0, usedHeap=30, maxHeap=1996): total nregions to > assign=1, regions to give other servers than this=0, isMetaAssign=true > 2010-06-29 19:16:12,303 DEBUG org.apache.hadoop.hbase.master.RegionManager: > Assigning address: 192.168.242.142:60020, startcode: 1277827873624, load: > (requests=0, regions=0, usedHeap=30, maxHeap=1996) 1 regions > 2010-06-29 19:16:12,303 INFO org.apache.hadoop.hbase.master.RegionManager: > Assigning region .META.,,1 to uasstse002.ua.sistyma.com.,60020,1277827873624 > 2010-06-29 19:16:12,340 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scanning meta region {server: > 192.168.242.144:60020, regionname: .META.,,1, startKey: <>} > 2010-06-29 19:16:12,342 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scan of 0 row(s) of meta region {server: > 192.168.242.144:60020, regionname: .META.,,1, startKey: <>} complete > 2010-06-29 19:16:12,342 INFO org.apache.hadoop.hbase.master.BaseScanner: All > 1 .META. region(s) scanned > 2010-06-29 19:16:12,369 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_OPEN: .META.,,1 from > uasstse002.ua.sistyma.com.,60020,1277827873624; > 1 of 1 > 2010-06-29 19:16:12,369 DEBUG org.apache.hadoop.hbase.master.HMaster: > Processing todo: PendingOpenOperation from uasstse002.ua.sistyma.com > .,60020,1277827873624 > 2010-06-29 19:16:12,369 INFO > org.apache.hadoop.hbase.master.RegionServerOperation: .META.,,1 open on > 192.168.242.142:60020 > 2010-06-29 19:16:12,370 INFO > org.apache.hadoop.hbase.master.RegionServerOperation: Updated row .META.,,1 > in region -ROOT-,,0 with startcode=1277827873624, server= > 192.168.242.142:60020 > 2010-06-29 19:16:12,370 DEBUG > org.apache.hadoop.hbase.master.RegionServerOperation: Adding to > onlineMetaRegions: {server: 192.168.242.142:60020, regionname: .META.,,1, > startKey: <>} > 2010-06-29 19:16:12,371 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scanning meta region {server: > 192.168.242.142:60020, regionname: .META.,,1, startKey: <>} > 2010-06-29 19:16:12,401 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scan of 0 row(s) of meta region {server: > 192.168.242.142:60020, regionname: .META.,,1, startKey: <>} complete > 2010-06-29 19:16:12,401 INFO org.apache.hadoop.hbase.master.BaseScanner: All > 1 .META. region(s) scanned > > and so on. > > Same operations for TestTable after it was created: > 2010-06-29 19:22:12,199 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > Current assignment of .META.,,1 is not valid; serverAddress= > 192.168.242.145:60020, startCode=1277827873614 unknown. > 2010-06-29 19:22:12,200 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.rootScanner scan of 1 row(s) of meta region {server: > 192.168.242.144:60020, regionname: -ROOT-,,0, startKey: <>} complete > 2010-06-29 19:22:12,216 INFO org.apache.hadoop.hbase.master.BaseScanner: > RegionManager.metaScanner scanning meta region {server: > 192.168.242.145:60020, regionname: .META.,,1, startKey: <>} > 2010-06-29 19:22:12,224 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > Current assignment of TestTable,,1277828486199 is not valid; serverAddress= > 192.168.242.142:60020, startCode=1277827873624 unknown. > 2010-06-29 19:22:12,227 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > Current assignment of TestTable,0000149568,1277828511285 is not valid; > serverAddress=192.168.242.142:60020, startCode=1277827873624 unknown. > 2010-06-29 19:22:12,230 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > Current assignment of TestTable,0000332096,1277828511285 is not valid; > serverAddress=192.168.242.142:60020, startCode=1277827873624 unknown. > 2010-06-29 19:22:12,233 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > TestTable,,1277828486199/1681284195 no longer has references to > TestTable,,1277828317329 > 2010-06-29 19:22:12,239 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > TestTable,0000149568,1277828486199/1743937205 no longer has references to > TestTable,,1277828317329 > 2010-06-29 19:22:12,239 INFO org.apache.hadoop.hbase.master.BaseScanner: > Deleting region TestTable,,1277828317329 (encoded=1736533012) because > daughter splits no longer hold references > 2010-06-29 19:22:12,240 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > DELETING region hdfs:// > uasstse002.ua.sistyma.com:8020/hbase/TestTable/1736533012 > 2010-06-29 19:22:12,266 DEBUG org.apache.hadoop.hbase.master.RegionManager: > Assigning for address: 192.168.242.146:60020, startcode: 1277827873580, > load: (requests=0, regions=2, usedHeap=34, maxHeap=1196): total nregions to > assign=4, regions to give other servers than this=0, isMetaAssign=true > 2010-06-29 19:22:12,266 DEBUG org.apache.hadoop.hbase.master.RegionManager: > Assigning address: 192.168.242.146:60020, startcode: 1277827873580, load: > (requests=0, regions=2, usedHeap=34, maxHeap=1196) 4 regions > 2010-06-29 19:22:12,266 INFO org.apache.hadoop.hbase.master.RegionManager: > Assigning region TestTable,,1277828486199 to uasstse006.ua.sistyma.com > .,60020,1277827873580 > 2010-06-29 19:22:12,266 INFO org.apache.hadoop.hbase.master.RegionManager: > Assigning region TestTable,0000149568,1277828511285 to > uasstse006.ua.sistyma.com.,60020,1277827873580 > 2010-06-29 19:22:12,266 INFO org.apache.hadoop.hbase.master.RegionManager: > Assigning region TestTable,0000332096,1277828511285 to > uasstse006.ua.sistyma.com.,60020,1277827873580 > 2010-06-29 19:22:12,266 INFO org.apache.hadoop.hbase.master.RegionManager: > Assigning region .META.,,1 to uasstse006.ua.sistyma.com.,60020,1277827873580 > 2010-06-29 19:22:12,287 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_OPEN: TestTable,,1277828486199 from > uasstse006.ua.sistyma.com.,60020,1277827873580; 1 of 3 > 2010-06-29 19:22:12,287 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_PROCESS_OPEN: TestTable,0000332096,1277828511285 from > uasstse006.ua.sistyma.com.,60020,1277827873580; 2 of 3 > 2010-06-29 19:22:12,287 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_PROCESS_OPEN: .META.,,1 from > uasstse006.ua.sistyma.com.,60020,1277827873580; > 3 of 3 > 2010-06-29 19:22:12,287 DEBUG org.apache.hadoop.hbase.master.HMaster: > Processing todo: PendingOpenOperation from uasstse006.ua.sistyma.com > .,60020,1277827873580 > 2010-06-29 19:22:12,288 DEBUG > org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: > 1, onlineMetaRegions.size(): 1 > 2010-06-29 19:22:12,288 DEBUG > org.apache.hadoop.hbase.master.RegionServerOperation: Requeuing because not > all meta regions are online > 2010-06-29 19:22:12,291 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > TestTable,0000149568,1277828511285/436162763 no longer has references to > TestTable,0000149568,1277828486199 > 2010-06-29 19:22:12,295 DEBUG org.apache.hadoop.hbase.master.BaseScanner: > TestTable,0000332096,1277828511285/315170953 no longer has references to > TestTable,0000149568,1277828486199 > 2010-06-29 19:22:12,295 INFO org.apache.hadoop.hbase.master.BaseScanner: > Deleting region TestTable,0000149568,1277828486199 (encoded=1743937205) > because daughter splits no longer hold references > 2010-06-29 19:22:12,297 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: > DELETING region hdfs:// > uasstse002.ua.sistyma.com:8020/hbase/TestTable/1743937205 > 2010-06-29 19:22:12,310 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_OPEN: TestTable,0000149568,1277828511285 from > uasstse006.ua.sistyma.com.,60020,1277827873580; 1 of 2 > 2010-06-29 19:22:12,310 INFO org.apache.hadoop.hbase.master.ServerManager: > Processing MSG_REPORT_PROCESS_OPEN: .META.,,1 from > uasstse006.ua.sistyma.com.,60020,1277827873580; > 2 of 2 > 2010-06-29 19:22:12,310 DEBUG org.apache.hadoop.hbase.master.HMaster: > Processing todo: PendingOpenOperation from uasstse006.ua.sistyma.com > .,60020,1277827873580 > 2010-06-29 19:22:12,310 DEBUG > org.apache.hadoop.hbase.master.RegionServerOperation: numberOfMetaRegions: > 1, onlineMetaRegions.size(): 1 > 2010-06-29 19:22:12,310 DEBUG > org.apache.hadoop.hbase.master.RegionServerOperation: Requeuing because not > all meta regions are online > > > > On Tue, Jun 29, 2010 at 5:57 PM, Stack <[email protected]> wrote: > >> Is this a testing install? If so remove the hbase dir in hdfs and start >> over. >> >> Else on pe failure what does the master log say? >> >> In 0.20.5 we moved so some more messages show at info level which could >> explain some of the differences you are seeing? >> >> Stack >> >> >> >> On Jun 29, 2010, at 6:21 AM, Stanislaw Kogut <[email protected]> wrote: >> >> > Hi everyone! >> > >> > Has someone noticed same behaviour of hbase-0.20.5 after upgrade from >> > 0.20.3? >> > >> > $hadoop jar hbase/hbase-0.20.5-test.jar sequentialWrite 1 >> > 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client >> > environment:zookeeper.version=3.2.2-888565, built on 12/08/2009 21:51 GMT >> > 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client environment:host.name >> > =se002.cluster.local >> > 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client >> > environment:java.version=1.6.0_20 >> > 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client >> > environment:java.vendor=Sun Microsystems Inc. >> > 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client >> > environment:java.home=/usr/java/jdk1.6.0_20/jre >> > 10/06/29 16:03:21 INFO zookeeper.ZooKeeper: Client >> > >> environment:java.class.path=/opt/hadoop/common/bin/../conf:/usr/java/latest/lib/tools.jar:/opt/hadoop/common/bin/..:/opt/hadoop/common/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/common/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/common/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/common/bin/../lib/commons-el-1.0.jar:/opt/hadoop/common/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/common/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/common/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/common/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/common/bin/../lib/core-3.1.1.jar:/opt/hadoop/common/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/common/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/common/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/common/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/common/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/common/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/common/bin/../lib/junit-3.8.1.jar:/opt/hadoop/common/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/common/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/common/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/common/bin/../lib/oro-2.0.8.jar:/opt/hadoop/common/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/common/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/common/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/common/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/common/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/common/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hadoop/hbase/lib/zookeeper-3.2.2.jar:/opt/hadoop/hbase/conf:/opt/hadoop/hbase/hbase-0.20.5.jar >> > 10/06/29 16:03:22 INFO hbase.PerformanceEvaluation: Table {NAME => >> > 'TestTable', FAMILIES => [{NAME => 'info', COMPRESSION => 'NONE', >> VERSIONS >> > => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', >> > BLOCKCACHE => 'true'}]} created >> > 10/06/29 16:03:22 INFO hbase.PerformanceEvaluation: Start class >> > org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at >> offset >> > 0 for 1048576 rows >> > 10/06/29 16:03:37 INFO hbase.PerformanceEvaluation: 0/104857/1048576 >> > 10/06/29 16:03:52 INFO hbase.PerformanceEvaluation: 0/209714/1048576 >> > 10/06/29 16:04:09 INFO hbase.PerformanceEvaluation: 0/314571/1048576 >> > 10/06/29 16:04:27 INFO hbase.PerformanceEvaluation: 0/419428/1048576 >> > 10/06/29 16:06:06 ERROR hbase.PerformanceEvaluation: Failed >> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to >> contact >> > region server Some server, retryOnlyOne=true, index=0, islastrow=false, >> > tries=9, numtries=10, i=0, listsize=9650, region=TestTable,,1277816601856 >> > for region TestTable,,1277816601856, row '0000511450', but failed after >> 10 >> > attempts. >> > Exceptions: >> > >> > at >> > >> org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1149) >> > at >> > >> org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230) >> > at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:621) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:637) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:889) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation.runNIsOne(PerformanceEvaluation.java:907) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:939) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation.doCommandLine(PerformanceEvaluation.java:1036) >> > at >> > >> org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:1061) >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> > at >> > >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >> > at >> > >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >> > at java.lang.reflect.Method.invoke(Method.java:597) >> > at org.apache.hadoop.util.RunJar.main(RunJar.java:156) >> > >> > Looks like it happens on region splits. >> > >> > Also, some other strange things: >> > 1. After writing something to TestTable, some regionservers log these: >> > 2010-06-29 16:05:06,458 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: >> > .META.,,1 >> > >> > After that, there comes another .META. region on this server in 'status >> > detailed' output. >> > Even more, sometimes it opens not only .META., but also other regions >> from >> > data tables, such as these TestTable from performanceEvaluation. >> > >> > 2. After disabling and removing table its regions still look like >> assigned >> > to regionservers in 'status detailed'. >> > >> > 3. After 3-4 tries to write into TestTable from performanceEvaluation >> there >> > another strange thing: >> > 10/06/29 16:01:59 ERROR hbase.PerformanceEvaluation: Failed >> > org.apache.hadoop.hbase.TableExistsException: >> > org.apache.hadoop.hbase.TableExistsException: TestTable >> > >> > but, table not exists. You cannot disable and drop it, and hbase shells >> not >> > lists it in 'list' output. But you also cannot create it, because it is >> > "exists", and it's regions are assigned to regionservers. Note, nobody >> drops >> > this table. >> > >> > I spent some time to find out why this all happens, trying to play around >> > hadoop cluster versions (first it was cloudera, than "vanilla" 0.20.2), >> but >> > still having this issue. So, I will hope, if someone help to find cause >> for >> > this. >> > >> > -- >> > Regards, >> > Stanislaw Kogut >> > > > > -- > Regards, > Stanislaw Kogut > Sistyma LLC >
