I'm unit testing some of our internal framework that interacts with hbase
(version cdh3b3 => 0.89.20100924+28). The unit test is small inserting and
updating ~100 rows. My unit test expects the hbase table to be empty when
starting. So I go into hbase shell table disable the table, drop it, and
then recreate it. However hbase is having problems when I try to disable
the table "surveySession". In the region server holding the region I see
the following in the log:
2010-11-11 16:44:39,387 INFO org.apache.hadoop.hbase.
regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:39,387 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:39,387 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Closing
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.: disabling
compactions& flushes
2010-11-11 16:44:39,387 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:39,387 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for
region surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07..
Current region memstore size 242.7k
2010-11-11 16:44:39,388 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting,
commencing flushing stores
2010-11-11 16:44:39,426 INFO
com.hadoop.compression.lzo.GPLNativeCodeLoader: Loaded native gpl library
2010-11-11 16:44:39,454 INFO com.hadoop.compression.lzo.LzoCodec:
Successfully loaded& initialized native-lzo library [hadoop-lzo rev
6a0ca93ae7b5070f61cca4be2d96a29d184a48ef]
2010-11-11 16:44:39,456 INFO org.apache.hadoop.io.compress.CodecPool: Got
brand-new compressor
2010-11-11 16:44:39,478 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.DroppedSnapshotException: region:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:39,478 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:40,391 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:40,391 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:41,395 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:41,395 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:42,399 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:42,399 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:44,408 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:44,408 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:46,416 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:46,416 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:50,436 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:50,436 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:54,454 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:44:54,454 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:45:02,487 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:45:02,488 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:45:18,557 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
2010-11-11 16:45:18,557 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
MSG_REGION_CLOSE:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
The only potential problem I'm seeing is the WARN:
2010-11-11 16:44:39,478 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.DroppedSnapshotException: region:
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07.
Then if I run a hbase hbck I get:
ERROR: Region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. is not served
by any region server but is listed in META to be on server
ds2.internal:60020
The region server on port 60020 says that it is not servering the region.
At this point the table is "stuck." I can't disable it, I can't insert
any records into it (RegionNotServingException), nor do anything with it
really.
I can "fix" the problem by shutting down the region server forcing the
master reassign all the regions then disable it on the new region server. It
does seem to be somewhat time dependent because I have to try to disable the
table after insert the records. Waiting a couple of hours then doing it
there seems to be no problem.
Master Log entries at the time of disabling:
2010-11-11 16:44:38,650 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:38,650 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:38,651 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:39,669 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:39,669 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:39,670 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:40,674 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:40,675 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:40,675 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:41,689 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:41,689 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:41,689 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:43,704 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:43,704 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:43,704 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:45,718 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:45,718 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:45,719 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:49,733 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:49,733 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:49,733 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:53,747 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:44:53,747 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:44:53,747 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:44:54,157 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.metaScanner scanning meta region {server: 192.168.1.168:60020
<http://192.168.1.168:60020/>, regionname: .META.,,1.1028785192, startKey:
<>}
2010-11-11 16:44:54,167 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.metaScanner scan of 21 row(s) of meta region {server:
192.168.1.168:60020<http://192.168.1.168:60020/>, regionname:
.META.,,1.1028785192, startKey:<>} complete
2010-11-11 16:44:54,167 INFO org.apache.hadoop.hbase.master.BaseScanner:
All 1 .META. region(s) scanned
2010-11-11 16:45:01,762 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:45:01,762 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:45:01,762 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:45:06,616 INFO org.apache.hadoop.hbase.master.ServerManager:
4 region servers, 0 dead, average load 5.5
2010-11-11 16:45:13,556 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit
for row<> in tableName surveySession: location server ds1.internal:60020,
location region name
surveySession,,1289352541018.ecd63368b7933af078b9a714b715d6d4.
2010-11-11 16:45:13,557 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit
for row<> in tableName .META.: location server ds2.internal:60020, location
region name .META.,,1.1028785192
2010-11-11 16:45:13,564 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$TableServers:
Rowscanned=1, rowsOffline=0
2010-11-11 16:45:16,110 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit
for row<> in tableName surveySession: location server ds1.internal:60020,
location region name
surveySession,,1289352541018.ecd63368b7933af078b9a714b715d6d4.
2010-11-11 16:45:16,111 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit
for row<> in tableName .META.: location server ds2.internal:60020, location
region name .META.,,1.1028785192
2010-11-11 16:45:16,117 DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$TableServers:
Rowscanned=1, rowsOffline=0
2010-11-11 16:45:17,780 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
2010-11-11 16:45:17,780 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
currently being served
2010-11-11 16:45:17,780 DEBUG
org.apache.hadoop.hbase.master.ChangeTableState: Adding region
surveySession,,1289518197042.29b693fa3ef2edc940fd979ba9ce2c07. to setClosing
list
2010-11-11 16:45:30,993 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.rootScanner scanning meta region {server: 192.168.1.169:60020
<http://192.168.1.169:60020/>, regionname: -ROOT-,,0.70236052, startKey:<>}
2010-11-11 16:45:30,996 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.rootScanner scan of 1 row(s) of meta region {server:
192.168.1.169:60020<http://192.168.1.169:60020/>, regionname:
-ROOT-,,0.70236052, startKey:<>} complete
2010-11-11 16:45:54,160 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.metaScanner scanning meta region {server: 192.168.1.168:60020
<http://192.168.1.168:60020/>, regionname: .META.,,1.1028785192, startKey:
<>}
2010-11-11 16:45:54,172 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.metaScanner scan of 21 row(s) of meta region {server:
192.168.1.168:60020<http://192.168.1.168:60020/>, regionname:
.META.,,1.1028785192, startKey:<>} complete
2010-11-11 16:45:54,172 INFO org.apache.hadoop.hbase.master.BaseScanner:
All 1 .META. region(s) scanned
2010-11-11 16:46:06,623 INFO org.apache.hadoop.hbase.master.ServerManager:
4 region servers, 0 dead, average load 5.5
~Jeff
--
Jeff Whiting
Qualtrics Senior Software Engineer
[email protected]<mailto:[email protected]>