Joe,

We'll need to find out what happened to that region; they usually
don't throw up after a few inserts ;)

So in that region server's log, before you tried disabling that table,
do you see anything wrong (exceptions, most likely)? If you have a web
server, it would be nice to drop the full RS log and the master log
there so we can take a look.
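
If grepping by hand is tedious, here is a rough Python sketch of the
kind of scan I mean. It is only an illustration: the keyword list and
the function name suspicious_lines are my own assumptions, not anything
shipped with HBase, and it assumes every log entry starts with a
"YYYY-MM-DD HH:MM:SS" timestamp like the excerpts in this thread.

```python
import re

# Leading timestamp as it appears in the logs quoted in this thread.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")
# Assumed keywords worth a look; adjust to taste.
SUSPICIOUS = re.compile(r"Exception|ERROR|FATAL")


def suspicious_lines(lines, cutoff=None):
    """Yield (line_number, text) for lines that look like trouble.

    cutoff is an optional "YYYY-MM-DD HH:MM:SS" string. Entries stamped
    later than the cutoff are skipped, along with any untimestamped
    continuation lines (stack traces) that follow them.
    """
    past_cutoff = False
    for number, line in enumerate(lines, start=1):
        stamp = TIMESTAMP.match(line)
        if stamp and cutoff is not None:
            # Timestamps in this format sort correctly as plain strings.
            past_cutoff = stamp.group(0) > cutoff
        if past_cutoff:
            continue
        if SUSPICIOUS.search(line):
            yield number, line.rstrip("\n")


def scan_file(path, cutoff=None):
    """Print every suspicious line in the log file at path."""
    with open(path, errors="replace") as log:
        for number, text in suspicious_lines(log, cutoff):
            print(f"{number}: {text}")
```

Calling scan_file on the region server log with a cutoff of
"2010-03-10 20:45:03" (when the master first started processing the
table change) would show only what went wrong before the disable.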

thx!

J-D

On Wed, Mar 10, 2010 at 5:54 PM, Joe Pepersack <j...@pepersack.net> wrote:
> On 03/10/2010 07:58 PM, Jean-Daniel Cryans wrote:
>>
>> Which HBase version? What's your hardware like? How much data were you
>> inserting? Did you grep the region server logs for any IOException or
>> such? Can we see an excerpt of those logs around the time of the "lock
>> up"?
>>
>
> Version: 0.20.3-1.cloudera
> Hardware: dual Xeon 4 core, 16G, 1.7T disk
> 10x nodes: 1 master, 1 secondary master, 8x regionservers.  2x zookeepers
> running on regionservers
>
>
> It appears to have died after only a few rows were inserted.   There's only
> one region shown on the status page.  Curiously, that region does NOT show
> up in the list of online regions for the listed regionserver.
>
> Master log, from the point where I ran "drop 'Person'" in the shell:
>
> 2010-03-10 20:44:44,812 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scanning meta region {server: 10.40.0.37:60020,
> regionname: -ROOT-,,0, startKey:<>}
> 2010-03-10 20:44:44,815 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.rootScanner scan of 1 row(s) of meta region {server:
> 10.40.0.37:60020, regionname: -ROOT-,,0, startKey:<>} complete
> 2010-03-10 20:44:44,836 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.metaScanner scanning meta region {server: 10.40.0.36:60020,
> regionname: .META.,,1, startKey:<>}
> 2010-03-10 20:44:44,844 INFO org.apache.hadoop.hbase.master.BaseScanner:
> RegionManager.metaScanner scan of 3 row(s) of meta region {server:
> 10.40.0.36:60020, regionname: .META.,,1, startKey:<>} complete
> 2010-03-10 20:44:44,844 INFO org.apache.hadoop.hbase.master.BaseScanner: All
> 1 .META. region(s) scanned
> 2010-03-10 20:44:45,357 INFO org.apache.hadoop.hbase.master.ServerManager: 5
> region servers, 0 dead, average load 1.2
> 2010-03-10 20:45:03,209 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
> 2010-03-10 20:45:03,209 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
> currently being served
> 2010-03-10 20:45:03,210 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Adding region
> Person,,1268251509658 to setClosing list
> 2010-03-10 20:45:04,260 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
> 2010-03-10 20:45:04,260 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
> currently being served
> 2010-03-10 20:45:04,260 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Adding region
> Person,,1268251509658 to setClosing list
> 2010-03-10 20:45:05,273 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
> 2010-03-10 20:45:05,273 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
> currently being served
> 2010-03-10 20:45:05,273 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Adding region
> Person,,1268251509658 to setClosing list
> 2010-03-10 20:45:06,287 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
> 2010-03-10 20:45:06,287 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
> currently being served
> 2010-03-10 20:45:06,287 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Adding region
> Person,,1268251509658 to setClosing list
> 2010-03-10 20:45:08,301 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing unserved regions
> 2010-03-10 20:45:08,301 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Processing regions
> currently being served
> 2010-03-10 20:45:08,301 DEBUG
> org.apache.hadoop.hbase.master.ChangeTableState: Adding region
> Person,,1268251509658 to setClosing list
>
>
> Log from the region server where the region is supposed to be for the same
> time frame:
>
> 2010-03-10 20:43:50,889 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes:
> Total=1.6213074MB (1700064), Free=195.8787MB (205393696), Max=197.5MB
> (207093760), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0,
> Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
> 2010-03-10 20:44:50,889 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes:
> Total=1.6213074MB (1700064), Free=195.8787MB (205393696), Max=197.5MB
> (207093760), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0,
> Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
> 2010-03-10 20:45:04,058 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:04,059 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:05,062 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:05,063 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:06,066 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:06,067 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:07,070 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:07,071 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:09,079 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:09,079 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:11,088 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:11,088 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:15,104 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE:
> Person,,1268251509658
> 2010-03-10 20:45:15,105 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Worker:
> MSG_REGION_CLOSE: Person,,1268251509658
> 2010-03-10 20:45:50,889 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes:
> Total=1.6213074MB (1700064), Free=195.8787MB (205393696), Max=197.5MB
> (207093760), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0,
> Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
>
>
>
>
