Hi Tomaz,

What does "hbase hbck" report? Maybe you have a broken region of sorts?
-Todd

On Tue, Dec 20, 2011 at 9:45 AM, Tomaz Logar <[email protected]> wrote:
>
> Hello, everybody.
>
> I hit a strange snag in HBase today. I have a table with 48 regions spread
> over 8 regionservers. It grows by about one region per day. It's like 6M
> small (30-100 bytes each) records at the moment, 3.2G of Snappy-encoded data
> on disks.
>
> What happened is that suddenly I can't scan over any previously inserted
> data in just one table. Freshly put data seems to be ok:
>
> ---
> hbase(main):035:0> put 'table', "\x00TEST", "*:t", "TEST"
> 0 row(s) in 0.0300 seconds
>
> hbase(main):041:0* scan 'table', {STARTROW=>"\x00TEST", LIMIT=>2}
> ROW          COLUMN+CELL
>  \x00TEST    column=*:t, timestamp=1324392041600, value=TEST
> ERROR: java.lang.RuntimeException:
> org.apache.hadoop.hbase.regionserver.LeaseException:
> org.apache.hadoop.hbase.regionserver.LeaseException: lease
> '-1785731371547934030' does not exist
> ---
>
> So scan gets the record I put just before, but times out on old record that
> comes right after it. :(
>
> If I target an old record I don't even get an exception, just a huge
> timeout, no exception in regionserver log either:
> ---
> hbase(main):049:0> scan 'table', {STARTROW=>"0ua", LIMIT=>1}
> ROW          COLUMN+CELL
> 0 row(s) in 146.2210 seconds
> ---
>
> It may be relevant that I'm getting these on another, much bigger (3T
> Snappy, 7+B records), yet working table:
> ---
> 11/12/20 17:50:37 WARN ipc.HBaseServer: IPC Server Responder, call
> next(-15185895745499515, 1) from 192.168.32.192:64307: output error
> 11/12/20 17:50:37 WARN ipc.HBaseServer: IPC Server handler 5 on 60020
> caught: java.nio.channels.ClosedChannelException
> 11/12/20 17:32:43 WARN snappy.LoadSnappy: Snappy native library is available
> ---
> But these scans seem to recover while map-reducing.
>
> I'm running hbase-0.90.4-cdh3u2 from Cloudera SCM bundle on mixed nodes (5 *
> 2 core 4G RAM, 3 * 12 core 16G RAM) with 1.5G RAM allocated for each HBase
> regionserver.
>
>
> Can anyone share some wisdom? Anyone got a similar half-broken problem
> solved before?
>
>
> Thanks,
>
> T.
>
>

--
Todd Lipcon
Software Engineer, Cloudera
