I'm running hbase and hadoop-0.17 trunk code as of earlier today (without HBASE-10). I'm loading 50m records into a table with ~800,000 rows and only one column family. This is a 3-node DFS with 3 region servers, and I load the data from one of those three boxes. Once in a while, I get a NotServingRegionException. The client code looks like:
    BatchUpdate bu = new BatchUpdate(row);
    bu.put(...);
    table.commit(bu);

When I examine the region server's log, it shows something like:

08/04/18 01:51:14 open the region in question
08/04/18 01:51:15 region available
08/04/18 01:51:15 starting compaction
08/04/18 01:51:22 region closed
08/04/18 01:51:41 NotServingRegionException
08/04/18 01:51:47 compaction done
08/04/18 01:51:51 NotServingRegionException
08/04/18 01:52:01 NotServingRegionException
08/04/18 01:52:11 NotServingRegionException
08/04/18 01:52:21 NotServingRegionException
08/04/18 01:52:47 open the region in question
08/04/18 01:52:47 region available

The master log somehow got truncated, but IIRC the master tried to assign the region to this region server somewhere between 01:51:22 and 01:51:41. From my understanding, this region server was a little busy, so it did not accept the assignment from the master. I'm wondering if this is caused by the region server being too busy (the request rate on each region server is about 1000 requests/sec), and if so, which configuration variables should I tune? In addition, what are the best practices for a Java client to deal with such exceptions? (NotServingRegionException should be common on a very busy HBase instance, I think.)

BTW, I was getting lots of different strange failures when doing the same thing on hadoop-0.16.X and hbase-0.1.X. After switching to hbase trunk, the error above is the only one I get; it seems there are no more mysterious exceptions :-D

Thanks,
Rong-En Fan
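P.S. To make the client-side question concrete, below is a stripped-down sketch of the kind of retry wrapper I'm thinking about putting around the commit. It is only a sketch, not my real loader: the column name ("data:value"), retry count, and pause are made up, the package names and the String row/column types are written from memory of the current trunk API and may need adjusting, and I'm not sure whether HTable already does an equivalent retry internally.

    import java.io.IOException;

    import org.apache.hadoop.hbase.NotServingRegionException;
    import org.apache.hadoop.hbase.client.HTable;   // may be o.a.h.hbase.HTable on older revisions
    import org.apache.hadoop.hbase.io.BatchUpdate;

    // Hedged sketch: retry a single BatchUpdate commit a few times when the
    // region is temporarily unavailable (e.g. closed for compaction/split and
    // not yet reassigned by the master).
    public class RetryingLoader {

      private static final int MAX_RETRIES = 5;    // made-up value
      private static final long PAUSE_MS = 2000;   // made-up value

      public static void commitWithRetry(HTable table, String row, byte[] value)
          throws IOException, InterruptedException {
        for (int attempt = 0; ; attempt++) {
          BatchUpdate bu = new BatchUpdate(row);   // row/column types depend on trunk revision
          bu.put("data:value", value);             // hypothetical column
          try {
            table.commit(bu);
            return;                                // committed, done
          } catch (NotServingRegionException e) {
            // Region is closed or moving; give the master time to reassign
            // it, then retry the same update, up to MAX_RETRIES times.
            if (attempt >= MAX_RETRIES) {
              throw e;
            }
            Thread.sleep(PAUSE_MS);
          }
        }
      }
    }

The idea is just to sleep long enough for the master to reassign the region after the compaction-triggered close I see in the log above, and then try the same BatchUpdate again.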
