Hey,

The way to fix this is to combine regions with the Merge tool. If your table is small, you could combine all of the regions (pairwise, two at a time).

If your table is too large for that, you can merge the regions that are 'wacky' with adjacent members that are ok. For example:

Region1 A->B
Region2 B->D
Region3 B->C
Region4 C->D
Region5 D->E

In this case, regions 2-4 are "weird". If they were merged you'd end up with 3 regions:

Region1 A->B
RegionNew B->D
Region5 D->E

And all would be ok. In this case you need to do 2 merges (here "A" is the name of the intermediate merged region, not the key A):

merge 2-3 -> A
merge A-4 -> New

This example can be extended to any number of weird regions. Don't worry if the resulting regions are too big; HBase will split them when it opens them.

The merge tool is available like so:

bin/hbase org.apache.hadoop.hbase.util.Merge

It takes the table name and the names of the regions to merge. Be sure to copy those before you take your cluster offline or you might find it hard to find the region names!

Good luck!
-ryan

On Mon, May 17, 2010 at 6:09 PM, Buttler, David <buttl...@llnl.gov> wrote:
> Hi all,
> I recently upgraded to 0.20.4. I am not trying to add additional data to my system, and I am getting the following error on my client
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=1, listsize=2, region=doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679 for region doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, row '7e0b8ec68d795612df55144b67e207bdf805d36f', but failed after 10 attempts.
> Exceptions:
>
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1167)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1248)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
>         at trinidad.hbase.mapreduce.ingest.ImportWoS$WoSParserMapper.cleanup(ImportWoS.java:192)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> When I look at the region server log, I see errors like:
>
> 2010-05-17 17:47:11,685 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, startKey='7d6442c7951b178a6adc9c149ff13d6ea87feccd', getEndKey()='7ddf19f548f2a75c53a638a4bdc88084f806be4e', row='7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45'
>
> To me, it looks like the table has gaps between the end of one region and the beginning of the next region. E.g., from the list of regions from the doc table:
> doc,005bccc8dcd6ae360b359f42438fd1a651c02048,1274141748324  node-03:60030  51561009  005bccc8dcd6ae360b359f42438fd1a651c02048  00d79413bba4fbd869b0b58c3b23ad2b6fc960b4
> doc,00d79413bba4fbd869b0b58c3b23ad2b6fc960b4,1274141747257  node-02:60030  494463444  00d79413bba4fbd869b0b58c3b23ad2b6fc960b4  013485105e0d328d465b2607057f92cb5f920011
> ...
> doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679  node-03:60030  1541672177  7d6442c7951b178a6adc9c149ff13d6ea87feccd  7ddf19f548f2a75c53a638a4bdc88084f806be4e
> doc,7e7b8dbcec790d28f4154e012226f6d6902a5ac9,1274142333168  node-03:60030  1688440578  7e7b8dbcec790d28f4154e012226f6d6902a5ac9  7ee05fd423269986ceb0dd88b1e4f73de42c5c5e
> ...
>
> It looks like the first couple of regions are fine, but later regions have gaps.
>
> I tried restarting hbase, doing a major compaction, and splitting the regions, none of which fixed the problem. I was thinking of trying to copy the table and seeing if that helped, but I can't seem to run the copy_table.rb script either:
> [had...@nz bin]$ /opt/hbase/bin/hbase org.jruby.Main copy_table.rb
> file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/core_ext/object.rb:33:in `get_proxy_or_package_under_package': cannot load Java class org.apache.hadoop.hbase.regionserver.HLogEdit (NameError)
>         from file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/java.rb:51:in `method_missing'
>         from copy_table.rb:40
>
>
> Any suggestions?
>
> Thanks,
> Dave
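
For anyone who wants to check a region list for this kind of problem before deciding what to merge, here is a minimal Python sketch. It is not part of HBase and does not call any HBase API: the function name `find_breaks` and the (name, start key, end key) tuples are made up for illustration, and you would have to copy the keys out of the region listing by hand. It only assumes that keys compare as plain strings, and it checks adjacent regions only, so it flags the first region of a "weird" run rather than every member of it.

```python
def find_breaks(regions):
    """Report adjacent regions whose keys do not line up.

    regions: list of (name, start_key, end_key) tuples (hypothetical format).
    Returns a list of (prev_name, next_name, kind), where kind is "gap" when
    the previous end key is below the next start key, or "overlap" when it
    is above it.
    """
    # Sort by start key; Python's sort is stable, so regions that share a
    # start key keep their input order.
    ordered = sorted(regions, key=lambda r: r[1])
    breaks = []
    # The last region's end key is never compared, so an empty end key on
    # the final region of a table does not cause a false report.
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if end_a < start_b:
            breaks.append((name_a, name_b, "gap"))
        elif end_a > start_b:
            breaks.append((name_a, name_b, "overlap"))
    return breaks


# The five-region example from the reply: Region2 runs past Region3's start.
example = [
    ("Region1", "A", "B"),
    ("Region2", "B", "D"),
    ("Region3", "B", "C"),
    ("Region4", "C", "D"),
    ("Region5", "D", "E"),
]
for prev_name, next_name, kind in find_breaks(example):
    print(kind, "between", prev_name, "and", next_name)
```

Any pair it reports marks where a run of regions needs to be merged pairwise with the Merge tool, as described in the reply.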