Hi all,

I recently upgraded to HBase 0.20.4. I am now trying to add additional data to my system, and I am getting the following error on my client:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=1, listsize=2, region=doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679 for region doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, row '7e0b8ec68d795612df55144b67e207bdf805d36f', but failed after 10 attempts.
Exceptions:
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1167)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1248)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
    at trinidad.hbase.mapreduce.ingest.ImportWoS$WoSParserMapper.cleanup(ImportWoS.java:192)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

When I look at the region server log, I see errors like:

2010-05-17 17:47:11,685 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Batch puts interrupted at index=0 because:Requested row out of range for HRegion doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679, startKey='7d6442c7951b178a6adc9c149ff13d6ea87feccd', getEndKey()='7ddf19f548f2a75c53a638a4bdc88084f806be4e', row='7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45'

To me, it looks like the table has gaps between the end of one region and the beginning of the next region. E.g., from the list of regions from the doc table:

doc,005bccc8dcd6ae360b359f42438fd1a651c02048,1274141748324 node-03:60030 51561009 005bccc8dcd6ae360b359f42438fd1a651c02048 00d79413bba4fbd869b0b58c3b23ad2b6fc960b4
doc,00d79413bba4fbd869b0b58c3b23ad2b6fc960b4,1274141747257 node-02:60030 494463444 00d79413bba4fbd869b0b58c3b23ad2b6fc960b4 013485105e0d328d465b2607057f92cb5f920011
...
doc,7d6442c7951b178a6adc9c149ff13d6ea87feccd,1274142309679 node-03:60030 1541672177 7d6442c7951b178a6adc9c149ff13d6ea87feccd 7ddf19f548f2a75c53a638a4bdc88084f806be4e
doc,7e7b8dbcec790d28f4154e012226f6d6902a5ac9,1274142333168 node-03:60030 1688440578 7e7b8dbcec790d28f4154e012226f6d6902a5ac9 7ee05fd423269986ceb0dd88b1e4f73de42c5c5e
...

It looks like the first couple of regions are fine, but later regions have gaps. For example, the region that ends at 7ddf19f548f2a75c53a638a4bdc88084f806be4e is followed by one that starts at 7e7b8dbcec790d28f4154e012226f6d6902a5ac9, and the row rejected in the region server log (7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45) sorts between those two keys.

I tried restarting HBase, doing a major compaction, and splitting the regions, none of which fixed the problem. I was thinking of trying to copy the table and seeing if that helped, but I can't seem to run the copy_table.rb script either:

[had...@nz bin]$ /opt/hbase/bin/hbase org.jruby.Main copy_table.rb
file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/core_ext/object.rb:33:in `get_proxy_or_package_under_package': cannot load Java class org.apache.hadoop.hbase.regionserver.HLogEdit (NameError)
    from file:/opt/hbase-0.20.4/lib/jruby-complete-1.2.0.jar!/META-INF/jruby.home/lib/ruby/site_ruby/1.8/builtin/javasupport/java.rb:51:in `method_missing'
    from copy_table.rb:40

Any suggestions?

Thanks,
Dave
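
For anyone who wants to check a region listing for the kind of gap described above, here is a minimal sketch in Python. It is not part of HBase: the function names `find_gaps` and `row_in_gap` are made up for illustration, and it assumes the start and end keys have already been copied out of the region list as plain strings. It also assumes the keys are equal-length lowercase hex strings, so that ordinary string comparison gives the same order as a byte-wise comparison. Each excerpt has to be checked separately, because the "..." in the listing hides regions that would otherwise look like gaps.

```python
from typing import List, Tuple

# A region is represented here only by its (start_key, end_key) pair.
Region = Tuple[str, str]


def find_gaps(regions: List[Region]) -> List[Tuple[str, str]]:
    """Return (end_key, next_start_key) pairs where a region ends
    before the next one (in start-key order) begins."""
    ordered = sorted(regions, key=lambda r: r[0])
    gaps = []
    # Compare each region's end key with the following region's start key.
    for (_, end_key), (next_start, _) in zip(ordered, ordered[1:]):
        if end_key < next_start:
            gaps.append((end_key, next_start))
    return gaps


def row_in_gap(row: str, gaps: List[Tuple[str, str]]) -> bool:
    """True if the row key sorts inside any gap (end key inclusive,
    next start key exclusive), i.e. no listed region would hold it."""
    return any(lo <= row < hi for lo, hi in gaps)


if __name__ == "__main__":
    # Keys copied from the two adjacent regions in the listing above.
    later = [
        ("7d6442c7951b178a6adc9c149ff13d6ea87feccd",
         "7ddf19f548f2a75c53a638a4bdc88084f806be4e"),
        ("7e7b8dbcec790d28f4154e012226f6d6902a5ac9",
         "7ee05fd423269986ceb0dd88b1e4f73de42c5c5e"),
    ]
    gaps = find_gaps(later)
    print(gaps)
    # Row key taken from the region server log message.
    print(row_in_gap("7e3d88c2ed5e2b02fe374333fb5d7502c6c5ff45", gaps))
```

The check only reports places where one region ends before the next begins. It does not say why the regions in between are missing, and it ignores overlapping regions.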