Hi, I reinstalled HBase 0.20.0-rc2 yesterday on my 5-node cluster and reimported some data into it.
My data is imported via an MR job. The Mapper reads SequenceFiles, generates a new key for each value (unique across values and deterministic), and outputs the new K,V. The Reducer reads those records and inserts each V into an HTable, using K as the row key.

The import MR job completes, showing 866,587,147 Map Input Records, Map Output Records and Reduce Input Records. Each Reducer reports the number of records it inserted into the HTable, and the total across all 10 reducers comes to the same value of 866,587,147 (which is indeed how many records I have).

Several Reducer attempts failed with the following type of error:

org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=false, tries=9, numtries=10, i=4856, listsize=13108, location=address: 10.154.99.180:60020, regioninfo: REGION => {NAME => 'foo,,1250702379325', STARTKEY => '', ENDKEY => '00AZRPXCWSF8W\xBEO\x7F\xFF\xFF\xFA', ENCODED => 9856138, TABLE => {{NAME => 'domirama', FAMILIES => [{NAME => 'copy', VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, region=domirama,,1250702379325 for region domirama,,1250702379325, row '00AZRPXCLZM7\x5E\xA0\xDF\x7F\xFF\xFF\xEE', but failed after 10 attempts.
Exceptions:
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1041)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:582)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:448)
    at domirama.mapreduce.MR0004$Reducer.reduce(MR0004.java:235)
    at domirama.mapreduce.MR0004$Reducer.reduce(MR0004.java:151)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:543)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:410)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

I then ran another MR job that counts the rows in the table, and that job found only 866,166,470 records, which is 420,677 fewer than the 866,587,147 reported as inserted! There are a few errors in the regionserver logs (failed compactions, or compacted files that could not be moved), but none related to the regions mentioned in the above errors.

I already encountered a similar issue with rc1 and previously with trunk, so I guess there is still something in rc2 that makes my use case fail.

Mathias.