My program writes changes to an HBase table by issuing many Puts (autoFlush turned off, flushing at the end) and afterwards uses a ResultScanner over the whole table to read all rows and act on them. My problem is that on several occasions the scan did not return the expected rows: either the scan did not start at the beginning of the table, or somewhere during the scan I got old data (not the values written by the preceding Puts).

I have even written a simple test application to reproduce this behavior:
1. write 1M simple numbered rows to a table
2. scan through the table to check the output, deleting every 10th row
3. scan again after the delete
4. repeat until an error is found
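
To make the checks in steps 1 to 3 concrete, here is a minimal sketch of the verification logic. It is not the actual TestPutScan code: an in-memory TreeMap stands in for the HBase table so that the sketch runs without a cluster, and the class name PutScanCheck, the method names, and the row key format "row%07d" are all made up for illustration. Only the value layout ("value <row> <run>", both zero-padded to 7 digits) is inferred from the sample output.

```java
import java.util.Iterator;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class PutScanCheck {
    // Value layout inferred from the log line "value 0000001 0000342".
    public static String expectedValue(int row, int run) {
        return String.format(Locale.ROOT, "value %07d %07d", row, run);
    }

    // Hypothetical row key; zero padding keeps keys in numeric order.
    public static String rowKey(int row) {
        return String.format(Locale.ROOT, "row%07d", row);
    }

    // Step 1: write numbered rows (the TreeMap is a stand-in for the table).
    public static void putRows(TreeMap<String, String> table, int rows, int run) {
        for (int i = 1; i <= rows; i++) {
            table.put(rowKey(i), expectedValue(i, run));
        }
    }

    // Step 2: scan in key order, verify each value, delete every 10th row.
    // Returns null on success or a description of the first mismatch.
    public static String scanAndDelete(TreeMap<String, String> table, int rows, int run) {
        int row = 0;
        Iterator<Map.Entry<String, String>> it = table.entrySet().iterator();
        while (it.hasNext()) {
            String got = it.next().getValue();
            String want = expectedValue(++row, run);
            if (!want.equals(got)) {
                return "Expected value: " + want + ", got: " + got;
            }
            if (row % 10 == 0) {
                it.remove();
            }
        }
        return row == rows ? null : "Expected " + rows + " rows, got: " + row;
    }

    // Step 3: scan again; every 10th row must now be missing.
    public static String scanAfterDelete(Iterable<String> values, int rows, int run) {
        int row = 0;
        int seen = 0;
        for (String got : values) {
            row++;
            if (row % 10 == 0) {
                row++; // this row was deleted in step 2, so skip its number
            }
            String want = expectedValue(row, run);
            if (!want.equals(got)) {
                return "Expected value: " + want + ", got: " + got;
            }
            seen++;
        }
        int remaining = rows - rows / 10;
        return seen == remaining ? null : "Expected " + remaining + " rows, got: " + seen;
    }

    public static void main(String[] args) {
        int rows = 1000;
        int run = 342;
        TreeMap<String, String> table = new TreeMap<>();
        putRows(table, rows, run);
        System.out.println("scan + del: " + scanAndDelete(table, rows, run));
        System.out.println("scan: " + scanAfterDelete(table.values(), rows, run));
    }
}
```

In the real test the values come from a ResultScanner rather than from a map, but the comparison is the same: a scan that starts somewhere other than the first row fails on the very first value, which is the shape of the error shown in the sample output.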

Sample output:

12/05/29 00:32:12 INFO hbase.TestPutScan: Run 342 put 1000000 rows
12/05/29 00:32:35 INFO hbase.TestPutScan: Run 342 scan + del every 10th row
12/05/29 00:33:29 INFO hbase.TestPutScan: Run 342 scan
12/05/29 00:33:29 ERROR hbase.TestPutScan: Expected value: value 0000001 0000342, got: value 0281999 0000342

This means that the program expected to get the first row but got the 281999th instead.

This test ran on a "minicluster" of 2 region servers running Cloudera's CDH3u4 distribution.

Today I got 3 errors like that, and from the region server's log it seems that at the same time the HBase balancer issued a reassign command for this table's region (the table has only 1 region).

Any pointers on what to check, or on what I should send you to help resolve this issue?

Regards

Ondrej Stasek
