I'm running it here, but I just remembered about this issue: "HTable.ClientScanner needs to clone the Scan object" https://issues.apache.org/jira/browse/HBASE-4891
And since you are reusing that Scan object, you could definitely hit this issue.

J-D

On Tue, May 29, 2012 at 11:37 PM, Ondřej Stašek <[email protected]> wrote:
> Here it is:
>
> http://pastebin.com/0AgsQjur
>
>
> On 29.5.2012 22:44, Jean-Daniel Cryans wrote:
>>
>> Care to share that TestPutScan? Just attach it in a pastebin
>>
>> Thx,
>>
>> J-D
>>
>> On Tue, May 29, 2012 at 6:13 AM, Ondřej Stašek
>> <[email protected]> wrote:
>>>
>>> My program writes changes to an HBase table by issuing lots of Puts
>>> (autoCommit turned off, flush at the end) and afterwards uses a
>>> ResultScanner on the whole table to read all rows and act upon them.
>>> My problem is that on several occasions the scan does not return the
>>> expected rows. Either the scan does not start at the beginning of the
>>> table, or somewhere during the scan I get old data (not the data
>>> written by the preceding Puts).
>>>
>>> I have even written a simple test application to simulate this behavior:
>>> 1. write 1M simple numbered rows to a table
>>> 2. scan through table to test output, delete every 10th row
>>> 3. scan again after delete
>>> 4. repeat until error found
>>>
>>> Sample output:
>>>
>>> 12/05/29 00:32:12 INFO hbase.TestPutScan: Run 342 put 1000000 rows
>>> 12/05/29 00:32:35 INFO hbase.TestPutScan: Run 342 scan + del every 10th row
>>> 12/05/29 00:33:29 INFO hbase.TestPutScan: Run 342 scan
>>> 12/05/29 00:33:29 ERROR hbase.TestPutScan: Expected value: value 0000001 0000342, got: value 0281999 0000342
>>>
>>> This means that the program expected to get the first row, but got
>>> the 281999th.
>>>
>>> This test ran on a "minicluster" of 2 regionservers running Cloudera's
>>> cdh3u4 distribution.
>>>
>>> Today I got 3 errors like that, and from the RS's log it seems that at
>>> the same time the hbase balancer issued a reassign command for this
>>> table's region (the table has only 1 region).
>>>
>>> Any pointers on what to check or what to send you to help resolve this
>>> issue?
>>>
>>> Regards
>>>
>>> Ondrej Stasek
>>>
>
>
> --
> Ondřej Stašek
> Programátor senior
> Seznam.cz, a.s.
> Nádražní 159/21
> 370 01 České Budějovice 6
>
> tel.: +420 386 325 467
> gsm: +420 603 857 602
> icq: 164660005
> [email protected]
> http://www.seznam.cz
>
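
For anyone reading this thread later, here is a rough, self-contained sketch of why a reused scan descriptor can make a later scan start in the middle of a table. It does not use the HBase client at all: ScanSpec and ToyScanner are hypothetical stand-ins invented for illustration, and the "reopen" step only assumes (as the HBASE-4891 title suggests) that the client scanner modifies the Scan it was handed rather than a private copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Demo driver: compares reusing one spec against copying it per scan.
class ScanReuseDemo {
    // Builds a tiny table with zero-padded, numbered row keys.
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 10; i++) {
            table.put(String.format("%07d", i), "value " + i);
        }
        return table;
    }

    // Runs two scans and returns the first row key seen by the second one.
    static String firstRowOfSecondScan(boolean copyPerScan) {
        TreeMap<String, String> table = sampleTable();
        ScanSpec shared = new ScanSpec("");
        ToyScanner.scan(table, copyPerScan ? new ScanSpec(shared) : shared, 4);
        return ToyScanner.scan(table, copyPerScan ? new ScanSpec(shared) : shared, 4).get(0);
    }

    public static void main(String[] args) {
        System.out.println("reused spec:   " + firstRowOfSecondScan(false));
        System.out.println("copy per scan: " + firstRowOfSecondScan(true));
    }
}

// Hypothetical stand-in for a scan descriptor; only the start row matters here.
class ScanSpec {
    String startRow;

    ScanSpec(String startRow) { this.startRow = startRow; }

    // Copy constructor, used for the defensive copy.
    ScanSpec(ScanSpec other) { this.startRow = other.startRow; }
}

// Hypothetical scanner that mutates the caller's spec when it "reopens",
// e.g. after the region it was reading from has been reassigned.
class ToyScanner {
    static List<String> scan(TreeMap<String, String> table, ScanSpec spec, int reopenAfter) {
        List<String> rows = new ArrayList<>();
        int seen = 0;
        for (String key : table.tailMap(spec.startRow, true).keySet()) {
            rows.add(key);
            seen++;
            if (seen == reopenAfter) {
                // Remember where to resume by overwriting the caller's start row.
                spec.startRow = key;
            }
        }
        return rows;
    }
}
```

With the shared spec, the second scan begins at the row where the first one "reopened" instead of at the first row, which resembles the symptom in the sample output above. With the real client, the analogous workaround would presumably be to build a fresh Scan (or copy one with `new Scan(existingScan)`) for every `getScanner` call rather than reusing a single instance.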
