There you go:

12/05/30 18:54:17 DEBUG client.MetaScanner: Scanning .META. starting at row=testtable,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@f593af
12/05/30 18:54:17 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for testtable,test_row_0496107,1338404055995.e9c7a4ca97eb2be372445af4d3772031. is sv4r25s44:62023
12/05/30 18:54:17 DEBUG client.HConnectionManager$HConnectionImplementation: Removed testtable,,1338404055995.9389fe5538f19a6f2df27e3958dcb434. for tableName=testtable from cache because of test_row_0012550
12/05/30 18:54:17 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for testtable,,1338404055995.9389fe5538f19a6f2df27e3958dcb434. is sv4r25s44:62023
12/05/30 18:57:47 INFO hbase.TestPutScan: Run 5 scan
12/05/30 18:57:47 ERROR hbase.TestPutScan: Expected value: value 0000001 0000005, got: value 0496107 0000005
That's a split, so the ClientScanner did a reset on the start row.

So I'm going to fix your code and see if I can get anything else.

J-D

On Wed, May 30, 2012 at 11:56 AM, Jean-Daniel Cryans <[email protected]> wrote:
> I'm running it here, but I just remembered about this issue:
>
> "HTable.ClientScanner needs to clone the Scan object"
> https://issues.apache.org/jira/browse/HBASE-4891
>
> And since you are reusing that Scan object, you could definitely hit this
> issue.
>
> J-D
>
> On Tue, May 29, 2012 at 11:37 PM, Ondřej Stašek
> <[email protected]> wrote:
>> Here it is:
>>
>> http://pastebin.com/0AgsQjur
>>
>>
>> On 29.5.2012 22:44, Jean-Daniel Cryans wrote:
>>>
>>> Care to share that TestPutScan? Just attach it in a pastebin
>>>
>>> Thx,
>>>
>>> J-D
>>>
>>> On Tue, May 29, 2012 at 6:13 AM, Ondřej Stašek
>>> <[email protected]> wrote:
>>>>
>>>> My program writes changes to an HBase table by issuing lots of Puts
>>>> (autoCommit turned off, flush at the end) and afterwards uses a
>>>> ResultScanner on the whole table to read all rows and act upon them.
>>>> My problem is that on several occasions the scan does not return the
>>>> expected rows. Either the scan does not start at the beginning of the
>>>> table, or somewhere during the scan I get old data (not the data
>>>> written by the preceding Puts).
>>>>
>>>> I have even written a simple test application to simulate this behavior:
>>>> 1. write 1M simple numbered rows to a table
>>>> 2. scan through table to test output, delete every 10th row
>>>> 3. scan again after delete
>>>> 4. repeat until error found
>>>>
>>>> Sample output:
>>>>
>>>> 12/05/29 00:32:12 INFO hbase.TestPutScan: Run 342 put 1000000 rows
>>>> 12/05/29 00:32:35 INFO hbase.TestPutScan: Run 342 scan + del every 10th row
>>>> 12/05/29 00:33:29 INFO hbase.TestPutScan: Run 342 scan
>>>> 12/05/29 00:33:29 ERROR hbase.TestPutScan: Expected value: value 0000001 0000342, got: value 0281999 0000342
>>>>
>>>> This means that the program expected to get the first row, but got the
>>>> 281999th.
>>>>
>>>> This test ran on a "minicluster" of 2 regionservers running Cloudera's
>>>> cdh3u4 distribution.
>>>>
>>>> Today I got 3 errors like that, and from the RS's log it seems that at
>>>> the same time the HBase balancer issued a reassign command for this
>>>> table's region (the table has only 1 region).
>>>>
>>>> Any pointers on what to check or what to send you to help resolve this
>>>> issue?
>>>>
>>>> Regards
>>>>
>>>> Ondrej Stasek
>>>>
>>
>>
>> --
>> Ondřej Stašek
>> Senior Programmer
>> Seznam.cz, a.s.
>> Nádražní 159/21
>> 370 01 České Budějovice 6
>>
>> tel.: +420 386 325 467
>> gsm: +420 603 857 602
>> icq: 164660005
>> [email protected]
>> http://www.seznam.cz
>>
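
To make the Scan-reuse problem from HBASE-4891 concrete, here is a minimal sketch in plain Java, with no HBase dependency. It assumes, as the JIRA title and the log above suggest, that the client scanner writes its resume position into the Scan object it was handed when a region moves or splits. FakeScan, scan() and table() are hypothetical stand-ins invented for this illustration; they are not HBase classes or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

final class ScanReuseSketch {

    /** Hypothetical stand-in for a scan request: only a mutable start row. */
    static final class FakeScan {
        String startRow;
        FakeScan(String startRow) { this.startRow = startRow; }
        FakeScan copy() { return new FakeScan(startRow); }
    }

    /** Builds a sorted in-memory "table" of n numbered row keys. */
    static TreeSet<String> table(int n) {
        TreeSet<String> rows = new TreeSet<>();
        for (int i = 1; i <= n; i++) {
            rows.add(String.format(Locale.ROOT, "test_row_%07d", i));
        }
        return rows;
    }

    /**
     * Hypothetical scanner. After relocateAfter rows it simulates a region
     * move by writing the resume position into the scan object it was given.
     * That write is the assumed side effect, not confirmed HBase behavior.
     */
    static List<String> scan(TreeSet<String> table, FakeScan scan, int relocateAfter) {
        List<String> out = new ArrayList<>();
        for (String row : table.tailSet(scan.startRow, true)) {
            out.add(row);
            if (out.size() == relocateAfter) {
                scan.startRow = row; // side effect on the caller's object
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> table = table(10);

        // Reused object: the second scan silently starts mid-table.
        FakeScan shared = new FakeScan("");
        scan(table, shared, 4);
        System.out.println("reused: " + scan(table, shared, 4).get(0));

        // Fresh copy per scan: the caller's object keeps its start row.
        FakeScan template = new FakeScan("");
        scan(table, template.copy(), 4);
        System.out.println("copied: " + scan(table, template.copy(), 4).get(0));
    }
}
```

In this sketch the reused object makes the second scan start at test_row_0000004 instead of test_row_0000001, which mirrors the "expected first row, got a later row" error in the test output. Handing each scan its own copy avoids that. Whether the real client behaves exactly this way depends on the HBase version in use.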
