Hello J-D,
Thanks for the reply. I've modified my code to use scanner copies,
table.getScanner(new Scan(scan)), and ran it again. Even after that I got
an error:
12/05/31 10:42:39 INFO hbase.TestPutScan: Run 5 put 1000000 rows
12/05/31 10:44:09 INFO hbase.TestPutScan: Run 5 scan + del every 10th row
12/05/31 10:44:33 ERROR hbase.TestPutScan: Expected value: value 0402040 0000005, got: value 0402041 0000004
It seems that one row was skipped during the scan. Strange.
I'll keep testing.
Ondrej Stasek
On 30.5.2012 21:05, Jean-Daniel Cryans wrote:
There you go:
12/05/30 18:54:17 DEBUG client.MetaScanner: Scanning .META. starting
at row=testtable,,00000000000000 for max=10 rows using
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@f593af
12/05/30 18:54:17 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location
for testtable,test_row_0496107,1338404055995.e9c7a4ca97eb2be372445af4d3772031.
is sv4r25s44:62023
12/05/30 18:54:17 DEBUG
client.HConnectionManager$HConnectionImplementation: Removed
testtable,,1338404055995.9389fe5538f19a6f2df27e3958dcb434. for
tableName=testtable from cache because of test_row_0012550
12/05/30 18:54:17 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location
for testtable,,1338404055995.9389fe5538f19a6f2df27e3958dcb434. is
sv4r25s44:62023
12/05/30 18:57:47 INFO hbase.TestPutScan: Run 5 scan
12/05/30 18:57:47 ERROR hbase.TestPutScan: Expected value: value 0000001 0000005, got: value 0496107 0000005
That's a split, so the ClientScanner reset the start row. I'm going to
fix your code and see if I can get anything else.
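To illustrate what resuming a scan after a split involves, here is a minimal, self-contained sketch. It is a toy model, not the HBase client code: a TreeMap stands in for the table, and the class and method names (ResumeScanDemo, scanWithReopen) are made up for this example. The "scanner" is reopened after every batch, the way a client has to reopen on a new region, and it must restart strictly after the last row it returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model (not HBase code): a scan that is reopened partway through. */
final class ResumeScanDemo {

    /** Reads all rows from startRow on, reopening the "scanner" after every batch. */
    static List<String> scanWithReopen(NavigableMap<String, String> table,
                                       String startRow, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<String> seen = new ArrayList<>();
        String resumeRow = startRow; // where the next (re)opened scanner begins
        boolean inclusive = true;    // only the very first open includes its start row
        while (true) {
            // "Open a scanner": a view of the table from resumeRow onward.
            NavigableMap<String, String> view = table.tailMap(resumeRow, inclusive);
            int read = 0;
            for (String row : view.keySet()) {
                if (read == batchSize) {
                    break;
                }
                seen.add(row);
                resumeRow = row; // remember the last row actually returned
                read++;
            }
            if (read < batchSize) {
                return seen;     // ran off the end of the table
            }
            inclusive = false;   // resume strictly after the last returned row
        }
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 10; i++) {
            table.put(String.format("test_row_%07d", i), "v");
        }
        System.out.println(scanWithReopen(table, "", 3).size() + " rows read");
    }
}
```

In this model, getting the resume position wrong in either direction shows up immediately: resuming inclusively repeats a row, and resuming from any row other than the last one returned skips or replays part of the table.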
J-D
On Wed, May 30, 2012 at 11:56 AM, Jean-Daniel Cryans wrote:
I'm running it here, but I just remembered about this issue:
"HTable.ClientScanner needs to clone the Scan object"
https://issues.apache.org/jira/browse/HBASE-4891
And since you are reusing that Scan object, you could definitely hit this issue.
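A rough sketch of why reusing one settings object across scans can go wrong, assuming the scanner overwrites the start row of the object it was given. This is a toy model rather than the HBase API: ScanSpec, scanAll and the region start key are hypothetical stand-ins, and a TreeMap plays the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model (not the HBase API) of a scanner that mutates the caller's settings. */
final class ScanReuseDemo {

    /** Hypothetical stand-in for a Scan: just a mutable start row. */
    static final class ScanSpec {
        String startRow;

        ScanSpec(String startRow) {
            this.startRow = startRow;
        }

        /** Copy constructor, so each scan can work on its own instance. */
        ScanSpec(ScanSpec other) {
            this.startRow = other.startRow;
        }
    }

    /**
     * Returns every row from spec.startRow onward. Each time the scan reaches
     * one of the region start keys it overwrites spec.startRow, which is the
     * side effect assumed in this model.
     */
    static List<String> scanAll(NavigableMap<String, String> table,
                                List<String> regionStartKeys, ScanSpec spec) {
        List<String> rows = new ArrayList<>(table.tailMap(spec.startRow, true).keySet());
        for (String row : rows) {
            if (regionStartKeys.contains(row)) {
                spec.startRow = row; // visible to whoever still holds this spec
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 10; i++) {
            table.put(String.format("test_row_%07d", i), "v");
        }
        List<String> regionStartKeys = Arrays.asList("test_row_0000006");

        // Reusing one spec: the second scan inherits the mutated start row.
        ScanSpec shared = new ScanSpec("");
        int first = scanAll(table, regionStartKeys, shared).size();
        int second = scanAll(table, regionStartKeys, shared).size();
        System.out.println("reused: " + first + " then " + second);

        // Copying a pristine template for every scan avoids the problem.
        ScanSpec template = new ScanSpec("");
        int a = scanAll(table, regionStartKeys, new ScanSpec(template)).size();
        int b = scanAll(table, regionStartKeys, new ScanSpec(template)).size();
        System.out.println("copied: " + a + " then " + b);
    }
}
```

With the reused spec the second scan begins at the second region's start key and returns only 5 of the 10 rows; with a fresh copy per scan both passes return all 10. The copy constructor here plays the same role as handing the scanner a copy of the Scan instead of the original.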
J-D
On Tue, May 29, 2012 at 11:37 PM, Ondřej Stašek wrote:
Here it is:
http://pastebin.com/0AgsQjur
On 29.5.2012 22:44, Jean-Daniel Cryans wrote:
Care to share that TestPutScan? Just attach it in a pastebin
Thx,
J-D
On Tue, May 29, 2012 at 6:13 AM, Ondřej Stašek wrote:
My program writes changes to an HBase table by issuing lots of Puts
(autoFlush turned off, flush at the end) and afterwards uses a
ResultScanner on the whole table to read all rows and act upon them.
My problem is that on several occasions the scan does not return the
expected rows. Either the scan does not start at the beginning of the
table, or somewhere during the scan I get old data (not the data
written by the preceding Puts).
I have even written a simple test application to simulate this behavior:
1. write 1M simple numbered rows to a table
2. scan through table to test output, delete every 10th row
3. scan again after delete
4. repeat until error found
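The steps above can be sketched roughly as follows. This is an in-memory stand-in, not the actual TestPutScan from the pastebin: a TreeMap replaces the HBase table, the class and method names are invented, and the row-key and value formats are guesses based on the log lines in this thread. Which rows count as "every 10th" (here: row numbers divisible by 10) is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** In-memory stand-in for the put / scan / delete test cycle (no HBase involved). */
final class PutScanCycleDemo {

    // Formats guessed from the logs, e.g. "test_row_0012550", "value 0000001 0000342".
    static String rowKey(int i) {
        return String.format("test_row_%07d", i);
    }

    static String value(int i, int run) {
        return String.format("value %07d %07d", i, run);
    }

    /** Runs one cycle; returns null on success or a message for the first mismatch. */
    static String runOnce(TreeMap<String, String> table, int rows, int run) {
        // Step 1: write numbered rows.
        for (int i = 1; i <= rows; i++) {
            table.put(rowKey(i), value(i, run));
        }

        // Step 2: scan in key order, check every value, remember every 10th row.
        List<String> toDelete = new ArrayList<>();
        int expected = 1;
        for (String got : table.values()) {
            String want = value(expected, run);
            if (!want.equals(got)) {
                return "Expected value: " + want + ", got: " + got;
            }
            if (expected % 10 == 0) {
                toDelete.add(rowKey(expected));
            }
            expected++;
        }
        if (expected != rows + 1) {
            return "Expected " + rows + " rows, scanned " + (expected - 1);
        }
        for (String key : toDelete) {
            table.remove(key);
        }

        // Step 3: scan again; deleted rows must be gone, the rest still in order.
        expected = 1;
        for (String got : table.values()) {
            if (expected % 10 == 0) {
                expected++; // this row number was deleted in step 2
            }
            String want = value(expected, run);
            if (!want.equals(got)) {
                return "Expected value: " + want + ", got: " + got;
            }
            expected++;
        }
        return null;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        // Step 4: repeat until an error is found (bounded here for the demo).
        for (int run = 1; run <= 3; run++) {
            String error = runOnce(table, 1000, run);
            System.out.println("Run " + run + ": " + (error == null ? "ok" : error));
        }
    }
}
```

Against a plain sorted map the cycle never fails, which is the point of the comparison: any mismatch reported by the real test has to come from the scan path rather than from the checking logic.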
Sample output:
12/05/29 00:32:12 INFO hbase.TestPutScan: Run 342 put 1000000 rows
12/05/29 00:32:35 INFO hbase.TestPutScan: Run 342 scan + del every 10th row
12/05/29 00:33:29 INFO hbase.TestPutScan: Run 342 scan
12/05/29 00:33:29 ERROR hbase.TestPutScan: Expected value: value 0000001 0000342, got: value 0281999 0000342
This means that the program expected to get the first row, but got the
281999th.
This test ran on a "minicluster" of 2 regionservers running Cloudera's
cdh3u4 distribution.
Today I got 3 errors like that, and from the RS's log it seems that at
the same time the HBase balancer issued a reassign command for this
table's region (the table has only 1 region).
Any pointers on what to check or what to send you to help resolve this
issue?
Regards
Ondrej Stasek
--
Ondřej Stašek
Senior Programmer
Seznam.cz, a.s.
Nádražní 159/21
370 01 České Budějovice 6
tel.: +420 386 325 467
gsm: +420 603 857 602
icq: 164660005
http://www.seznam.cz