[
https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370740#comment-14370740
]
Lars Hofhansl edited comment on HBASE-13262 at 3/20/15 5:20 AM:
----------------------------------------------------------------
bq. So 0.98 has the issue too? (Jonathan Lawlor seems to indicate not? Maybe I
misread)
[~stack] So the general issue is that the counting on the client and the server
needs to match 100% in order for this to work.
There are multiple ways to make this fail:
* accidentally, with incorrect code, which led to this issue
* having client and server configured with different values for the max scanner
result size in 0.98 or 1.0 (which I demonstrate in the test in the patch)
* probably more
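To illustrate the second failure mode, here is a minimal, self-contained sketch. It is not the actual HBase client or server code: the class, the method names, and the fixed 100-byte row size are all hypothetical, and the "region exhausted" check is a simplified model of the client-side sizing described above.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanSizeMismatch {
    // Assumption: every row has the same hypothetical size, in bytes.
    static final long ROW_SIZE = 100;

    // Simplified server: fills a batch until the caching (row count) limit
    // or the server's own max result size is reached.
    static List<Integer> serverNext(int start, int totalRows, int caching, long serverMaxSize) {
        List<Integer> batch = new ArrayList<>();
        long size = 0;
        for (int row = start; row < totalRows && batch.size() < caching && size < serverMaxSize; row++) {
            batch.add(row);
            size += ROW_SIZE;
        }
        return batch;
    }

    // Simplified client: re-computes the batch size with its own limit and
    // treats a batch that hit neither limit as "region exhausted".
    // Returns the number of rows the client saw. Limits must be positive.
    static int clientScan(int totalRows, int caching, long serverMaxSize, long clientMaxSize) {
        int seen = 0;
        int next = 0;
        while (true) {
            List<Integer> batch = serverNext(next, totalRows, caching, serverMaxSize);
            seen += batch.size();
            next += batch.size();
            long clientSize = batch.size() * ROW_SIZE;
            boolean hitCaching = batch.size() >= caching;
            boolean hitSize = clientSize >= clientMaxSize;
            if (!hitCaching && !hitSize) {
                return seen; // client believes the region is done
            }
        }
    }

    public static void main(String[] args) {
        // Matching limits: every row is returned.
        System.out.println(clientScan(1000, 100, 5000, 5000));
        // Client limit larger than server limit: the scan stops early.
        System.out.println(clientScan(1000, 100, 5000, 10000));
    }
}
```

In this model, as soon as the client's limit is larger than the server's, the first size-limited batch looks like the end of the region and the remaining rows are silently skipped.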
The sizing on the client is fickle and bad. It _has_ to go.
The patch I propose just does away with the sizing on the client for 0.98 and
1.0. That will cause an extra RPC if the scanner caching is set such that it
would fire after the size limit; in that case we need the extra RPC to detect
that we're done with a region.
In 1.1 and later we can do what has been proposed here in various forms and add
some extra flag to the RPC to indicate whether we filled the batch rather than
trying to derive this information from the size of the results array.
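A rough sketch of the flag-based approach, under stated assumptions: the {{Response}} type, the {{moreInRegion}} field, and the fixed row size are hypothetical names for illustration, not the actual RPC definition. The point is only that the client trusts the server's answer instead of redoing the size arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanWithFlag {
    // Assumption: every row has the same hypothetical size, in bytes.
    static final long ROW_SIZE = 100;

    // Hypothetical response: the rows plus an explicit "more rows remain" flag.
    static final class Response {
        final List<Integer> rows;
        final boolean moreInRegion;

        Response(List<Integer> rows, boolean moreInRegion) {
            this.rows = rows;
            this.moreInRegion = moreInRegion;
        }
    }

    // Simplified server: fills a batch up to its limits and reports whether
    // rows remain in the region. Limits must be positive.
    static Response serverNext(int start, int totalRows, int caching, long serverMaxSize) {
        List<Integer> rows = new ArrayList<>();
        long size = 0;
        int row = start;
        while (row < totalRows && rows.size() < caching && size < serverMaxSize) {
            rows.add(row);
            size += ROW_SIZE;
            row++;
        }
        return new Response(rows, row < totalRows);
    }

    // Simplified client: no size accounting at all, it only follows the flag.
    static int clientScan(int totalRows, int caching, long serverMaxSize) {
        int seen = 0;
        int next = 0;
        Response response;
        do {
            response = serverNext(next, totalRows, caching, serverMaxSize);
            seen += response.rows.size();
            next += response.rows.size();
        } while (response.moreInRegion);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(clientScan(1000, 100, 5000));
    }
}
```

Because the client holds no size limit of its own in this sketch, there is nothing on the client side that can drift out of sync with the server's configuration.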
Either way, the matching size calculation on the client is bad and should be
removed.
Am I making sense? Maybe we should have two different jira...?
Edit: Fixed lots of spelling errors...
was (Author: lhofhansl):
bq. So 0.98 has the issue too? (Jonathan Lawlor seems to indicate not? Maybe I
misread)
[~stack] So the general issue is that the counting on client and server need to
match 100% in order for this to work .
There are multiple ways to make this fail:
* accidentally with incorrect, which lead to this issue
* having client and server configured with different value for for the max
scanner result size in 0.98 or 1.0 (that I demonstrate in the test in patch)
* probably more
The sizing on the client is fickle and bad. It _has_ to go.
The patch I propose just does away with the sizing on the client for 0.98 and
1.0. That will cause an extra RPC if the scanner caching is set such that it
would fire after the size limit, in that case we need the extra RPC to detect
that we're doing with a region.
In 1.1 and later we do what has been proposed here in various forms and add
some extra flag to the RPC to indicate whether we filled the batch rather than
trying to derive this information for the size of the results array.
In either case the matching size calculation on the client is bad.
Am I making sense? Maybe we should have two different jira...?
> ResultScanner doesn't return all rows in Scan
> ---------------------------------------------
>
> Key: HBASE-13262
> URL: https://issues.apache.org/jira/browse/HBASE-13262
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.0.0, 1.1.0
> Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Blocker
> Fix For: 2.0.0, 1.1.0, 0.98.13
>
> Attachments: 13262-0.98-testpatch.txt,
> 13262-tag-length-for-withTags-parameter.txt, regionserver-logging.diff,
> testrun_0.98.txt, testrun_branch1.0.txt
>
>
> Tried to write a simple Java client against 1.1.0-SNAPSHOT.
> * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]),
> for a total of 10M cells written
> * Read back the data from the table, ensure I saw 10M cells
> Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of
> the actual rows. Running against 1.0.0 returns all 10M records as expected.
> [Code I was
> running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java]
> for the curious.