[ 
https://issues.apache.org/jira/browse/HBASE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370585#comment-14370585
 ] 

Jonathan Lawlor commented on HBASE-13262:
-----------------------------------------

[~elserj] Wow, nice digging, that looks like the cause!

It makes sense because when the cell block for the RPC response is being built 
within {{IPCUtil#buildCellBlock}}, the {{KeyValueEncoder}} that is used to 
write the cells from the cellScanner uses {{KeyValue#oswrite(final KeyValue kv, 
final OutputStream out, final boolean withTags)}} to write the cells to the 
output stream. Implementation pasted below:

{panel:title=KeyValue.java}
{noformat}
public static long oswrite(final KeyValue kv, final OutputStream out, final boolean withTags)
      throws IOException {
    // In KeyValueUtil#oswrite we do a Cell serialization as KeyValue. Any changes doing here, pls
    // check KeyValueUtil#oswrite also and do necessary changes.
    int length = kv.getLength();
    if (!withTags) {
      length = kv.getKeyLength() + kv.getValueLength() + KEYVALUE_INFRASTRUCTURE_SIZE;
    }
    // This does same as DataOuput#writeInt (big-endian, etc.)
    StreamUtils.writeInt(out, length);
    out.write(kv.getBuffer(), kv.getOffset(), length);
    return length + Bytes.SIZEOF_INT;
  }
{noformat}
{panel}

Notice that when {{withTags}} is false, we end up stripping those 2 bytes! The 
question now is why 0.98 does NOT have this problem. The implementation of 
{{ScannerV3#getCellBuffSize}} hasn't changed from 0.98 to branch-1.0. Does that 
mean that in 0.98 we end up using {{ScannerV2#getCellBuffSize}} instead of V3? 
(I'm not too familiar with this area myself, so maybe someone can confirm or deny.)
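
To make the length arithmetic concrete, here is a minimal standalone sketch of the same calculation. It does not use any HBase classes: {{OswriteLengthSketch}}, {{buildCell}}, and the byte-array stand-in for a KeyValue are hypothetical, and the layout (4-byte key length, 4-byte value length, key, value, then a 2-byte tags length followed by the tags) is my assumption about how a tag-aware cell is laid out in memory. Under that assumption, a cell with zero tags still carries the 2-byte tags length, and writing it with {{withTags}} set to false drops exactly those 2 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public class OswriteLengthSketch {
    // Assumed layout constants: two 4-byte ints (key length, value length)
    // and a 2-byte tags length. These mirror, but are not, the HBase constants.
    static final int KEYVALUE_INFRASTRUCTURE_SIZE = 2 * Integer.BYTES;
    static final int TAGS_LENGTH_SIZE = Short.BYTES;

    /** Hypothetical helper: lays out a cell as keyLen, valueLen, key, value, tagsLen, tags. */
    public static byte[] buildCell(byte[] key, byte[] value, byte[] tags) {
        ByteBuffer buf = ByteBuffer.allocate(KEYVALUE_INFRASTRUCTURE_SIZE
                + key.length + value.length + TAGS_LENGTH_SIZE + tags.length);
        buf.putInt(key.length).putInt(value.length).put(key).put(value);
        buf.putShort((short) tags.length).put(tags);
        return buf.array();
    }

    /** Same length logic as the pasted oswrite, applied to the stand-in byte array. */
    public static long oswrite(byte[] cell, ByteArrayOutputStream out, boolean withTags) {
        ByteBuffer header = ByteBuffer.wrap(cell);
        int keyLength = header.getInt();
        int valueLength = header.getInt();
        int length = cell.length;
        if (!withTags) {
            // Everything after the value (tags length and tags) is cut off here.
            length = keyLength + valueLength + KEYVALUE_INFRASTRUCTURE_SIZE;
        }
        // Big-endian 4-byte length prefix, then the (possibly truncated) cell bytes.
        byte[] prefix = ByteBuffer.allocate(Integer.BYTES).putInt(length).array();
        out.write(prefix, 0, prefix.length);
        out.write(cell, 0, length);
        return length + Integer.BYTES;
    }

    public static void main(String[] args) {
        // 10-byte key, 5-byte value, no tags: 8 + 10 + 5 + 2 = 25 bytes in memory.
        byte[] cell = buildCell(new byte[10], new byte[5], new byte[0]);
        long with = oswrite(cell, new ByteArrayOutputStream(), true);
        long without = oswrite(cell, new ByteArrayOutputStream(), false);
        // Prints: with tags: 29 bytes, without tags: 27 bytes
        System.out.println("with tags: " + with + " bytes, without tags: " + without + " bytes");
    }
}
```

If the server-side size estimate counts the full in-memory length while the response is written without tags, the two numbers would disagree by 2 bytes per cell in this model.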

[~lhofhansl] I think you're right. That solution would ensure that there is no 
possibility for the client to preemptively move regions. While there is 
potentially some ambiguity behind why a particular {{Result[]}} was returned 
from the server when it is not empty (was the size limit reached? caching 
limit? end of region?), there is no ambiguity currently associated with seeing 
an empty {{Result[]}} (we know that we must try to change regions). Having 
to eat an extra RPC certainly isn't ideal, but I don't know if there is a 
solution other than that one that would guarantee no chance of data loss.
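
As a rough illustration of the trade-off, here is a toy model of a client that only moves to the next region after seeing an empty batch. Everything in it is hypothetical ({{EmptyBatchSketch}}, {{drainRegion}}, and the idea that the server simply hands back up to {{caching}} rows per call); it is not the actual client scanner logic, just a sketch of why the rule costs one extra RPC per region.

```java
public class EmptyBatchSketch {
    /**
     * Simulates draining one region. The "server" returns up to `caching` rows
     * per RPC; the "client" changes regions only when a batch comes back empty.
     * Returns {rowsSeen, rpcsIssued}.
     */
    public static int[] drainRegion(int rowsInRegion, int caching) {
        if (caching <= 0 || rowsInRegion < 0) {
            throw new IllegalArgumentException("caching must be > 0 and rowsInRegion >= 0");
        }
        int nextRow = 0;
        int rowsSeen = 0;
        int rpcs = 0;
        while (true) {
            int batchSize = Math.min(caching, rowsInRegion - nextRow);
            rpcs++;
            if (batchSize == 0) {
                // Empty batch: the only unambiguous signal that the region is exhausted.
                break;
            }
            nextRow += batchSize;
            rowsSeen += batchSize;
        }
        return new int[] {rowsSeen, rpcs};
    }

    public static void main(String[] args) {
        int[] result = drainRegion(5, 2);
        // Prints: rows=5 rpcs=4 (batches of 2, 2, 1, then one empty batch)
        System.out.println("rows=" + result[0] + " rpcs=" + result[1]);
    }
}
```

In this model a short or size-limited batch never causes the client to skip ahead, so no rows are lost; the price is the final empty round trip.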

> ResultScanner doesn't return all rows in Scan
> ---------------------------------------------
>
>                 Key: HBASE-13262
>                 URL: https://issues.apache.org/jira/browse/HBASE-13262
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0, 1.1.0
>         Environment: Single node, pseudo-distributed 1.1.0-SNAPSHOT
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Blocker
>             Fix For: 2.0.0, 1.1.0, 0.98.13
>
>         Attachments: 13262-0.98-testpatch.txt, regionserver-logging.diff, 
> testrun_0.98.txt, testrun_branch1.0.txt
>
>
> Tried to write a simple Java client against 1.1.0-SNAPSHOT.
> * Write 1M rows, each row with 1 family, and 10 qualifiers (values [0-9]), 
> for a total of 10M cells written
> * Read back the data from the table, ensure I saw 10M cells
> Running it against {{04ac1891}} (and earlier) yesterday, I would get ~20% of 
> the actual rows. Running against 1.0.0 returns all 10M records as expected.
> [Code I was 
> running|https://github.com/joshelser/hbase-hwhat/blob/master/src/main/java/hbase/HBaseTest.java]
>  for the curious.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
