[ 
https://issues.apache.org/jira/browse/HBASE-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HBASE-14307:
----------------------------------
    Attachment: HBASE-14307.002.master.patch

[~ram_krish] and [[email protected]], thank you for your code reviews.

bq. Should we perf test this change?

This is my first HBase contribution.  Can you please advise if there is a 
standard way that the HBase community handles perf testing?

My expectation is that there will be no impact on existing usage patterns.  The 
read will continue to be satisfied fully by a single read API call, and the 
loop will exit after a single iteration.  This change only helps for 
alternative file system implementations that might return a partial read, like 
what Shradha is testing.  That will result in more iterations of the loop/more 
read calls, but this is an implementation detail of the file system.

bq. What if extraLen is 0 ? The method would return false. Is that intentional ?

Yes, this is intentional.  The method should return false if reading extra was 
not requested.  This is equivalent to this piece of logic that was in place 
before the patch:

{code}
        if (ret == size || ret < size + extraSize) {
          // Could not read the next block's header, or did not try.
           return -1;
        }
{code}

I think the return value comment in my JavaDoc was misleading though, so here 
is patch v2 with an improvement.

bq. Parameter validation on extraLen should be performed at the beginning of 
the method.

{{extraLen}} is passed by internal methods, and it's always either the header 
size or 0.  I see possible header sizes are sourced from either 
{{HConstants#HFILEBLOCK_HEADER_SIZE}} or 
{{HConstants#HFILEBLOCK_HEADER_SIZE_NO_CHECKSUM}}, and both of these constants 
are greater than zero.  Given that the input value always comes from internal 
constants instead of untrusted external input, I would not expect additional 
parameter validation is necessary.  Let me know if you disagree.  If parameter 
validation is required, then I'd suggest validation should be done in the 
existing {{readWithExtra}} method too for consistency.

bq. positionalReadWithExtra() is a static method. Should it be in IOUtils ?

I placed the method here based on the existing precedent of the 
{{readWithExtra}} method.  Is there an HBase-specific {{IOUtils}} class?  I 
couldn't find one.  It would not be appropriate in the Hadoop Common 
{{IOUtils}} class, because this is a very application-specific usage pattern.

Let me know if you think there is a better spot for this method.  I'd suggest 
both {{readWithExtra}} and {{positionalReadWithExtra}} should be in the same 
class for consistency, so if one moves, then we probably should move both.

> Incorrect use of positional read api in HFileBlock
> --------------------------------------------------
>
>                 Key: HBASE-14307
>                 URL: https://issues.apache.org/jira/browse/HBASE-14307
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Shradha Revankar
>            Assignee: Chris Nauroth
>         Attachments: HBASE-14307.001.master.patch, 
> HBASE-14307.002.master.patch
>
>
> Considering that {{read()}} is not guaranteed to read all bytes, 
> I'm interested to understand this particular piece of code and why is partial 
> read treated as an error :
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1446-L1450
> Particularly, if hbase were to use a different filesystem, say 
> WebhdfsFileSystem, this would not work, please also see 
> https://issues.apache.org/jira/browse/HDFS-8943 for discussion around this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to