[
https://issues.apache.org/jira/browse/HBASE-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HBASE-14307:
----------------------------------
Attachment: HBASE-14307.002.master.patch
[~ram_krish] and [[email protected]], thank you for your code reviews.
bq. Should we perf test this change?
This is my first HBase contribution. Can you please advise if there is a
standard way that the HBase community handles perf testing?
My expectation is that there will be no impact on existing usage patterns. The
read will continue to be satisfied fully by a single read API call, and the
loop will exit after a single iteration. This change only helps for
alternative file system implementations that might return a partial read, like
what Shradha is testing. That will result in more iterations of the loop/more
read calls, but this is an implementation detail of the file system.
bq. What if extraLen is 0 ? The method would return false. Is that intentional ?
Yes, this is intentional. The method should return false if reading extra was
not requested. This is equivalent to this piece of logic that was in place
before the patch:
{code}
if (ret == size || ret < size + extraSize) {
// Could not read the next block's header, or did not try.
return -1;
}
{code}
I think the return value comment in my JavaDoc was misleading though, so here
is patch v2 with an improvement.
bq. Parameter validation on extraLen should be performed at the beginning of
the method.
{{extraLen}} is passed by internal methods, and it's always either the header
size or 0. I see possible header sizes are sourced from either
{{HConstants#HFILEBLOCK_HEADER_SIZE}} or
{{HConstants#HFILEBLOCK_HEADER_SIZE_NO_CHECKSUM}}, and both of these constants
are greater than zero. Given that the input value always comes from internal
constants instead of untrusted external input, I would not expect additional
parameter validation is necessary. Let me know if you disagree. If parameter
validation is required, then I'd suggest validation should be done in the
existing {{readWithExtra}} method too for consistency.
bq. positionalReadWithExtra() is a static method. Should it be in IOUtils ?
I placed the method here based on the existing precedent of the
{{readWithExtra}} method. Is there an HBase-specific {{IOUtils}} class? I
couldn't find one. It would not be appropriate in the Hadoop Common
{{IOUtils}} class, because this is a very application-specific usage pattern.
Let me know if you think there is a better spot for this method. I'd suggest
both {{readWithExtra}} and {{positionalReadWithExtra}} should be in the same
class for consistency, so if one moves, then we probably should move both.
> Incorrect use of positional read api in HFileBlock
> --------------------------------------------------
>
> Key: HBASE-14307
> URL: https://issues.apache.org/jira/browse/HBASE-14307
> Project: HBase
> Issue Type: Bug
> Reporter: Shradha Revankar
> Assignee: Chris Nauroth
> Attachments: HBASE-14307.001.master.patch,
> HBASE-14307.002.master.patch
>
>
> Considering that {{read()}} is not guaranteed to read all bytes,
> I'm interested to understand this particular piece of code and why is partial
> read treated as an error :
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1446-L1450
> Particularly, if hbase were to use a different filesystem, say
> WebhdfsFileSystem, this would not work, please also see
> https://issues.apache.org/jira/browse/HDFS-8943 for discussion around this.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)