[
https://issues.apache.org/jira/browse/HBASE-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723766#comment-14723766
]
Shradha Revankar commented on HBASE-14307:
------------------------------------------
Thanks [~cnauroth]. I applied the patch and ran some tests. While this might
not be relevant to hbase, from a webhdfs perspective this approach has a
performance impact, mainly due to the way read() is implemented in the webhdfs
client:
{code}
@Override
public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  try (InputStream in = openInputStream(position).in) {
    return in.read(buffer, offset, length);
  }
}
{code}
Every read at a position opens a new connection and sends a new OPEN request to
the server with the specified offset. So when you loop within
{{positionalReadWithExtra}}, it sends as many requests to the server as there
are loop iterations. The perf impact is very visible with chunked encoding,
when each read returns only around 24 bytes.
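To illustrate the cost, here is a minimal, self-contained sketch. It is not the webhdfs or hbase code: {{ReopeningReader}}, its {{chunk}} limit, and the {{readFully}} helper are hypothetical names made up for this example, and an in-memory byte array stands in for the server. It only shows how a caller-side loop multiplies the number of opens when each positional read returns few bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a client that opens a new stream on every positional read. */
class ReopeningReader {
    private final byte[] data;  // pretend remote file contents
    private final int chunk;    // assumed cap on bytes returned per read (e.g. small chunks)
    int opens = 0;              // counts simulated OPEN requests

    ReopeningReader(byte[] data, int chunk) {
        this.data = data;
        this.chunk = chunk;
    }

    /** Same shape as the quoted read(): open at the position, do one read, close. */
    int read(long position, byte[] buffer, int offset, int length) {
        opens++; // every call costs one simulated connection + OPEN request
        if (position >= data.length) {
            return -1; // nothing left to read at this position
        }
        int start = (int) position;
        try (InputStream in = new ByteArrayInputStream(data, start, data.length - start)) {
            // A single read may return fewer bytes than requested.
            return in.read(buffer, offset, Math.min(length, chunk));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Caller-side loop that keeps issuing positional reads until length bytes arrive. */
    static int readFully(ReopeningReader reader, long position, byte[] buffer, int length) {
        int total = 0;
        while (total < length) {
            int n = reader.read(position + total, buffer, total, length - total);
            if (n < 0) {
                break; // end of data reached before length bytes were read
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        ReopeningReader reader = new ReopeningReader(new byte[240], 24);
        byte[] buffer = new byte[240];
        int read = readFully(reader, 0, buffer, buffer.length);
        System.out.println("bytes read = " + read + ", opens = " + reader.opens);
    }
}
```

With 240 bytes and at most 24 bytes per read, the loop issues 10 simulated opens for a single logical read, which is the pattern described above.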
From an hbase perspective this fix is still useful and is the right approach,
but I think we have to revisit HDFS-8943.
> Incorrect use of positional read api in HFileBlock
> --------------------------------------------------
>
> Key: HBASE-14307
> URL: https://issues.apache.org/jira/browse/HBASE-14307
> Project: HBase
> Issue Type: Bug
> Reporter: Shradha Revankar
> Assignee: Chris Nauroth
> Attachments: HBASE-14307.001.master.patch,
> HBASE-14307.002.master.patch
>
>
> Considering that {{read()}} is not guaranteed to read all bytes,
> I'm interested in understanding this particular piece of code and why a
> partial read is treated as an error:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1446-L1450
> Particularly, if hbase were to use a different filesystem, say
> WebhdfsFileSystem, this would not work. Please also see
> https://issues.apache.org/jira/browse/HDFS-8943 for discussion around this.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)