[ 
https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773199#action_12773199
 ] 

Todd Lipcon commented on HADOOP-3205:
-------------------------------------

Looking at the source of BufferedInputStream (at 
http://www.docjar.com/html/api/java/io/BufferedInputStream.java.html), it 
actually seems like BufferedInputStream already passes a read straight through 
to the underlying stream when the caller's read buffer is at least as large as 
its own internal buffer. That was the crucial bit I was missing: it explains 
why performing the underlying reads in larger chunks would make a difference, 
even without removing the BIS.
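
To make the pass-through concrete, here is a minimal sketch that records the 
length of each bulk read that reaches the stream underneath a 
BufferedInputStream. The names PassThroughDemo, RecordingStream and 
underlyingReadSizes are made up for illustration and are not Hadoop code, and 
the 512-byte BIS buffer is only an assumption chosen to mirror the 512 bytes 
mentioned in the issue description.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Hadoop.
class PassThroughDemo {

    /** Records the length requested by each bulk read that reaches the wrapped stream. */
    static class RecordingStream extends FilterInputStream {
        final List<Integer> requested = new ArrayList<>();

        RecordingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            requested.add(len);
            return super.read(b, off, len);
        }
    }

    /**
     * Issues a single read of userBufferSize bytes through a BufferedInputStream
     * whose internal buffer is bisBufferSize bytes, and returns the lengths that
     * were requested from the underlying stream.
     */
    static List<Integer> underlyingReadSizes(int bisBufferSize, int userBufferSize,
                                             int dataSize) throws IOException {
        RecordingStream raw =
            new RecordingStream(new ByteArrayInputStream(new byte[dataSize]));
        try (BufferedInputStream bis = new BufferedInputStream(raw, bisBufferSize)) {
            byte[] user = new byte[userBufferSize];
            bis.read(user, 0, user.length);
        }
        return raw.requested;
    }

    public static void main(String[] args) throws IOException {
        // User buffer larger than the BIS buffer: the request is handed through unchanged.
        System.out.println(underlyingReadSizes(512, 4096, 8192)); // prints [4096]
        // User buffer smaller than the BIS buffer: the BIS fills its own buffer first.
        System.out.println(underlyingReadSizes(512, 100, 8192));  // prints [512]
    }
}
```

In the first call the underlying stream sees the caller's full 4096-byte 
request, so no extra copy through the BIS buffer takes place. In the second 
call the BIS reads 512 bytes into its own buffer and then copies 100 of them 
out to the caller.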

I'll give it a go and see if there is any discernible performance increase.

bq. Plus, looking at how long this jira has been open, it is no blocker

Of course :)

> FSInputChecker and FSOutputSummer should allow better access to user buffer
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-3205
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3205
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>
> Implementations of FSInputChecker and FSOutputSummer, such as DFS, do not have 
> access to the full user buffer. At any time DFS can access only up to 512 bytes, 
> even though the user usually reads with a much larger buffer (often controlled by 
> io.file.buffer.size). This forces an implementation to double-buffer data if 
> it wants to read or write larger chunks of data from or to the 
> underlying storage.
> We could separate changes for FSInputChecker and FSOutputSummer into two 
> separate jiras.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
