[
https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785171#action_12785171
]
Todd Lipcon commented on HADOOP-3205:
-------------------------------------
I did some further investigation on this:
- I ran the 1GB cat test with hprof=cpu=times to get accurate invocation
counts for the various read calls. Increasing MAX_CHUNKS from 1 to 16 does
exactly what's expected and reduces the number of calls to readChunks (and thus
the underlying input stream reads, etc.) by exactly a factor of 16, and the same
holds for 128. Beyond the call counts, though, hprof shows no noticeable
differences, because System.arraycopy doesn't get accounted by hprof in this
mode, for whatever reason.
- I imported a copy of the BufferedInputStream source and made
BufferedFSInputStream extend from it rather than from the java.io one, and added
a System.err printout right before the System.arraycopy inside read1(). When I
changed MAX_CHUNKS from 127 to 128, I verified that it correctly avoided these
copies and read directly into the user buffer. So the JIRA's goal of eliminating
a copy was indeed accomplished (a sketch of the logic involved follows below).
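For reference, here is a minimal, self-contained sketch of the buffering
decision in play. It is not the JDK's read1() source, just an illustration of
why a 128-chunk request (128 x 512 = 65536 bytes) can bypass the internal
buffer while a 127-chunk request (65024 bytes) cannot, assuming
io.file.buffer.size, and with it the internal buffer, is 64KB in this test:

{code}
import java.io.IOException;
import java.io.InputStream;

/**
 * Minimal sketch of the buffered-read decision; this is NOT the JDK's
 * read1() source, just an illustration of the two paths a request can take.
 */
class CopyAvoidanceSketch {
  private final InputStream in;   // the underlying stream
  private final byte[] buf;       // internal buffer, io.file.buffer.size bytes
  private int pos = 0;            // next unread byte in buf
  private int count = 0;          // number of valid bytes in buf

  CopyAvoidanceSketch(InputStream in, int bufferSize) {
    this.in = in;
    this.buf = new byte[bufferSize];
  }

  int read(byte[] b, int off, int len) throws IOException {
    int avail = count - pos;
    if (avail <= 0) {
      // Request at least as big as the internal buffer: hand it straight to
      // the underlying stream, no staging and no arraycopy. (The real read1()
      // also requires that mark() is not set.) This is the path a 128-chunk
      // (65536-byte) request takes when buf is 64KB.
      if (len >= buf.length) {
        return in.read(b, off, len);
      }
      // Otherwise refill the internal buffer first.
      pos = 0;
      count = 0;
      int n = in.read(buf, 0, buf.length);
      if (n <= 0) {
        return -1;
      }
      count = n;
      avail = n;
    }
    // The extra copy that a 127-chunk (65024-byte) request still pays for.
    int cnt = Math.min(avail, len);
    System.arraycopy(buf, pos, b, off, cnt);
    pos += cnt;
    return cnt;
  }
}
{code}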
Now the confusing part: eliminating this copy does nothing in terms of
performance. Comparing MAX_CHUNKS=127 to MAX_CHUNKS=128 shows no statistically
significant difference in the speed of catting 1G from RAM.
So my best theory right now on why the patch is faster is that it's simply
making fewer function calls, each of which does more work in longer loops. This
is better for loop unrolling, instruction cache locality, and avoiding function
call overhead. Perhaps it inspires the JIT to work harder as well - who knows
what black magic lurks there :)
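As a rough, hypothetical way to sanity-check the fewer-calls theory (this is
not part of the patch, and the class and sizes below are made up for
illustration), one could time many small reads against a few large ones over
the same in-memory data. ByteArrayInputStream copies the same total number of
bytes either way, so any gap is essentially per-call and per-iteration
overhead:

{code}
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical micro-benchmark, not part of the patch: read the same in-memory
 * data twice, once in 512-byte requests and once in 64KB requests. The total
 * bytes copied are identical, so the difference is mostly call overhead.
 */
public class ReadSizeBench {
  private static final int TOTAL = 256 * 1024 * 1024; // 256MB stand-in for 1GB

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[TOTAL];
    System.out.println("512-byte reads: " + time(data, 512) + " ms");
    System.out.println("64KB reads:     " + time(data, 64 * 1024) + " ms");
  }

  private static long time(byte[] data, int readSize) throws IOException {
    InputStream in = new ByteArrayInputStream(data);
    byte[] buf = new byte[readSize];
    long start = System.nanoTime();
    while (in.read(buf, 0, readSize) != -1) {
      // discard the data; we only care about per-call overhead
    }
    return (System.nanoTime() - start) / 1_000_000;
  }
}
{code}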
I think at this point I've sufficiently investigated this, unless anyone has
questions. I'll make the changes that Eli suggested and upload a new patch.
> Read multiple chunks directly from FSInputChecker subclass into user buffers
> ----------------------------------------------------------------------------
>
> Key: HADOOP-3205
> URL: https://issues.apache.org/jira/browse/HADOOP-3205
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Reporter: Raghu Angadi
> Assignee: Todd Lipcon
> Attachments: hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt,
> hadoop-3205.txt
>
>
> Implementations of FSInputChecker and FSOutputSummer like DFS do not have
> access to the full user buffer. At any time DFS can access only up to 512 bytes
> even though the user usually reads with a much larger buffer (often controlled
> by io.file.buffer.size). This forces implementations to double buffer data if
> they want to read or write larger chunks of data from the underlying storage.
> We could separate the changes for FSInputChecker and FSOutputSummer into two
> separate jiras.
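To make the double buffering described above concrete: with bytesPerChecksum at
512, readChunk() can hand back at most one checksum chunk per call, so an
implementation has to stage a larger read in its own buffer and copy it out a
chunk at a time, roughly like this sketch (names and buffer sizes are
illustrative, not the actual DFS code):

{code}
import java.io.IOException;
import java.io.InputStream;

/**
 * Illustrative sketch (names are made up, this is not the DFS code) of the
 * double buffering the issue describes: when the checker accepts only one
 * 512-byte chunk per readChunk() call, the implementation reads a large block
 * from storage into its own buffer and copies it out a chunk at a time.
 */
class DoubleBufferingSketch {
  private static final int BYTES_PER_CHECKSUM = 512;

  private final InputStream storage;                   // stand-in for the real data source
  private final byte[] staging = new byte[64 * 1024];  // io.file.buffer.size-ish
  private int stagingPos = 0;
  private int stagingLen = 0;

  DoubleBufferingSketch(InputStream storage) {
    this.storage = storage;
  }

  /** Returns at most one checksum chunk, refilling the staging buffer as needed. */
  int readChunk(byte[] chunk, int off) throws IOException {
    if (stagingPos >= stagingLen) {
      stagingLen = storage.read(staging, 0, staging.length); // big read from storage
      stagingPos = 0;
      if (stagingLen <= 0) {
        return -1;
      }
    }
    int n = Math.min(BYTES_PER_CHECKSUM, stagingLen - stagingPos);
    System.arraycopy(staging, stagingPos, chunk, off, n);     // the extra copy
    stagingPos += n;
    return n;
  }
}
{code}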