[
https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon updated HADOOP-3205:
--------------------------------
Attachment: hadoop-3205.txt
New version of the patch. This addresses Eli's review comments and adds some
extra tests (one for a truncated checksum file throwing ChecksumException,
another for odd-sized read buffers in a file with a few chunks). I also tidied
up some of the comments to make it clearer to implementors what's going on.
Just to be doubly sure, I reran all the benchmarks overnight and confirmed that
reading 32 chunks at once gives all the performance benefit of a larger value
(and uses less memory). I also reran the HDFS-755 tests against this build with
assertions on and everything looked good (plenty of assertion failures, but
none in the new code!).
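
For readers who have not followed the issue, here is a rough sketch of the idea
of reading several checksum chunks at once directly into the caller's buffer.
It is not the code in the patch. The class and method names are hypothetical,
plain IOException stands in for ChecksumException, and the checksums are
assumed to be one CRC32 per 512-byte chunk held in an array rather than read
from a checksum file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

// Hypothetical illustration only; this is not FSInputChecker's real API.
class MultiChunkReadSketch {
    static final int BYTES_PER_CHUNK = 512; // chunk size from the issue description
    static final int CHUNKS_PER_READ = 32;  // value the benchmarks settled on

    /** Computes one CRC32 per chunk of data; the final chunk may be short. */
    static long[] checksums(byte[] data) {
        int n = (data.length + BYTES_PER_CHUNK - 1) / BYTES_PER_CHUNK;
        long[] sums = new long[n];
        CRC32 crc = new CRC32();
        for (int i = 0; i < n; i++) {
            int off = i * BYTES_PER_CHUNK;
            crc.reset();
            crc.update(data, off, Math.min(BYTES_PER_CHUNK, data.length - off));
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /**
     * Reads up to CHUNKS_PER_READ whole chunks straight into userBuf, with no
     * intermediate buffer, and verifies each chunk in place. firstChunk is the
     * index of the chunk at the stream's current position.
     * Returns the number of bytes read, or -1 at end of stream.
     */
    static int readChunks(InputStream in, long[] sums, int firstChunk,
                          byte[] userBuf, int off, int len) throws IOException {
        // Only request whole chunks so every byte handed back has been verified.
        int want = Math.min(len, BYTES_PER_CHUNK * CHUNKS_PER_READ);
        want = (want / BYTES_PER_CHUNK) * BYTES_PER_CHUNK;
        if (want == 0) {
            // Assumption: a real implementation needs a fallback buffer here.
            throw new IllegalArgumentException("buffer smaller than one chunk");
        }
        int n = 0;
        while (n < want) {
            int r = in.read(userBuf, off + n, want - n);
            if (r < 0) break; // end of stream: the last chunk may be short
            n += r;
        }
        if (n == 0) return -1;

        CRC32 crc = new CRC32();
        int chunk = firstChunk;
        for (int done = 0; done < n; done += BYTES_PER_CHUNK, chunk++) {
            crc.reset();
            crc.update(userBuf, off + done, Math.min(BYTES_PER_CHUNK, n - done));
            // A missing checksum (truncated checksum data) is also an error.
            if (chunk >= sums.length || crc.getValue() != sums[chunk]) {
                throw new IOException("Checksum error in chunk " + chunk);
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[BYTES_PER_CHUNK * 40 + 100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31);
        long[] sums = checksums(data);
        InputStream in = new ByteArrayInputStream(data);
        byte[] buf = new byte[64 * 1024];
        int chunk = 0;
        int n;
        while ((n = readChunks(in, sums, chunk, buf, 0, buf.length)) > 0) {
            System.out.println("read " + n + " verified bytes");
            chunk += (n + BYTES_PER_CHUNK - 1) / BYTES_PER_CHUNK;
        }
    }
}
```

The point of the sketch is that the caller's buffer is both the read target and
the input to checksum verification, so no second copy is needed. Capping each
read at 32 chunks bounds how much data has to be in flight before it is
verified.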
> Read multiple chunks directly from FSInputChecker subclass into user buffers
> ----------------------------------------------------------------------------
>
> Key: HADOOP-3205
> URL: https://issues.apache.org/jira/browse/HADOOP-3205
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Reporter: Raghu Angadi
> Assignee: Todd Lipcon
> Attachments: hadoop-3205.txt, hadoop-3205.txt, hadoop-3205.txt,
> hadoop-3205.txt, hadoop-3205.txt
>
>
> Implementations of FSInputChecker and FSOutputSummer, such as DFS, do not
> have access to the full user buffer. At any time DFS can access only up to
> 512 bytes, even though the user usually reads with a much larger buffer
> (often controlled by io.file.buffer.size). An implementation that wants to
> read or write larger chunks of data from underlying storage therefore has to
> double-buffer the data.
> We could split the changes for FSInputChecker and FSOutputSummer into two
> separate jiras.