[ https://issues.apache.org/jira/browse/HADOOP-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773109#action_12773109 ]
Todd Lipcon commented on HADOOP-3205:
-------------------------------------
Hi Hong,
Thanks for the input. How do others feel about using a separate CRC32 path for
the "bulk checksum checking" in the read path, probably through JNI when
available? I had suggested this in HADOOP-6148 and people said it would be
unmaintainable. Given that checksum algorithms rarely change and are easy to
verify, I disagree, but would like to have some +1s for this direction before I
spend the time writing the code.
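
For illustration only, here is a minimal pure-Java sketch of what "bulk checksum checking" could look like: one call verifies every chunk in a buffer instead of checksumming chunk by chunk through the stream layers. The class and method names (BulkCrc32Sketch, computeChunkedSums, verifyChunkedSums) are hypothetical and are not existing Hadoop APIs. It uses java.util.zip.CRC32 and assumes one 32-bit checksum per bytesPerChecksum bytes (512 in the issue description). A JNI implementation could sit behind the same signature.

```java
import java.util.zip.CRC32;

/** Hypothetical sketch of bulk CRC32 verification; not an existing Hadoop class. */
public class BulkCrc32Sketch {

    /** Computes one CRC32 value per chunk of bytesPerChecksum bytes in data[off, off+len). */
    public static int[] computeChunkedSums(byte[] data, int off, int len,
                                           int bytesPerChecksum) {
        int numChunks = (len + bytesPerChecksum - 1) / bytesPerChecksum;
        int[] sums = new int[numChunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < numChunks; i++) {
            int chunkOff = off + i * bytesPerChecksum;
            // The last chunk may be shorter than bytesPerChecksum.
            int chunkLen = Math.min(bytesPerChecksum, off + len - chunkOff);
            crc.reset();
            crc.update(data, chunkOff, chunkLen);
            sums[i] = (int) crc.getValue();
        }
        return sums;
    }

    /**
     * Verifies all chunks in one pass.
     * Returns the index of the first mismatching chunk, or -1 if all match.
     */
    public static int verifyChunkedSums(byte[] data, int off, int len,
                                        int bytesPerChecksum, int[] expected) {
        int[] actual = computeChunkedSums(data, off, len, bytesPerChecksum);
        if (actual.length != expected.length) {
            throw new IllegalArgumentException("checksum count mismatch");
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        return -1;
    }
}
```

Because the inner loop touches only a byte[] and an int[], it is the kind of narrow, easily verified code path that could be swapped for a native implementation without affecting callers.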
Regarding the other points in your blog post: they seem to imply that we'd have
to change many of the APIs to work with ByteBuffers rather than byte[],
potentially all the way down to the user-facing layer. This would be a big API
change. Where is a good place to start, and what kind of backwards
compatibility layer will we need?
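
One possible shape for such a compatibility layer, sketched under assumptions: the ByteBuffer-based call becomes the primary interface, and the legacy byte[] signature becomes a thin adapter built on ByteBuffer.wrap. The names here (ByteBufferCompatSketch, BufferReadable, fromArray) are hypothetical illustrations and not existing Hadoop APIs. The in-memory source exists only to make the example runnable.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a byte[] compatibility layer over a ByteBuffer API. */
public class ByteBufferCompatSketch {

    /** Hypothetical ByteBuffer-based read interface (an assumption, not a Hadoop API). */
    public interface BufferReadable {
        /** Reads up to dst.remaining() bytes; returns bytes read, or -1 at end of data. */
        int read(ByteBuffer dst);
    }

    /** Legacy-style byte[] read implemented on top of the ByteBuffer call. */
    public static int read(BufferReadable in, byte[] b, int off, int len) {
        // wrap() creates a view over b with position=off and limit=off+len,
        // so no data is copied to adapt the old signature.
        ByteBuffer view = ByteBuffer.wrap(b, off, len);
        return in.read(view);
    }

    /** Toy in-memory source used only to exercise the adapter. */
    public static BufferReadable fromArray(byte[] src) {
        final ByteBuffer remaining = ByteBuffer.wrap(src);
        return new BufferReadable() {
            @Override
            public int read(ByteBuffer dst) {
                if (!remaining.hasRemaining()) {
                    return -1;
                }
                int n = Math.min(dst.remaining(), remaining.remaining());
                // Copy exactly n bytes by limiting a duplicate view of the source.
                ByteBuffer slice = remaining.duplicate();
                slice.limit(slice.position() + n);
                dst.put(slice);
                remaining.position(remaining.position() + n);
                return n;
            }
        };
    }
}
```

With this arrangement existing callers that pass byte[] keep working unchanged, while new callers can hand in a ByteBuffer directly and avoid the intermediate copy.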
> FSInputChecker and FSOutputSummer should allow better access to user buffer
> ---------------------------------------------------------------------------
>
> Key: HADOOP-3205
> URL: https://issues.apache.org/jira/browse/HADOOP-3205
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
>
> Implementations of FSInputChecker and FSOutputSummer, like DFS, do not have
> access to the full user buffer. At any time DFS can access only up to 512 bytes,
> even though the user usually reads with a much larger buffer (often controlled by
> io.file.buffer.size). This requires an implementation to double buffer data if
> it wants to read or write larger chunks of data from underlying storage.
> We could split the changes for FSInputChecker and FSOutputSummer into two
> separate jiras.