[
https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437733#comment-13437733
]
Tsz Wo (Nicholas), SZE commented on HADOOP-8239:
------------------------------------------------
> One approach I can think of is to leave the current readFields()/write()
> methods unchanged. I think only WebHdfs is using it and if that is true, we
> can make WebHdfs actually send and receive everything in JSON format and keep
> the current "bytes" Json field as is.
FileChecksum is designed to support different kinds of checksum algorithms, so it
declares the following abstract methods:
{code}
public abstract String getAlgorithmName();
public abstract int getLength();
public abstract byte[] getBytes();
{code}
[WebHDFS FileChecksum
schema|http://hadoop.apache.org/common/docs/r1.0.0/webhdfs.html#FileChecksum]
has fields corresponding to these methods.
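For reference, a GETFILECHECKSUM response under that schema would look roughly like
the following (the field values here are illustrative placeholders, not real output):
{code}
{
  "FileChecksum": {
    "algorithm": "MD5-of-1MD5-of-512CRC32",
    "bytes": "<hex-encoded checksum bytes>",
    "length": 28
  }
}
{code}
Note the three fields map one-to-one onto getAlgorithmName(), getBytes() and getLength(),
so the schema stays algorithm-neutral.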
With FileChecksum, clients like WebHDFS can obtain the corresponding checksum by
first reading the checksum algorithm name and then decoding the bytes. If we add
MD5MD5CRC32FileChecksum-specific fields to the JSON format, it becomes harder to
support other algorithms and harder to specify the WebHDFS API, since we would
have to spell out a separate case for each algorithm in the API.
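To illustrate the concern, an algorithm-specific format would need fields like these
(field names here are hypothetical), and every new algorithm would force another
such case into the WebHDFS spec:
{code}
{
  "FileChecksum": {
    "bytesPerCRC": 512,
    "crcPerBlock": 1,
    "md5": "<hex-encoded MD5-of-MD5s digest>"
  }
}
{code}
None of these fields would make sense for a checksum algorithm that is not
MD5-of-MD5-of-CRC based.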
For our task here, we are effectively adding new algorithms, since the algorithm
name has to change for each CRC type. So we may as well add new classes to handle
them instead of changing MD5MD5CRC32FileChecksum. BTW, the name
"MD5MD5CRC32FileChecksum" is not suitable for the other CRC type because it
hard-codes "CRC32". Thoughts?
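A minimal sketch of the idea (this is illustrative only, not the committed patch;
the class and field names below are assumptions): let each CRC variant get its own
algorithm-name suffix, so clients can distinguish CRC32 (gzip polynomial) from
CRC32C without any algorithm-specific JSON fields.

```java
// Minimal stand-in for org.apache.hadoop.fs.FileChecksum's abstract surface.
abstract class FileChecksum {
    public abstract String getAlgorithmName();
    public abstract int getLength();
    public abstract byte[] getBytes();
}

// Hypothetical per-CRC-type checksum class. "crcTypeName" would be
// e.g. "CRC32" for the gzip polynomial or "CRC32C" for Castagnoli.
class Md5Md5CrcFileChecksum extends FileChecksum {
    private final int bytesPerCRC;
    private final long crcPerBlock;
    private final byte[] md5;          // 16-byte MD5-of-block-MD5s digest
    private final String crcTypeName;  // assumption: encodes the CRC variant

    Md5Md5CrcFileChecksum(int bytesPerCRC, long crcPerBlock,
                          byte[] md5, String crcTypeName) {
        this.bytesPerCRC = bytesPerCRC;
        this.crcPerBlock = crcPerBlock;
        this.md5 = md5;
        this.crcTypeName = crcTypeName;
    }

    @Override
    public String getAlgorithmName() {
        // Same shape as MD5MD5CRC32FileChecksum's name, but the CRC
        // variant is appended instead of a hard-coded "CRC32".
        return "MD5-of-" + crcPerBlock + "MD5-of-" + bytesPerCRC + crcTypeName;
    }

    @Override
    public int getLength() { return md5.length; }

    @Override
    public byte[] getBytes() { return md5; }
}
```

With this shape, the WebHDFS JSON schema stays unchanged: only the value of
the "algorithm" field differs between CRC types.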
> Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
> --------------------------------------------------------------------------
>
> Key: HADOOP-8239
> URL: https://issues.apache.org/jira/browse/HADOOP-8239
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Fix For: 2.1.0-alpha
>
> Attachments: hadoop-8239-after-hadoop-8240.patch.txt,
> hadoop-8239-after-hadoop-8240.patch.txt,
> hadoop-8239-before-hadoop-8240.patch.txt,
> hadoop-8239-before-hadoop-8240.patch.txt
>
>
> In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended
> to carry the information on the actual checksum type being used. The
> interoperability between the extended version and branch-1 should be
> guaranteed when Filesystem.getFileChecksum() is called over hftp, webhdfs or
> httpfs.