[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436534#comment-13436534 ]
Tsz Wo (Nicholas), SZE commented on HADOOP-8239:
------------------------------------------------

- We cannot use EOFException in readFields(..) since it won't work if the serialized bytes are not at the end of the stream. How about we add a magic number to the new serialization so that the reader can distinguish the new and old serializations? For example,
{code}
//MD5MD5CRC32FileChecksum.java
  final int MAGIC = 0xABCDEF12;

  public void readFields(DataInput in) throws IOException {
    final int firstInt = in.readInt();
    if (firstInt != MAGIC) {
      //old serialization: crcType is CRC32, firstInt is bytesPerCRC
      crcType = DataChecksum.Type.CRC32;
      bytesPerCRC = firstInt;
      crcPerBlock = in.readLong();
      md5 = MD5Hash.read(in);
    } else {
      //new serialization includes crcType
      crcType = DataChecksum.Type.valueOf(in.readInt());
      bytesPerCRC = in.readInt();
      crcPerBlock = in.readLong();
      md5 = MD5Hash.read(in);
    }
  }

  public void write(DataOutput out) throws IOException {
    if (crcType != DataChecksum.Type.CRC32) {
      //for non-CRC32 type, write magic and crcType
      out.writeInt(MAGIC);
      out.writeInt(crcType.id);
    }
    out.writeInt(bytesPerCRC);
    out.writeLong(crcPerBlock);
    md5.write(out);
  }
{code}
If we decide to do it, we should do the same for XML and JSON.

- Need to update org.apache.hadoop.hdfs.web.JsonUtil. I guess you may already have a plan to do it separately.

> Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-8239
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8239
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>             Fix For: 2.1.0-alpha
>
>         Attachments: hadoop-8239-after-hadoop-8240.patch.txt,
>                      hadoop-8239-before-hadoop-8240.patch.txt
>
>
> In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended
> to carry the information on the actual checksum type being used.
> The interoperability between the extended version and branch-1 should be
> guaranteed when Filesystem.getFileChecksum() is called over hftp, webhdfs or
> httpfs.
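
The magic-number scheme proposed in the comment can be tried out without a Hadoop build. The following is only a rough sketch under stated assumptions: MagicChecksumSketch is a hypothetical stand-in for MD5MD5CRC32FileChecksum, the checksum type is reduced to an int id (the CRC32 and CRC32C values here are illustrative), and a plain 16-byte array stands in for MD5Hash. It writes an old-style record, a new-style record, and a trailing int back to back, then reads all three in order, which is the mid-stream case where relying on EOFException would not work.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for MD5MD5CRC32FileChecksum; not the Hadoop class. */
class MagicChecksumSketch {
  static final int MAGIC = 0xABCDEF12; // value taken from the proposal
  static final int CRC32 = 1;          // illustrative type ids (assumption)
  static final int CRC32C = 2;

  int crcType = CRC32;
  int bytesPerCRC;
  long crcPerBlock;
  byte[] md5 = new byte[16];           // stand-in for MD5Hash

  MagicChecksumSketch() {}

  MagicChecksumSketch(int crcType, int bytesPerCRC, long crcPerBlock, byte[] md5) {
    this.crcType = crcType;
    this.bytesPerCRC = bytesPerCRC;
    this.crcPerBlock = crcPerBlock;
    this.md5 = md5.clone();
  }

  void write(DataOutput out) throws IOException {
    if (crcType != CRC32) {
      // non-CRC32 type: prefix the record with the magic number and the type id
      out.writeInt(MAGIC);
      out.writeInt(crcType);
    }
    out.writeInt(bytesPerCRC);
    out.writeLong(crcPerBlock);
    out.write(md5);
  }

  void readFields(DataInput in) throws IOException {
    int firstInt = in.readInt();
    if (firstInt != MAGIC) {
      // old serialization: type is implicitly CRC32, firstInt is bytesPerCRC
      crcType = CRC32;
      bytesPerCRC = firstInt;
    } else {
      // new serialization: type id follows the magic number
      crcType = in.readInt();
      bytesPerCRC = in.readInt();
    }
    crcPerBlock = in.readLong();
    in.readFully(md5);
  }

  /** Serializes one record to a byte array. */
  static byte[] toBytes(MagicChecksumSketch c) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      c.write(new DataOutputStream(buf));
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Deserializes one record from the start of a byte array. */
  static MagicChecksumSketch fromBytes(byte[] bytes) {
    try {
      MagicChecksumSketch c = new MagicChecksumSketch();
      c.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
      return c;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] md5 = new byte[16];
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    new MagicChecksumSketch(CRC32, 512, 128L, md5).write(out);  // old layout
    new MagicChecksumSketch(CRC32C, 512, 128L, md5).write(out); // new layout
    out.writeInt(42); // trailing data, so neither record ends the stream

    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
    MagicChecksumSketch a = new MagicChecksumSketch();
    a.readFields(in);
    MagicChecksumSketch b = new MagicChecksumSketch();
    b.readFields(in);
    System.out.println(a.crcType + " " + b.crcType + " " + in.readInt());
  }
}
```

In this sketch a CRC32 record keeps the old 28-byte layout, so an older reader could still parse it, while a non-CRC32 record grows by the 8 bytes of magic number and type id. The scheme relies on bytesPerCRC never being equal to MAGIC; since 0xABCDEF12 is negative as a Java int, a positive bytesPerCRC cannot collide with it.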