[ https://issues.apache.org/jira/browse/HADOOP-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436534#comment-13436534 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-8239:
------------------------------------------------

- We cannot use EOFException in readFields(..) since it won't work when the 
serialized bytes are not at the end of the stream.  How about we add a magic 
number to the new serialization so that readers can distinguish the new and old 
serializations?  For example,
{code}
//MD5MD5CRC32FileChecksum.java
  static final int MAGIC = 0xABCDEF12;

  public void readFields(DataInput in) throws IOException {
    final int firstInt = in.readInt();
    if (firstInt != MAGIC) {
      //old serialization: crcType is CRC32, firstInt is bytesPerCRC 
      crcType = DataChecksum.Type.CRC32;
      bytesPerCRC = firstInt;
      crcPerBlock = in.readLong();
      md5 = MD5Hash.read(in);
    } else {
      //new serialization includes crcType
      crcType = DataChecksum.Type.valueOf(in.readInt());
      bytesPerCRC = in.readInt();
      crcPerBlock = in.readLong();
      md5 = MD5Hash.read(in);
    }
  }

  public void write(DataOutput out) throws IOException {
    if (crcType != DataChecksum.Type.CRC32) {
      //for non-CRC32 type, write magic and crcType
      out.writeInt(MAGIC);
      out.writeInt(crcType.id);
    }
    out.writeInt(bytesPerCRC);
    out.writeLong(crcPerBlock);
    md5.write(out);
  }
{code}
If we decide to do it, we should do the same for XML and JSON.
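The scheme above can be exercised end-to-end with a standalone sketch. This is a hypothetical simplified class (placeholder type ids, md5 omitted for brevity), not the actual MD5MD5CRC32FileChecksum; it only demonstrates that the magic number lets a reader accept both wire formats. Note the implicit assumption: an old-format bytesPerCRC must never equal 0xABCDEF12, which is safe in practice since bytesPerCRC is small.

```java
import java.io.*;

// Hypothetical simplified sketch of the magic-number versioning scheme.
public class ChecksumWireFormat {
    static final int MAGIC = 0xABCDEF12;
    static final int CRC32_ID = 1;     // placeholder type id for this sketch

    int crcTypeId;
    int bytesPerCRC;
    long crcPerBlock;

    void write(DataOutput out) throws IOException {
        if (crcTypeId != CRC32_ID) {   // non-CRC32: prefix magic and type id
            out.writeInt(MAGIC);
            out.writeInt(crcTypeId);
        }
        out.writeInt(bytesPerCRC);
        out.writeLong(crcPerBlock);
    }

    void readFields(DataInput in) throws IOException {
        final int first = in.readInt();
        if (first != MAGIC) {          // old format: first int is bytesPerCRC
            crcTypeId = CRC32_ID;
            bytesPerCRC = first;
        } else {                       // new format: type id, then bytesPerCRC
            crcTypeId = in.readInt();
            bytesPerCRC = in.readInt();
        }
        crcPerBlock = in.readLong();
    }

    static ChecksumWireFormat roundTrip(ChecksumWireFormat src) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        src.write(new DataOutputStream(buf));
        ChecksumWireFormat dst = new ChecksumWireFormat();
        dst.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        return dst;
    }

    public static void main(String[] args) throws IOException {
        // new format: a non-CRC32 type carries the magic prefix
        ChecksumWireFormat a = new ChecksumWireFormat();
        a.crcTypeId = 2; a.bytesPerCRC = 512; a.crcPerBlock = 128;
        ChecksumWireFormat b = roundTrip(a);
        System.out.println("new: " + b.crcTypeId + " " + b.bytesPerCRC + " " + b.crcPerBlock);

        // old format: CRC32 writes no magic, yet still reads back correctly
        ChecksumWireFormat c = new ChecksumWireFormat();
        c.crcTypeId = CRC32_ID; c.bytesPerCRC = 512; c.crcPerBlock = 128;
        ChecksumWireFormat d = roundTrip(c);
        System.out.println("old: " + d.crcTypeId + " " + d.bytesPerCRC + " " + d.crcPerBlock);
    }
}
```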


- Need to update org.apache.hadoop.hdfs.web.JsonUtil.  I guess you may already 
have a plan to do it separately.
                
> Extend MD5MD5CRC32FileChecksum to show the actual checksum type being used
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-8239
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8239
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>             Fix For: 2.1.0-alpha
>
>         Attachments: hadoop-8239-after-hadoop-8240.patch.txt, 
> hadoop-8239-before-hadoop-8240.patch.txt
>
>
> In order to support HADOOP-8060, MD5MD5CRC32FileChecksum needs to be extended 
> to carry the information on the actual checksum type being used. The 
> interoperability between the extended version and branch-1 should be 
> guaranteed when FileSystem.getFileChecksum() is called over hftp, webhdfs or 
> httpfs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
