[ 
https://issues.apache.org/jira/browse/HDFS-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554512#comment-13554512
 ] 

Suresh Srinivas commented on HDFS-4403:
---------------------------------------

bq. old client, new server: new server sets type. old client ignores it. Again, 
can't go back in time to change the old client, but nothing breaks.
If the new server sets crcType, the old client does not ignore it; it uses 
the crcType sent by the server. If we can ensure that the new server always 
sets crcType, there is no issue.

Let me try to explain it better. Maybe I misunderstood the following comment: 
bq. Any new server always sets the checksum type explicitly, regardless of what 
type it is. It's only the old (pre-HDFS-3177) servers that wouldn't set one.
How do you ensure that a new server always sets the checksum type explicitly? 
What prevents a future (incorrect) change (say in 2.0.3) in which the server 
stops setting crcType because it is an optional field? In that case the old 
client (which has a default value in its proto definition) uses the default 
CRC type.
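
To make the concern concrete, here is a minimal sketch of how an optional 
proto2 field with a declared default behaves on the reading side. The class 
and method names are hypothetical stand-ins, not the generated Hadoop classes; 
the only behavior assumed is the standard proto2 one, where the getter returns 
the declared default when the field was never set.

```java
/** Hypothetical stand-in for a generated protobuf message; not the real Hadoop class. */
public class OptionalCrcTypeSketch {
    enum ChecksumType { CRC32, CRC32C }

    /** Models an optional proto2 field declared with a default of CRC32. */
    static final class ChecksumResponse {
        private final ChecksumType crcType; // null means the server did not set the field

        ChecksumResponse(ChecksumType crcType) {
            this.crcType = crcType;
        }

        boolean hasCrcType() {
            return crcType != null;
        }

        /** Like a proto2 getter: returns the declared default when the field is unset. */
        ChecksumType getCrcType() {
            return crcType != null ? crcType : ChecksumType.CRC32;
        }
    }

    /** What an old client effectively does: trusts the getter without checking hasCrcType(). */
    static ChecksumType oldClientView(ChecksumResponse response) {
        return response.getCrcType();
    }

    public static void main(String[] args) {
        // A server that stopped setting the field: the old client cannot tell
        // this apart from a server that explicitly sent CRC32.
        ChecksumResponse unset = new ChecksumResponse(null);
        System.out.println(unset.hasCrcType() + " " + oldClientView(unset));
    }
}
```

Because the old client never checks whether the field was set, a server that 
omits it is indistinguishable from one that explicitly sent the default.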

I believe the correct change is to make the crcType field required. That way:
- old/new client talks to new server: the new server is always forced to set 
the *required* field, and both old and new clients use the crcType.
- new client talks to old server: the new client infers the checksum type by 
reading the first byte.
- old client and old server: cannot be solved, as you described.
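
The new-client side of these cases can be sketched roughly as follows. This 
is only an illustration: the names are hypothetical, an unset crcType is 
modelled as null, and the "read the first byte" step is passed in as a 
callback rather than being a real OP_READ_BLOCK call.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
public class ChecksumTypeResolutionSketch {
    enum ChecksumType { CRC32, CRC32C }

    /**
     * @param fromServer     crcType carried in the response, or null if the server left it unset
     * @param firstByteProbe stand-in for reading the first byte of the block and
     *                       taking the checksum type from that response
     */
    static ChecksumType resolve(ChecksumType fromServer, Supplier<ChecksumType> firstByteProbe) {
        if (fromServer != null) {
            return fromServer;       // new server: use what it sent
        }
        return firstByteProbe.get(); // old server: infer instead of assuming a default
    }

    public static void main(String[] args) {
        // New server that set the field: the probe is never consulted.
        System.out.println(resolve(ChecksumType.CRC32C, () -> {
            throw new IllegalStateException("probe should not run");
        }));
        // Old server that left the field unset: fall back to the probe.
        System.out.println(resolve(null, () -> ChecksumType.CRC32C));
    }
}
```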

Making crcType required is not incompatible, because the old client already 
used either the default value or the explicitly set value.


                
> DFSClient can infer checksum type when not provided by reading first byte
> -------------------------------------------------------------------------
>
>                 Key: HDFS-4403
>                 URL: https://issues.apache.org/jira/browse/HDFS-4403
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.0.2-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: hdfs-4403.txt
>
>
> HDFS-3177 added the checksum type to OpBlockChecksumResponseProto, but the 
> new protobuf field is optional, with a default of CRC32. This means that this 
> API, when used against an older cluster (like earlier 0.23 releases) will 
> falsely return CRC32 even if that cluster has written files with CRC32C. This 
> can cause issues for distcp, for example.
> Instead of defaulting the protobuf field to CRC32, we can leave it with no 
> default, and if the OpBlockChecksumResponseProto has no checksum type set, 
> the client can send OP_READ_BLOCK to read the first byte of the block, then 
> grab the checksum type out of that response (which has always been present)

