[ https://issues.apache.org/jira/browse/HADOOP-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488739 ]

Doug Cutting commented on HADOOP-1259:
--------------------------------------

> I think the checksum subsystem in dfs should be able to deal with arbitrary 
> block sizes.

This will complicate the implementation, since checksums will not be aligned 
with block boundaries.  But we might need to support that anyway when we add a 
file-append feature.  If appends work by simply adding new blocks and not 
changing existing blocks, then the final checksum of any block might cover 
fewer bytes than bytesPerChecksum (as the last block's final checksum does 
today).

So, if we decide to support this (arbitrary block sizes and bytesPerChecksum in 
HDFS), then I suggest we implement it this way: the final checksum in each 
block is permitted to cover fewer bytes than bytesPerChecksum.

Note that this will not help us with upgrading non-aligned checksums 
(http://issues.apache.org/jira/browse/HADOOP-1134#action_12488542).


> DFS should enforce block size is a multiple of io.bytes.per.checksum 
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-1259
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1259
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Raghu Angadi
>
> DFSClient currently does not enforce that dfs.block.size is a multiple of 
> io.bytes.per.checksum. This is not really a problem currently, but it can 
> affect future upgrades like HADOOP-1134 (see one of the comments there: 
> http://issues.apache.org/jira/browse/HADOOP-1134#action_12488542). 
> I propose that DFSClient should fail loudly and ask the user politely to 
> change the config to meet this condition. Of course, we will also change 
> the documentation for dfs.block.size.
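
A minimal sketch of the proposed check, in illustrative Java: the two 
configuration keys are the real ones, but the class, method, and default 
values shown here are only placeholders, not the actual DFSClient code.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class BlockSizeCheck {

    /** Fail loudly if dfs.block.size is not a multiple of
     *  io.bytes.per.checksum, telling the user what to change. */
    static void checkBlockSize(Configuration conf) throws IOException {
        long blockSize = conf.getLong("dfs.block.size", 64L * 1024 * 1024);
        int bytesPerChecksum = conf.getInt("io.bytes.per.checksum", 512);
        if (blockSize % bytesPerChecksum != 0) {
            throw new IOException("dfs.block.size (" + blockSize
                + ") must be a multiple of io.bytes.per.checksum ("
                + bytesPerChecksum + "); please adjust the configuration.");
        }
    }
}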
