[ 
https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105733#comment-14105733
 ] 

Colin Patrick McCabe commented on HDFS-3689:
--------------------------------------------

bq. One thought is that we could identify runs of zeros fairly easily by 
looking at the checksums: an all-zero checksum chunk has a constant crc32 which 
we can compare against in a single instruction. The DN could relatively easily 
loop through the checksums of an incoming data packet, check whether the packet 
is all zeros, and if so, turn it into a sparse write.

Interesting idea.  This would let us handle long stretches of zeros 
automatically by creating sparse block files on the datanode.  Of course, we 
would have to verify that a matching checksum really did come from an all-zero 
chunk, rather than from an unlikely CRC collision.  I wonder if we could create 
sparse files without any new APIs this way...
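
To make the idea concrete, here is a rough sketch in Python rather than 
datanode code.  It assumes 512-byte checksum chunks and plain CRC32 (via 
zlib); the names zero_chunk_flags, zero_runs and write_sparse are made up for 
illustration and do not exist in HDFS.  The checksum comparison is only a 
cheap filter, and the chunk bytes are still compared before a chunk is treated 
as zeros, so a CRC collision cannot cause data loss.

```python
import os
import zlib

BYTES_PER_CHECKSUM = 512                 # assumed checksum chunk size
ZERO_CHUNK = bytes(BYTES_PER_CHECKSUM)   # one chunk of zero bytes
ZERO_CHUNK_CRC = zlib.crc32(ZERO_CHUNK)  # constant CRC32 of an all-zero chunk


def zero_chunk_flags(data, checksums):
    """Return one bool per checksum: True if that chunk is verified all-zero."""
    flags = []
    for i, crc in enumerate(checksums):
        chunk = data[i * BYTES_PER_CHECKSUM:(i + 1) * BYTES_PER_CHECKSUM]
        # Cheap filter: a single integer comparison against the constant.
        candidate = crc == ZERO_CHUNK_CRC
        # Confirmation: only candidates pay for a byte comparison, which
        # rules out a collision (and rejects a short trailing chunk).
        flags.append(candidate and chunk == ZERO_CHUNK)
    return flags


def zero_runs(flags):
    """Collapse per-chunk flags into (first_chunk, chunk_count) zero runs."""
    runs = []
    start = None
    for i, is_zero in enumerate(flags):
        if is_zero and start is None:
            start = i
        elif not is_zero and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(flags) - start))
    return runs


def write_sparse(f, data, flags):
    """Write data to binary file f, seeking over verified zero chunks."""
    for i, is_zero in enumerate(flags):
        chunk = data[i * BYTES_PER_CHECKSUM:(i + 1) * BYTES_PER_CHECKSUM]
        if is_zero:
            # Skipping forward leaves a hole on filesystems that support them.
            f.seek(len(chunk), os.SEEK_CUR)
        else:
            f.write(chunk)
    # Extend the file to the final offset in case it ends with a hole.
    f.truncate(f.tell())
```

Whether the skipped ranges actually become holes on disk depends on the local 
filesystem; the file reads back identically either way.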

> Add support for variable length block
> -------------------------------------
>
>                 Key: HDFS-3689
>                 URL: https://issues.apache.org/jira/browse/HDFS-3689
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs-client, namenode
>    Affects Versions: 3.0.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch
>
>
> Currently HDFS supports only fixed-length blocks. Supporting variable-length 
> blocks will allow new use cases and features to be built on top of HDFS. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)