[
https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105733#comment-14105733
]
Colin Patrick McCabe commented on HDFS-3689:
--------------------------------------------
bq. One thought is that we could identify runs of zeros fairly easily by
looking at the checksums: an all-zero checksum chunk has a constant crc32 which
we can compare for in a single instruction. The DN could relatively easily loop
through the checksums of an incoming data packet, and verify whether it is all
zeros, and if so, turn it into a sparse write.
Interesting idea. This would allow us to automatically deal with long
stretches of zeros by creating sparse block files on the datanode. Of course,
we would have to check that a matching checksum really did come from an
all-zero chunk, rather than from an unlikely CRC collision with non-zero data.
I wonder if we could create sparse files without any new APIs this way...
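
To make the two-step check concrete, here is a rough sketch of what I have in
mind. It is not taken from any patch: the class and method names are
hypothetical, the 512-byte chunk size is an assumption (it matches the usual
bytes-per-checksum default), and it uses java.util.zip.CRC32 purely for
illustration, whereas a real datanode would go through whatever checksum type
the block was written with. The checksum comparison is the cheap filter, and
the byte scan only runs on chunks that pass it.

```java
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only; not part of HDFS.
final class ZeroChunkDetector {
    // Assumed checksum chunk size in bytes.
    static final int CHUNK_SIZE = 512;

    // CRC32 of CHUNK_SIZE zero bytes, computed once at class load.
    static final long ZERO_CHUNK_CRC = crcOf(new byte[CHUNK_SIZE], 0, CHUNK_SIZE);

    static long crcOf(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(buf, off, len);
        return crc.getValue();
    }

    // True only if the checksum matches the all-zero constant AND the
    // chunk's bytes really are all zero (guards against a CRC collision).
    static boolean isZeroChunk(byte[] data, int off, long checksum) {
        if (checksum != ZERO_CHUNK_CRC) {
            return false; // cheap rejection for the common non-zero case
        }
        for (int i = off; i < off + CHUNK_SIZE; i++) {
            if (data[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // Counts how many chunks at the start of a packet are all zero.
    // Assumes packet.length >= checksums.length * CHUNK_SIZE.
    static int leadingZeroChunks(byte[] packet, long[] checksums) {
        int n = 0;
        while (n < checksums.length
                && isZeroChunk(packet, n * CHUNK_SIZE, checksums[n])) {
            n++;
        }
        return n;
    }
}
```

A run of chunks that passes this check could then be skipped with a seek
rather than written out, which on most local filesystems leaves a hole in the
block file. How the matching checksum entries are handled in the meta file is
a separate question that this sketch does not address.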
> Add support for variable length block
> -------------------------------------
>
> Key: HDFS-3689
> URL: https://issues.apache.org/jira/browse/HDFS-3689
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, hdfs-client, namenode
> Affects Versions: 3.0.0
> Reporter: Suresh Srinivas
> Assignee: Suresh Srinivas
> Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch
>
>
> Currently HDFS supports fixed length blocks. Supporting variable length block
> will allow new use cases and features to be built on top of HDFS.