[jira] [Commented] (HDFS-3689) Add support for variable length block

Owen O'Malley (JIRA) Fri, 22 Aug 2014 11:04:19 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107194#comment-14107194
 ]


Owen O'Malley commented on HDFS-3689:
-------------------------------------

One follow-up: fixing MapReduce to use the actual block boundaries, rather 
than dividing the file into fixed-size splits, would not be difficult and 
would make the generated file splits for ORC and other block-compressed 
files much better. 
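
To make the difference concrete, here is a minimal sketch in Python (not 
MapReduce code) comparing the two strategies. The function names, the block 
lengths, and the 128-byte split size are all hypothetical and chosen only for 
illustration.

```python
from typing import List, Tuple

Split = Tuple[int, int]  # (offset, length)


def fixed_size_splits(file_length: int, split_size: int) -> List[Split]:
    """Divide [0, file_length) into splits of a fixed size, ignoring blocks."""
    return [
        (offset, min(split_size, file_length - offset))
        for offset in range(0, file_length, split_size)
    ]


def block_aligned_splits(block_lengths: List[int]) -> List[Split]:
    """Emit one split per block, using the actual (variable) block lengths."""
    splits = []
    offset = 0
    for length in block_lengths:
        if length > 0:
            splits.append((offset, length))
        offset += length
    return splits


# Hypothetical file made of three variable-length blocks.
blocks = [100, 60, 130]
fixed = fixed_size_splits(sum(blocks), 128)
aligned = block_aligned_splits(blocks)
```

With these inputs the fixed-size splits cross block boundaries at offsets 128 
and 256, while each block-aligned split covers exactly one block.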

Furthermore, note that we could remove the need for lzo and zlib index files 
for text files by having TextOutputFormat cut the block at a line boundary and 
flush the compression codec. TextInputFormat could then divide the file at 
block boundaries, and each split would align with both a compression chunk 
boundary and a line break. That would be *great*.
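
The idea can be sketched in Python with the standard zlib module. This is not 
TextOutputFormat or HDFS code: `write_blocks`, `read_block`, and the 
`max_raw_bytes` threshold are hypothetical names, and closing a block on 
uncompressed size is a simplifying assumption. Each block ends on a line break 
and is a complete compressed stream, so a reader can start at any block 
boundary without an index file.

```python
import zlib
from typing import Iterable, List


def write_blocks(lines: Iterable[bytes], max_raw_bytes: int) -> List[bytes]:
    """Group whole lines into blocks and compress each block independently.

    A block is closed only after a complete line, and closing it finishes the
    compressed stream, so no line or compression chunk spans two blocks.
    """
    blocks = []
    pending = []
    pending_size = 0
    for line in lines:
        pending.append(line)
        pending_size += len(line)
        if pending_size >= max_raw_bytes:
            blocks.append(zlib.compress(b"".join(pending)))
            pending, pending_size = [], 0
    if pending:
        blocks.append(zlib.compress(b"".join(pending)))
    return blocks


def read_block(block: bytes) -> List[bytes]:
    """Decompress a single block on its own and return its lines."""
    return zlib.decompress(block).splitlines(keepends=True)


lines = [b"alpha\n", b"beta\n", b"gamma\n", b"delta\n"]
blocks = write_blocks(lines, max_raw_bytes=10)
second = read_block(blocks[1])  # readable without touching blocks[0]
```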

> Add support for variable length block
> -------------------------------------
>
>                 Key: HDFS-3689
>                 URL: https://issues.apache.org/jira/browse/HDFS-3689
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, hdfs-client, namenode
>    Affects Versions: 3.0.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch
>
>
> Currently HDFS supports fixed length blocks. Supporting variable length blocks 
> will allow new use cases and features to be built on top of HDFS. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
