[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178044#comment-14178044
 ] 

Plamen Jeliazkov commented on HDFS-3107:
----------------------------------------

[~srivas], 
There is no plan to grow the file by padding it with zeroes, as a 
general-purpose POSIX truncate does when the new length exceeds the current 
file size. Both [~shv] and [~lei_chang] mentioned this in their design docs, 
I believe.
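
To make the intended semantics concrete, here is a minimal sketch (assuming 
the FileSystem.truncate(Path, long) shape from the design docs; the path and 
exception handling are illustrative only):

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TruncateSemanticsSketch {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/truncate-demo");        // illustrative path
    long oldLength = fs.getFileStatus(file).getLen();

    // Growing is NOT supported: unlike POSIX ftruncate(2), a new length
    // past EOF is rejected by the NameNode instead of zero-padded.
    try {
      fs.truncate(file, oldLength + 1);
    } catch (IOException expected) {
      // no zero-fill; the call simply fails
    }

    // Shrinking is the supported direction. The boolean says whether the
    // truncate completed immediately (new length on a block boundary) or
    // block recovery is still running on the last block.
    boolean done = fs.truncate(file, oldLength / 2);
    System.out.println("completed synchronously: " + done);
  }
}
{code}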

[~cmccabe],
While copying the last block up to its truncate point and doing a 
delete/concat is definitely a simpler overall approach, the full truncate 
implementation has the benefit of being a single NameNode RPC call that 
supports both truncate-in-place and copy-on-truncate, preserving the original 
last block and moving the 'copy & truncate' work to the DataNodes themselves 
(as opposed to having to pass data through the network / client). I am not 
intending to debate either implementation -- I like both personally; I just 
wanted to explain as briefly as I could why Konstantin and I are taking our 
approach.
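
For contrast, the brute-force emulation applications are stuck with today 
looks roughly like the hypothetical helper below (buffer size, temp-file 
naming, and the non-atomic swap are all illustrative). Every retained byte 
streams DataNode -> client -> DataNode, which is exactly the data movement a 
NameNode-side truncate avoids:

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ClientSideTruncate {
  /** Hypothetical helper: emulate truncate by rewriting the file prefix. */
  static void truncateByRewrite(FileSystem fs, Path src, long newLength)
      throws IOException {
    Path tmp = new Path(src.getParent(), src.getName() + ".tmp"); // illustrative
    try (FSDataInputStream in = fs.open(src);
         FSDataOutputStream out = fs.create(tmp, true /* overwrite */)) {
      byte[] buf = new byte[64 * 1024];
      long remaining = newLength;
      while (remaining > 0) {
        int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
        if (n < 0) {
          break;                 // file was shorter than newLength
        }
        out.write(buf, 0, n);
        remaining -= n;
      }
    }
    // Non-atomic swap: concurrent readers can observe the rename.
    fs.delete(src, false);
    fs.rename(tmp, src);
  }
}
{code}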

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.008.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> editsStored, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. HDFS currently does not 
> support truncate (a standard POSIX operation), the reverse operation of 
> append, so upper-layer applications must resort to ugly workarounds (such as 
> keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
