[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232725#comment-13232725 ]
Milind Bhandarkar commented on HDFS-3107: ----------------------------------------- This will be a great addition to HDFS for a couple of reasons: 1. Having an append without a truncate is a serious deficiency. 2. If a user mistakenly starts to append data to an existing large file, and discovers the mistake, the only recourse is to recreate that file, by rewriting the contents. This is very inefficient. > HDFS truncate > ------------- > > Key: HDFS-3107 > URL: https://issues.apache.org/jira/browse/HDFS-3107 > Project: Hadoop HDFS > Issue Type: New Feature > Components: data-node, name-node > Reporter: Lei Chang > Attachments: HDFS_truncate_semantics_Mar15.pdf > > Original Estimate: 1,344h > Remaining Estimate: 1,344h > > Systems with transaction support often need to undo changes made to the > underlying storage when a transaction is aborted. Currently HDFS does not > support truncate (a standard Posix operation) which is a reverse operation of > append, which makes upper layer applications use ugly workarounds (such as > keeping track of the discarded byte range per file in a separate metadata > store, and periodically running a vacuum process to rewrite compacted files) > to overcome this limitation of HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira