[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127371#comment-14127371 ]
Konstantin Shvachko commented on HDFS-3107:
-------------------------------------------

I see your point. I'll let Plamen speak about the current state of the art. Let's talk about how it should be.
# The documentation on snapshots explicitly states "there is no data copying". So maybe copy-on-write is not desirable here, although appealing.
# Another way is not to remove data during truncate if the file is in a snapshot. Just reduce the length, and deal with block removal / truncation when the snapshot is removed. Sort of symmetrical to append.
# The simplest is to disallow truncate on files that are in a snapshot, as you indicated.

Maybe we should do this first and add one of the above when a use case emerges?

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the
> underlying storage when a transaction is aborted. Currently HDFS does not
> support truncate (a standard Posix operation) which is a reverse operation of
> append, which makes upper layer applications use ugly workarounds (such as
> keeping track of the discarded byte range per file in a separate metadata
> store, and periodically running a vacuum process to rewrite compacted files)
> to overcome this limitation of HDFS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)