[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko reassigned HDFS-3107:
-----------------------------------------

    Assignee: Plamen Jeliazkov

Nicholas in [his comment|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=13235941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13235941] proposed three approaches to implementing truncate. Here is another one, which was mentioned in [this comment|https://issues.apache.org/jira/browse/HDFS-6087?focusedCommentId=13948814&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13948814] of HDFS-6087.

Conceptually, truncate removes all full blocks beyond the new length and then starts a recovery process for the last block, which is only partially truncated. Truncate recovery is similar to lease recovery: the NN sends a truncate-DatanodeCommand to one of the DNs containing replicas of the block, the primary DN synchronizes the new length between the replicas, and then sends commitBlockSynchronization() to the NN, which completes the truncate. Truncate works only on closed files; if the file is open for write, an attempt to truncate it fails.

Here are the truncate steps in more detail:
- NN receives a truncate(src, newLength) call from a client.
- Full blocks are deleted instantaneously, and if there is nothing more to truncate the NN returns success to the client.
- If newLength is not on a block boundary, the NN converts the INode to an INodeUnderConstruction and sets the file length to newLength.
- The last block's state is set to BEING_TRUNCATED.
- The truncate operation is persisted in the editLog.
- NN triggers last-block length recovery by sending a DatanodeCommand and waits for the DN to report back.
- The file remains UNDER_RECOVERY until the recovery completes.
- Lease expiration (soft or hard) will trigger last-block recovery for truncate.
- If the NN restarts, it will restart the recovery.

Assigning to Plamen, as he seems to be almost ready with the patch.
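The block-level decision described above (delete full blocks instantly; schedule recovery only when newLength falls inside a block) can be sketched with a toy in-memory model. This is illustrative only, not HDFS code: the class name, the block-length list, and the boolean return (true = completed on a block boundary, false = last-block recovery must be scheduled) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class TruncateSketch {
    // Toy model of the NN-side truncate decision. Mutates blockLengths to the
    // post-truncate block list. Returns true if truncate completes instantly
    // (newLength is on a block boundary), false if the last block is partial
    // and would be marked BEING_TRUNCATED pending recovery.
    static boolean truncate(List<Long> blockLengths, long newLength) {
        long remaining = newLength;
        List<Long> kept = new ArrayList<>();
        for (long len : blockLengths) {
            if (remaining == 0) {
                break;                       // block lies wholly beyond newLength: delete it
            }
            if (remaining >= len) {
                kept.add(len);               // block fully retained
                remaining -= len;
            } else {
                kept.add(remaining);         // partial block: shrink, needs recovery
                remaining = 0;
                blockLengths.clear();
                blockLengths.addAll(kept);
                return false;
            }
        }
        blockLengths.clear();
        blockLengths.addAll(kept);
        return true;
    }

    public static void main(String[] args) {
        List<Long> blocks = new ArrayList<>(List.of(128L, 128L, 64L));
        // 200 falls inside the second block, so recovery is needed
        boolean done = truncate(blocks, 200);
        System.out.println(done + " " + blocks);  // false [128, 72]
    }
}
```

In the real design the false branch corresponds to converting the INode to an INodeUnderConstruction and triggering DN-side recovery, while the true branch lets the NN return success to the client immediately.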
> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), which is the reverse operation of append. This forces upper-layer applications to use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)