[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko reassigned HDFS-3107:
-----------------------------------------

    Assignee: Plamen Jeliazkov

Nicholas in [his 
comment|https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=13235941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13235941]
 proposed three approaches to implement truncate. Here is another one, which 
was mentioned in [this 
comment|https://issues.apache.org/jira/browse/HDFS-6087?focusedCommentId=13948814&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13948814]
 of HDFS-6087.
Conceptually, truncate removes all full blocks beyond the new length and then 
starts a recovery process for the last block, which is only partially 
truncated. Truncate recovery is similar to lease recovery: the NN sends a 
truncate-DatanodeCommand to one of the DNs containing replicas of the block. 
The primary DN synchronizes the new length between the replicas, and then 
sends commitBlockSynchronization() to the NN, which completes the truncate.
Truncate works only for closed files. If the file is open for write, an 
attempt to truncate it fails.
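The client-visible contract described above can be sketched as follows. This is a minimal illustration only: MockFile, the toy 128-byte block size, and the boolean return convention (true when truncate completes immediately on a block boundary, false when last-block recovery is needed) are assumptions of this sketch, not actual HDFS classes or a finalized API.

```java
// Hypothetical sketch of the truncate contract described above.
// MockFile and the 128-byte block size are illustrative, not HDFS code.
public class TruncateContract {
    static final long BLOCK_SIZE = 128; // toy block size for illustration

    static class MockFile {
        long length;
        boolean openForWrite;
        MockFile(long length, boolean openForWrite) {
            this.length = length;
            this.openForWrite = openForWrite;
        }
    }

    // Returns true if the truncate completes immediately (newLength falls on
    // a block boundary, so only whole blocks are deleted); returns false if
    // the last block must first go through truncate recovery.
    static boolean truncate(MockFile file, long newLength) {
        if (file.openForWrite) {
            // Truncate works only for closed files.
            throw new IllegalStateException("file is open for write");
        }
        if (newLength > file.length) {
            throw new IllegalArgumentException("cannot truncate beyond EOF");
        }
        file.length = newLength;
        return newLength % BLOCK_SIZE == 0;
    }
}
```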

Here are the truncate steps in more detail:
- NN receives a truncate(src, newLength) call from a client.
- Full blocks are deleted immediately. If there is nothing more to truncate, 
the NN returns success to the client.
- If newLength is not on a block boundary, the NN converts the INode to an 
INodeUnderConstruction and sets the file length to newLength.
- The last block's state is set to BEING_TRUNCATED.
- The truncate operation is persisted in the editLog.
- The NN triggers last-block length recovery by sending a DatanodeCommand and 
waits for the DN to report back.
- The file remains UNDER_RECOVERY until the recovery completes.
- Lease expiration (soft or hard) triggers last-block recovery for truncate.
- If the NN restarts, it restarts the recovery.
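The block arithmetic behind the steps above can be sketched like this. It is illustrative only: TruncatePlan is a made-up name, and modeling a file as a plain list of block sizes is an assumption of the sketch, not actual NameNode code.

```java
import java.util.List;

// Hypothetical sketch of the NN-side decision from the steps above: given the
// file's block sizes and newLength, decide which full blocks survive and what
// length the last block must be recovered to. Not actual NameNode code.
public class TruncatePlan {
    final int fullBlocksKept;   // blocks retained unchanged
    final long lastBlockLength; // 0 if newLength is on a block boundary
    final boolean needsRecovery;

    TruncatePlan(int fullBlocksKept, long lastBlockLength) {
        this.fullBlocksKept = fullBlocksKept;
        this.lastBlockLength = lastBlockLength;
        // A partial last block takes the BEING_TRUNCATED recovery path.
        this.needsRecovery = lastBlockLength > 0;
    }

    static TruncatePlan plan(List<Long> blockSizes, long newLength) {
        long remaining = newLength;
        int kept = 0;
        for (long size : blockSizes) {
            if (remaining >= size) {
                // This block survives whole.
                remaining -= size;
                kept++;
            } else {
                break;
            }
        }
        // 'remaining' is the recovered length of the block being truncated;
        // all blocks after it are deleted immediately.
        return new TruncatePlan(kept, remaining);
    }
}
```

For example, truncating a file with blocks of sizes 128, 128, and 64 to length 300 keeps the first two blocks whole and recovers the third to length 44.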

Assigning to Plamen, who seems to be almost ready with the patch.

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard POSIX operation), the inverse of append. This 
> forces upper-layer applications to use ugly workarounds (such as keeping 
> track of the discarded byte range per file in a separate metadata store, 
> and periodically running a vacuum process to rewrite compacted files) to 
> overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
