[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190942#comment-14190942
 ] 

Colin Patrick McCabe commented on HDFS-3107:
--------------------------------------------

bq. So, but the new patch from Plamen Jeliazkov already have snapshot support

Am I looking at the wrong version?  The patch posted on 17 Oct does not seem to 
have snapshot support.  The relevant code is here:

{code}
boolean truncateInternal(String src, long newLength,
                         String clientName, String clientMachine,
                         long mtime, FSPermissionChecker pc)
    throws IOException, UnresolvedLinkException {
  assert hasWriteLock();
  if (isPermissionEnabled) {
    checkPathAccess(pc, src, FsAction.WRITE);
  }
  INodesInPath iip = dir.getINodesInPath4Write(src, true);
  final int latestSnapshotId = iip.getLatestSnapshotId();
  INodeFile inodeFile = INodeFile.valueOf(iip.getLastINode(), src);
  // Data will be lost after truncate occurs so it cannot support snapshots.
  if (inodeFile.isInLatestSnapshot(latestSnapshotId))
    throw new HadoopIllegalArgumentException(
        "Cannot truncate file with snapshot.");
{code}

I think it's misleading to describe throwing an exception whenever snapshots 
are present as "support."  But whatever we call it, my patch doesn't throw an 
exception here; it always allows truncation, even when the admin is using 
snapshots.
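
To make the difference concrete, here is a toy sketch (not HDFS code; the 
{{FileInode}}, {{truncateRejecting}}, and {{truncatePreserving}} names are 
hypothetical) contrasting the two policies discussed above: rejecting truncate 
on a snapshotted file versus preserving the old bytes for the snapshot and 
truncating anyway:

{code}
import java.util.Arrays;

public class TruncatePolicyDemo {
    static final class FileInode {
        byte[] data;
        boolean inLatestSnapshot;
        byte[] snapshotCopy; // bytes preserved for the snapshot, if any

        FileInode(byte[] data, boolean inLatestSnapshot) {
            this.data = data;
            this.inLatestSnapshot = inLatestSnapshot;
        }
    }

    // Policy in the quoted patch: refuse to truncate a snapshotted file.
    static void truncateRejecting(FileInode f, int newLength) {
        if (f.inLatestSnapshot) {
            throw new IllegalArgumentException(
                "Cannot truncate file with snapshot.");
        }
        f.data = Arrays.copyOf(f.data, newLength);
    }

    // Alternative policy: keep a copy of the pre-truncate bytes for the
    // snapshot (copy-on-truncate, simplified), then shrink the live file.
    static void truncatePreserving(FileInode f, int newLength) {
        if (f.inLatestSnapshot && f.snapshotCopy == null) {
            f.snapshotCopy = f.data.clone(); // snapshot still sees old data
        }
        f.data = Arrays.copyOf(f.data, newLength);
    }

    public static void main(String[] args) {
        FileInode a = new FileInode(new byte[]{1, 2, 3, 4}, true);
        boolean rejected = false;
        try {
            truncateRejecting(a, 2);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }

        FileInode b = new FileInode(new byte[]{1, 2, 3, 4}, true);
        truncatePreserving(b, 2);

        System.out.println("rejected=" + rejected
            + " liveLen=" + b.data.length
            + " snapLen=" + b.snapshotCopy.length);
    }
}
{code}

The real patch would of course have to keep the truncated blocks referenced by 
the snapshot diff rather than copy bytes, but the user-visible semantics are 
the same: live readers see the shortened file while snapshot readers still see 
the original length.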

I was trying to be helpful by pointing out an approach that would let you 
respond to the comments here.  For example, [~sureshms]'s comment here:
https://issues.apache.org/jira/browse/HDFS-3107?focusedCommentId=14155537&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14155537
Or [~szetszwo]'s comment about the need to support rollback.  You need to 
address the comments that people have made.

P.S. Can we put numbers on patch files?  I find it difficult to keep track of 
them when they all have the same file name.

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.008.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
> HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, 
> HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
> editsStored, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard Posix operation) which is a reverse operation of 
> append, which makes upper layer applications use ugly workarounds (such as 
> keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
