[
https://issues.apache.org/jira/browse/HDFS-15829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284072#comment-17284072
]
Yang Yun commented on HDFS-15829:
---------------------------------
Updated to HDFS-15829.001.patch to fix checkstyle issues.
> Use xattr to support HDFS TTL on Observer namenode
> --------------------------------------------------
>
> Key: HDFS-15829
> URL: https://issues.apache.org/jira/browse/HDFS-15829
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: dfsclient, namenode
> Reporter: Yang Yun
> Assignee: Yang Yun
> Priority: Minor
> Attachments: HDFS-15829.001.patch, HDFS-15829.patch
>
>
> h3. Overview
>
> HDFS TTL is implemented using the xattr mechanism provided by HDFS. When a
> user sets a TTL on a file or directory, HDFS creates an xattr named "ttl" on
> that file or directory and stores the user-supplied value in it. A service
> called TtlService runs on the HDFS Standby or Observer NameNode (Observer is
> recommended). It regularly scans the in-memory inode map, reads the value of
> the "ttl" xattr from each INode, and checks whether the TTL has expired. If
> so, it gets the full file path from the INode and adds it to a list of
> expired files. After the scan, it creates a DFSClient and deletes the
> expired files in a batch. Another option is to trigger a YARN job to delete
> them in parallel.
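
The scan pass described above might look roughly like the sketch below. It is not the TtlService from the patch: the `Node` class, the plain `Map` standing in for the inode map, and the `scan`/`isExpired` names are all hypothetical. It also assumes, based on the Example section, that the stored value is an absolute time in minutes since the epoch unless SINCELASTWRITE is set, in which case it is an interval counted from the last write.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of one TTL scan pass; not the code from the patch. */
final class TtlScanSketch {
    /** Flag value taken from the issue text. */
    static final int SINCELASTWRITE = 0x1;

    /** Stand-in for an inode: only the fields the scan needs. */
    static final class Node {
        final String path;
        final Long ttlMinutes;        // value of the "ttl" xattr, or null if unset
        final int property;           // value of the "ttlproperty" xattr
        final long lastWriteMinutes;  // last write time, minutes since the epoch

        Node(String path, Long ttlMinutes, int property, long lastWriteMinutes) {
            this.path = path;
            this.ttlMinutes = ttlMinutes;
            this.property = property;
            this.lastWriteMinutes = lastWriteMinutes;
        }
    }

    /** True if the node carries a TTL and that TTL has passed at nowMinutes. */
    static boolean isExpired(Node n, long nowMinutes) {
        if (n.ttlMinutes == null) {
            return false;
        }
        long deadline = (n.property & SINCELASTWRITE) != 0
            ? n.lastWriteMinutes + n.ttlMinutes  // interval since last write
            : n.ttlMinutes;                      // assumed absolute time
        return nowMinutes >= deadline;
    }

    /** Collects the paths of expired nodes; a caller would delete them in a batch. */
    static List<String> scan(Map<Long, Node> inodeMap, long nowMinutes) {
        List<String> expired = new ArrayList<>();
        for (Node n : inodeMap.values()) {
            if (isExpired(n, nowMinutes)) {
                expired.add(n.path);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000 / 60;
        Map<Long, Node> inodes = new LinkedHashMap<>();
        inodes.put(1L, new Node("/tmp/old", now - 1, 0, now - 500));
        inodes.put(2L, new Node("/tmp/fresh", 60L, SINCELASTWRITE, now - 10));
        inodes.put(3L, new Node("/tmp/no-ttl", null, 0, now));
        System.out.println(scan(inodes, now)); // prints [/tmp/old]
    }
}
```

Inheritance from ancestor directories is left out of this sketch; it only covers nodes that carry their own TTL.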
> h3. Protocol
> Add two xattrs:
> "user.ttl": the TTL value in minutes, identifying the time at which the file
> or folder expires.
> "user.ttlproperty": the TTL type flags, including:
> * SINCELASTWRITE = 0x1 # calculate the TTL from the last write.
> * KEEPEMPTYDIR = 0x2 # whether to keep the directory when it is empty.
> * KEEPEMPTYSUBDIR = 0x4 # whether to keep empty subdirectories.
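
Because the three type values are distinct bits, they can be combined into a single integer. The following is a minimal sketch of packing and unpacking them. The flag values come from the list above, but the `TtlPropertySketch` class, its `Flag` enum, and the `encode`/`decode` helpers are hypothetical, and how the patch serializes the integer into the xattr bytes is not shown here.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical helper for the "user.ttlproperty" bit flags. */
final class TtlPropertySketch {
    enum Flag {
        SINCELASTWRITE(0x1),
        KEEPEMPTYDIR(0x2),
        KEEPEMPTYSUBDIR(0x4);

        final int bit;

        Flag(int bit) {
            this.bit = bit;
        }
    }

    /** Packs a set of flags into one int by OR-ing their bits. */
    static int encode(Set<Flag> flags) {
        int value = 0;
        for (Flag f : flags) {
            value |= f.bit;
        }
        return value;
    }

    /** Unpacks an int back into flags; unknown bits are ignored. */
    static Set<Flag> decode(int value) {
        Set<Flag> flags = EnumSet.noneOf(Flag.class);
        for (Flag f : Flag.values()) {
            if ((value & f.bit) != 0) {
                flags.add(f);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        int property = encode(EnumSet.of(Flag.SINCELASTWRITE, Flag.KEEPEMPTYDIR));
        System.out.println(property);         // prints 3
        System.out.println(decode(property)); // prints [SINCELASTWRITE, KEEPEMPTYDIR]
    }
}
```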
>
> *Nested TTL*
> TTL can be set on every directory and file along a path; the setting on a
> lower-level subdirectory or file takes precedence over that of its
> ancestors. If a directory or file has no TTL of its own, it inherits the
> setting of its nearest ancestor directory. The following is an
> illustrative example. Suppose there is a directory tree like this:
>
> {code:java}
> /A/B/E
> /A/C
> /A/D {code}
>
> That is, B, C and D are under directory A, and file E is under directory B.
> Suppose the user sets the TTL of A to 2 days, the TTL of B to 3 days, the
> TTL of E to 1 day, and the TTL of C and D is not set. Then the file E will be
> cleared after 1 day. After 2 days, C and D will be cleared. The settings
> inherited from directory A are used here. Please note that at this time,
> directory A will not be cleared because it is not empty. After 3 days, B will
> be cleared because its own settings expire. After B is cleared, because A’s
> settings have already expired and A has become an empty directory, it will
> also be cleared.
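
The inheritance rule in this example can be sketched as a walk from a path up through its parents until a TTL is found. This is only an illustration under stated assumptions: the `NestedTtlSketch` class and `effectiveTtl` method are hypothetical, a plain map of path to TTL stands in for the xattrs, and days are used as the unit purely to match the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of nearest-ancestor TTL lookup; not the code from the patch. */
final class NestedTtlSketch {
    /**
     * Returns the TTL that applies to path: its own setting if present,
     * otherwise that of the nearest ancestor that has one, otherwise null.
     * The root "/" itself is not consulted in this sketch.
     */
    static Integer effectiveTtl(Map<String, Integer> explicitTtl, String path) {
        String p = path;
        while (!p.isEmpty()) {
            Integer ttl = explicitTtl.get(p);
            if (ttl != null) {
                return ttl;
            }
            int slash = p.lastIndexOf('/');
            p = slash <= 0 ? "" : p.substring(0, slash); // step up to the parent
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Integer> ttlDays = new HashMap<>();
        ttlDays.put("/A", 2);
        ttlDays.put("/A/B", 3);
        ttlDays.put("/A/B/E", 1);

        System.out.println(effectiveTtl(ttlDays, "/A/B/E")); // prints 1 (own setting)
        System.out.println(effectiveTtl(ttlDays, "/A/B"));   // prints 3 (own setting)
        System.out.println(effectiveTtl(ttlDays, "/A/C"));   // prints 2 (inherited from /A)
        System.out.println(effectiveTtl(ttlDays, "/A/D"));   // prints 2 (inherited from /A)
    }
}
```

The lookup alone does not capture the rule that a non-empty directory such as A is kept until its children are gone; that check would belong to the deletion step.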
> h3. Usage
> For the first version, an API is provided to set the TTL; a command-line
> tool will be added later.
>
> {code:java}
> /**
> * Set TTL to a file.
> * @param fs the file system.
> * @param path the target file to set TTL.
> * @param value the TTL value.
> * @param property the type of TTL.
> * @throws IOException
> */
> public static void setTTl(FileSystem fs, Path path, int value, int property)
> {code}
> h3. Example
>
> {code:java}
> // The file will expire in 60 minutes.
> TtlInfo.setTTl(fs, file,
>     (int) (System.currentTimeMillis() / 1000 / 60 + 60), 0);
> // The file will expire 60 minutes after its last write.
> TtlInfo.setTTl(fs, file, 60, TtlInfo.SINCELASTWRITE);{code}
>