Repository: hadoop
Updated Branches:
  refs/heads/branch-2 eaeaf80d3 -> 5c2c6b00d

HDFS-7752. Improve description for "dfs.namenode.num.extra.edits.retained" and "dfs.namenode.num.checkpoints.retained" properties on hdfs-default.xml. Contributed by Wellington Chevreuil.

(cherry picked from commit b9a17909ba39898120a096cb6ae90104640690db)

Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/5c2c6b00
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/5c2c6b00
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/5c2c6b00

Branch: refs/heads/branch-2
Commit: 5c2c6b00dd35ce422dccfbbfff77a3933d93f33b
Parents: eaeaf80
Author: Harsh J <ha...@cloudera.com>
Authored: Fri Feb 20 19:20:41 2015 +0530
Committer: Harsh J <ha...@cloudera.com>
Committed: Fri Feb 20 19:21:34 2015 +0530

----------------------------------------------------------------------
 hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt | 5 +++++
 .../hadoop-hdfs/src/main/resources/hdfs-default.xml | 15 +++++++++++----
 2 files changed, 16 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c2c6b00/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index 40196b3..363ddca 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -39,6 +39,11 @@ Release 2.7.0 - UNRELEASED
 
   IMPROVEMENTS
 
+    HDFS-7752. Improve description for
+    "dfs.namenode.num.extra.edits.retained"
+    and "dfs.namenode.num.checkpoints.retained" properties on
+    hdfs-default.xml (Wellington Chevreuil via harsh)
+
     HDFS-7055. Add tracing to DFSInputStream (cmccabe)
 
     HDFS-7186. Document the "hadoop trace" command. (Masatake Iwasaki via Colin

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c2c6b00/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
index bb28f01..2981db2 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
@@ -852,9 +852,9 @@
 <property>
   <name>dfs.namenode.num.checkpoints.retained</name>
   <value>2</value>
-  <description>The number of image checkpoint files that will be retained by
+  <description>The number of image checkpoint files (fsimage_*) that will be retained by
   the NameNode and Secondary NameNode in their storage directories. All edit
-  logs necessary to recover an up-to-date namespace from the oldest retained
+  logs (stored on edits_* files) necessary to recover an up-to-date namespace from the oldest retained
   checkpoint will also be retained.
   </description>
 </property>
@@ -863,8 +863,15 @@
   <name>dfs.namenode.num.extra.edits.retained</name>
   <value>1000000</value>
   <description>The number of extra transactions which should be retained
-  beyond what is minimally necessary for a NN restart. This can be useful for
-  audit purposes or for an HA setup where a remote Standby Node may have
+  beyond what is minimally necessary for a NN restart.
+  It does not translate directly to file's age, or the number of files kept,
+  but to the number of transactions (here "edits" means transactions).
+  One edit file may contain several transactions (edits).
+  During checkpoint, NameNode will identify the total number of edits to retain as extra by
+  checking the latest checkpoint transaction value, subtracted by the value of this property.
+  Then, it scans edits files to identify the older ones that don't include the computed range of
+  retained transactions that are to be kept around, and purges them subsequently.
+  The retainment can be useful for audit purposes or for an HA setup where a remote Standby Node may have
   been offline for some time and need to have a longer backlog of retained
   edits in order to start again.
   Typically each edit is on the order of a few hundred bytes, so the default
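
Editor's illustration (not part of the commit): the retention rule described in the new text for dfs.namenode.num.extra.edits.retained can be sketched roughly as below. This is a minimal sketch that follows only the wording of the description. The class EditsRetentionSketch, the EditsFile type, and both method names are hypothetical and are not Hadoop APIs, and clamping the result at zero is an assumption. The NameNode's actual purging code may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the purge rule in the property description; not Hadoop code. */
final class EditsRetentionSketch {

    /** Stand-in for one edits_* file covering the transactions firstTxId..lastTxId. */
    static final class EditsFile {
        final long firstTxId;
        final long lastTxId;

        EditsFile(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /**
     * Lowest transaction ID to keep: the latest checkpoint transaction value
     * minus the property value. Clamping at 0 is an assumption of this sketch.
     */
    static long minTxIdToKeep(long checkpointTxId, long numExtraEditsRetained) {
        return Math.max(0L, checkpointTxId - numExtraEditsRetained);
    }

    /**
     * Selects the files that lie entirely below the retained range. A file
     * that straddles the boundary still holds retained transactions, so it
     * is kept whole.
     */
    static List<EditsFile> filesToPurge(List<EditsFile> files, long minTxIdToKeep) {
        List<EditsFile> purge = new ArrayList<>();
        for (EditsFile f : files) {
            if (f.lastTxId < minTxIdToKeep) {
                purge.add(f);
            }
        }
        return purge;
    }

    public static void main(String[] args) {
        List<EditsFile> files = new ArrayList<>();
        files.add(new EditsFile(1L, 1_000_000L));
        files.add(new EditsFile(1_000_001L, 1_600_000L));
        files.add(new EditsFile(1_600_001L, 2_500_000L));

        // Checkpoint at transaction 2,500,000 with the default of 1,000,000 extra edits.
        long keepFrom = minTxIdToKeep(2_500_000L, 1_000_000L);
        for (EditsFile f : filesToPurge(files, keepFrom)) {
            System.out.println("purge edits " + f.firstTxId + "-" + f.lastTxId);
        }
    }
}
```

The sketch makes the point of the description concrete: the property counts transactions, not files or file age, so the number of edits_* files that survive depends on how many transactions each file happens to contain.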