Repository: hadoop
Updated Branches:
refs/heads/branch-2.8 85a62dcb5 -> 55f7ceb0d
HDFS-9048. DistCp documentation is out-of-dated (Daisuke Kobayashi via
iwasakims)
(cherry picked from commit 33a412e8a4ab729d588a9576fb7eb90239c6e383)
Conflicts:
hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/55f7ceb0
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/55f7ceb0
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/55f7ceb0
Branch: refs/heads/branch-2.8
Commit: 55f7ceb0db13a6ef7a29b54f63075ce05dc1b019
Parents: 85a62dc
Author: Masatake Iwasaki <[email protected]>
Authored: Thu Mar 3 18:57:23 2016 +0900
Committer: Masatake Iwasaki <[email protected]>
Committed: Thu Mar 3 18:58:59 2016 +0900
----------------------------------------------------------------------
hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt | 2 ++
.../hadoop-distcp/src/site/markdown/DistCp.md.vm | 13 +++++++------
2 files changed, 9 insertions(+), 6 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/hadoop/blob/55f7ceb0/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index 8991db5..bc232c8 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -1866,6 +1866,8 @@ Release 2.7.3 - UNRELEASED
HDFS-8791. block ID-based DN storage layout can be very slow for datanode
on ext4 (Chris Trezzo via kihwal)
+ HDFS-9048. DistCp documentation is out-of-dated
+ (Daisuke Kobayashi via iwasakims)
OPTIMIZATIONS
http://git-wip-us.apache.org/repos/asf/hadoop/blob/55f7ceb0/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
index 31f0444..0faa975 100644
--- a/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
+++ b/hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
@@ -412,12 +412,13 @@ $H3 Map sizing
$H3 Copying Between Versions of HDFS
- For copying between two different versions of Hadoop, one will usually use
- HftpFileSystem. This is a read-only FileSystem, so DistCp must be run on the
- destination cluster (more specifically, on NodeManagers that can write to the
- destination cluster). Each source is specified as
- `hftp://<dfs.http.address>/<path>` (the default `dfs.http.address` is
- `<namenode>:50070`).
+ For copying between two different major versions of Hadoop (e.g. between 1.X
+ and 2.X), one will usually use WebHdfsFileSystem. Unlike the previous
+ HftpFileSystem, as webhdfs is available for both read and write operations,
+ DistCp can be run on both source and destination cluster.
+ Remote cluster is specified as `webhdfs://<namenode_hostname>:<http_port>`.
+ When copying between same major versions of Hadoop cluster (e.g. between 2.X
+ and 2.X), use hdfs protocol for better performance.
$H3 MapReduce and other side-effects
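
Note on the "Copying Between Versions of HDFS" hunk: the rule the new text describes (webhdfs for the remote cluster across major versions, hdfs within the same major version) can be sketched as a small helper that builds the remote URI and the resulting `hadoop distcp` argument list. This is only an illustrative sketch, not part of the commit. The function names, host names, and paths are hypothetical, and the default ports (50070 for NameNode HTTP, 8020 for NameNode RPC) are assumptions that should be replaced with the values configured on the actual clusters.

```python
def remote_uri(host, path, same_major_version, http_port=50070, rpc_port=8020):
    """Build the URI for the remote cluster.

    Assumed defaults: 50070 (NameNode HTTP) and 8020 (NameNode RPC).
    Check the real cluster configuration before relying on them.
    """
    if same_major_version:
        # Same major version (e.g. 2.X to 2.X): hdfs protocol for better performance.
        scheme, port = "hdfs", rpc_port
    else:
        # Different major versions (e.g. 1.X to 2.X): webhdfs, which supports read and write.
        scheme, port = "webhdfs", http_port
    return f"{scheme}://{host}:{port}/{path.lstrip('/')}"


def distcp_argv(remote_host, remote_path, local_uri, same_major_version):
    """Argument list for a copy from the remote cluster into a local URI."""
    source = remote_uri(remote_host, remote_path, same_major_version)
    return ["hadoop", "distcp", source, local_uri]


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    print(" ".join(distcp_argv("nn1.example.com", "/data/logs",
                               "hdfs://nn2.example.com:8020/backup/logs",
                               same_major_version=False)))
```

Passing `same_major_version=True` switches the remote URI to the hdfs scheme and the RPC port, matching the performance recommendation in the added text.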