Hi,
There is a distributed copy utility in Hadoop, distcp, which lets you
copy large amounts of data from one DFS to another. The exact syntax
for using this command is
hadoop distcp [OPTIONS] <srcurl>* <desturl>
OPTIONS:
-p[rbugp] Preserve status
r: replication number
b: block size
u: user
g: group
p: permission
-p alone is equivalent to -prbugp
-i Ignore failures
-log <logdir> Write logs to <logdir>
-overwrite Overwrite destination
-update Overwrite if src size different from dst size
-f <urilist_uri> Use list at <urilist_uri> as src list
NOTE: if -overwrite or -update are set, each source URI is
interpreted as an isomorphic update to an existing directory.
For example:
hadoop distcp -p -update "hdfs://A:8020/user/foo/bar"
"hdfs://B:8020/user/foo/baz"
would update all descendants of 'baz' also in 'bar'; it would
*not* update /user/foo/baz/bar
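For the question below, a minimal invocation would look like the
following (the hostnames, ports, and paths are placeholders; substitute
your own namenodes and directories):
hadoop distcp hdfs://A:8020/user/foo/data hdfs://B:8020/user/foo/data
This launches a job that copies /user/foo/data on cluster A into
/user/foo/data on cluster B.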
Generic options supported are:
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
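As a sketch (again with placeholder URIs), the copy options above can
be combined; for example, to re-copy only files whose size differs on
the destination and keep a log of what was copied:
hadoop distcp -update -log hdfs://B:8020/user/foo/logs \
    hdfs://A:8020/user/foo/data hdfs://B:8020/user/foo/data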
Under the hood, distcp runs as a MapReduce job, so the copy is
performed in parallel across the cluster.
Hope this helps.
Pratyush
[EMAIL PROTECTED] wrote:
Hi all,
I have a large dataset stored in a Hadoop cluster, and I want to
copy this data into another Hadoop cluster. Can anyone tell me how?
Thank you very much!
Best wishes!
maqiang