Try to add these:
-Dhbase.client.scanner.caching=100 -Dmapred.map.tasks.speculative.execution=false

Also, as others pointed out, what's the bandwidth between the two clusters?

(Example invocations for these flags and for the snapshot route are sketched
below the quoted thread.)

________________________________
From: tobe <[email protected]>
To: [email protected]; lars hofhansl <[email protected]>
Sent: Thursday, August 14, 2014 11:24 PM
Subject: Re: A better way to migrate the whole cluster?

Thanks @lars. We're using HBase 0.94.11 and followed the instructions to run
`./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=hbase://cluster_name table_name`.
We have a namespace service that resolves "hbase://cluster_name" to the
ZooKeeper quorum, and the job ran on a shared YARN cluster. The performance is
affected by many factors, but we haven't found the cause of the slowness yet.
It would be great to hear your suggestions.

On Fri, Aug 15, 2014 at 1:34 PM, lars hofhansl <[email protected]> wrote:

> What version of HBase? How are you running CopyTable? A day for 1.8T is
> not what we would expect.
> You can definitely take a snapshot and then export the snapshot to another
> cluster, which will move the actual files; but CopyTable should not be so
> slow.
>
>
> -- Lars
>
>
> ________________________________
> From: tobe <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc: [email protected]
> Sent: Thursday, August 14, 2014 8:18 PM
> Subject: A better way to migrate the whole cluster?
>
>
> Sometimes our users want to upgrade their servers or move to a new
> datacenter, and then we have to migrate the data out of HBase. Currently we
> enable replication from the old cluster to the new cluster and run
> CopyTable to move the older data.
>
> This is a little inefficient: it takes more than one day to migrate 1.8T of
> data, and more time to verify it. Is there a better way to do this, such as
> using snapshots or copying the HDFS files directly?
>
> And what's the best practice, or what has your experience been?
>
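For reference, here is a minimal sketch of what the full CopyTable invocation
with those flags might look like, reusing the hbase://cluster_name peer
address and table_name placeholders from the thread. That the -D options are
accepted in this position is an assumption based on standard Hadoop generic
option handling (which Lars's suggestion implies); they generally have to come
before the tool-specific arguments:

  # Sketch only: the peer address and table name are the placeholders from
  # the thread, not real values.
  ./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
      -Dhbase.client.scanner.caching=100 \
      -Dmapred.map.tasks.speculative.execution=false \
      --peer.adr=hbase://cluster_name \
      table_name

Raising the scanner caching fetches more rows per RPC instead of the very low
0.94-era default, and disabling speculative execution avoids launching
duplicate map tasks that would re-write the same rows to the target cluster.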
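The snapshot route Lars mentions could be sketched roughly as below. The
snapshot name, destination NameNode address, and mapper count are made-up
placeholders, and on a 0.94 cluster snapshots have to be enabled first
(hbase.snapshot.enabled=true on both sides):

  # 1. On the source cluster, take a snapshot from the HBase shell:
  #      snapshot 'table_name', 'table_name_snap'
  #
  # 2. Copy the snapshot's HFiles to the destination cluster's HBase root
  #    directory (NameNode address, port, and mapper count are placeholders):
  ./bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot table_name_snap \
      -copy-to hdfs://new-cluster-nn:8020/hbase \
      -mappers 16
  #
  # 3. On the destination cluster, materialise the table from the snapshot:
  #      clone_snapshot 'table_name_snap', 'table_name'

Because ExportSnapshot copies HFiles at the HDFS level, it skips the
scan-and-put path that CopyTable goes through, which is usually the slow part;
replication can then catch up on the writes that arrived after the snapshot
was taken.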
