Bryan: Thanks for sharing your valuable experience.

bq. the diff function should be changed to inspect the actual HFiles
Inspecting / comparing metadata for the actual HFiles should help.

On Fri, Aug 15, 2014 at 11:27 AM, Bryan Beaudreault <[email protected]> wrote:

> The reason it swallows exceptions is that the idea is that you first run
> a diff, then use the output of that diff as the input of the Backup job.
> So if an HFile fails to copy, the job will still finish, copying most of
> the HFiles. Then you run the diff again, and it will see that an HFile
> was missed. The next run will only copy that file and any other new
> files.
>
> You run the diff before each job run. The diff basically runs `hdfs dfs
> -ls -R /hbase` on each cluster and passes the output of those commands
> into
> https://github.com/mozilla-metrics/akela/blob/master/src/main/python/lsr_diff.py
>
> So the way we've used this job (modified internally, like I said, but
> very much the same concept) is:
>
> 1) Run the job; it usually takes a while.
> 2) Run the job again. This takes much less time because most of the
> files were already copied and we're just copying the new or failed files
> since the last job began.
> 3) Depending on how long the job is taking, i.e. how fast the data
> ingestion is, we run it as many times as we need to get the window
> small.
> 4) Stop the source cluster.
> 5) Run the job one more time.
> 6) If there was an exception, run it again. I've had to do this maybe
> once.
> 7) Start the target cluster, and bounce applications so they connect to
> the new cluster.
>
> The idea is that you can run the job over and over to get closer to
> being in sync. This is how we handled errors too. In the times I've used
> it (10+ runs on multiple production clusters), I've never seen an
> exception that couldn't be resolved by running it again.
>
> So we have previously worked with the leniency of having a short window
> (<10 mins) of downtime. There are some challenges with doing this in an
> online migration with replication, but it is maybe still doable with
> some thought around the diffing.
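[The diff step Bryan describes can be sketched with standard tools. This is not the actual lsr_diff.py, just a minimal illustration of comparing two recursive listings by path and length; the `hdfs dfs -ls -R /hbase` output is faked here with two local files, and the paths and sizes are invented.]

```shell
# Fake the (path, length) columns of a recursive listing from each cluster.
# On real clusters you would capture this from `hdfs dfs -ls -R /hbase`.
cat > source.lsr <<'EOF'
/hbase/t1/r1/cf/hfileA 1024
/hbase/t1/r1/cf/hfileB 2048
/hbase/t1/r2/cf/hfileC 4096
EOF
cat > target.lsr <<'EOF'
/hbase/t1/r1/cf/hfileA 1024
/hbase/t1/r1/cf/hfileB 1000
EOF
# Entries present on the source but missing (or with a different length)
# on the target are the files the next Backup run still needs to copy.
sort source.lsr > s.sorted
sort target.lsr > t.sorted
comm -23 s.sorted t.sorted | cut -d' ' -f1
```

Here hfileB (same name, different length) and hfileC (missing on the target) would be recopied, while hfileA is skipped — which is exactly why repeated runs converge.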
> For example, compactions will never stop in a running source cluster, so
> currently we might have to keep running the Backup over and over, never
> reaching parity. So maybe the diff function should be changed to inspect
> the actual HFiles instead of just comparing file name and length.
>
> On Fri, Aug 15, 2014 at 1:47 PM, Ted Yu <[email protected]> wrote:
>
> > Bryan:
> > From the javadoc of Backup.java:
> > bq. it favors swallowing exceptions and incrementing counters as
> > opposed to failing
> >
> > Can you share some experience of how you handled the errors reported
> > by Backup?
> >
> > Thanks
> >
> > On Fri, Aug 15, 2014 at 10:38 AM, Bryan Beaudreault
> > <[email protected]> wrote:
> >
> > > I agree it would be nice if this were provided by HBase, but it's
> > > already possible to work straight with the HFiles. All you need is a
> > > custom Hadoop job. A good starting point is
> > > https://github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/hadoop/Backup.java
> > > which you can modify to your needs. We've used our own modification
> > > of this job many times when we do our own cluster migrations. The
> > > idea is that it is incremental, so as HFiles get compacted, deleted,
> > > etc., you can just run it again and move smaller and smaller amounts
> > > of data.
> > >
> > > Working at the HDFS level should be faster, as you can use more
> > > mappers. You will still be taxing the IO of the source cluster, but
> > > not adding load to the actual regionserver processes (IPC queue,
> > > memory, etc.).
> > >
> > > If you upgrade to CDH5 (or the equivalent HDFS version), you can use
> > > HDFS snapshots to minimize the need to re-run the above Backup job
> > > (since you are already using replication to keep data up-to-date).
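[Bryan's suggestion — comparing actual HFile contents rather than name and length, so a file rewritten by compaction is still detected — could look roughly like the sketch below. It uses local files and `cksum` purely for illustration; on a real cluster one might use `hdfs dfs -checksum` instead. The file names and contents are invented.]

```shell
# Two copies of the same layout: hfileA is identical on both sides,
# hfileB has the same name AND the same length but different bytes
# (as a compaction rewrite might produce), so a name+length diff
# would wrongly skip it.
mkdir -p src dst
printf 'old-block-data' > src/hfileA
printf 'old-block-data' > dst/hfileA
printf 'recompacted!!!' > src/hfileB
printf 'original-data!' > dst/hfileB
# Compare by content checksum instead of name + length.
for f in src/*; do
  name=$(basename "$f")
  s=$(cksum < "$f" | cut -d' ' -f1)
  d=$(cksum < "dst/$name" | cut -d' ' -f1)
  [ "$s" != "$d" ] && echo "recopy: $name"
done
```

Only hfileB is flagged for recopy, even though its name and length match — the failure mode a name+length diff cannot catch.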
> > >
> > > On Fri, Aug 15, 2014 at 1:11 PM, Esteban Gutierrez
> > > <[email protected]> wrote:
> > >
> > > > 1.8TB in a day is not terribly slow if that number comes from the
> > > > CopyTable counters and you are moving data across data centers
> > > > over public networks; that works out to about 20MB/sec. Also,
> > > > CopyTable won't compress anything on the wire, so the network
> > > > overhead will be significant. If you use anything like snappy for
> > > > block compression and/or fast_diff for block encoding on the
> > > > HFiles, then taking snapshots and exporting them using the
> > > > ExportSnapshot tool should be the way to go.
> > > >
> > > > cheers,
> > > > esteban.
> > > >
> > > > --
> > > > Cloudera, Inc.
> > > >
> > > > On Thu, Aug 14, 2014 at 11:24 PM, tobe <[email protected]>
> > > > wrote:
> > > >
> > > > > Thanks @lars.
> > > > >
> > > > > We're using HBase 0.94.11 and followed the instructions to run
> > > > > `./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> > > > > --peer.adr=hbase://cluster_name table_name`. We have a namespace
> > > > > service to find the ZooKeeper quorum from
> > > > > "hbase://cluster_name", and the job ran on a shared YARN
> > > > > cluster.
> > > > >
> > > > > The performance is affected by many factors, but we haven't
> > > > > found out the reason. It would be great to hear your
> > > > > suggestions.
> > > > >
> > > > > On Fri, Aug 15, 2014 at 1:34 PM, lars hofhansl
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > What version of HBase? How are you running CopyTable? A day
> > > > > > for 1.8T is not what we would expect. You can definitely take
> > > > > > a snapshot and then export the snapshot to another cluster,
> > > > > > which will move the actual files; but CopyTable should not be
> > > > > > so slow.
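[Esteban's 20MB/sec figure is simple arithmetic and checks out:]

```shell
# Back-of-the-envelope check of the throughput estimate above:
# 1.8 TB moved in 24 hours is roughly 20 MB/sec sustained.
awk 'BEGIN {
  bytes = 1.8 * 10^12        # 1.8 TB in decimal bytes
  secs  = 24 * 3600          # seconds in a day
  printf "%.1f MB/sec\n", bytes / secs / 10^6
}'
```

That prints about 20.8 MB/sec — plausible for a single uncompressed cross-datacenter stream, which supports the point that the job is network-bound rather than broken.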
> > > > > >
> > > > > > -- Lars
> > > > > >
> > > > > > ________________________________
> > > > > > From: tobe <[email protected]>
> > > > > > To: "[email protected]" <[email protected]>
> > > > > > Cc: [email protected]
> > > > > > Sent: Thursday, August 14, 2014 8:18 PM
> > > > > > Subject: A better way to migrate the whole cluster?
> > > > > >
> > > > > > Sometimes our users want to upgrade their servers or move to a
> > > > > > new datacenter, and then we have to migrate the data from
> > > > > > HBase. Currently we enable replication from the old cluster to
> > > > > > the new cluster, and run CopyTable to move the older data.
> > > > > >
> > > > > > It's a little inefficient: it takes more than one day to
> > > > > > migrate 1.8T of data and more time to verify. Can we do this a
> > > > > > better way, such as with snapshots or purely at the level of
> > > > > > HDFS files?
> > > > > >
> > > > > > And what is the best practice, or your valuable experience?
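[The snapshot route that Esteban and Lars recommend would look roughly like the following. This is a sketch, not a tested recipe: the table, snapshot, and cluster names are placeholders, snapshot support requires a recent-enough HBase (it landed in the 0.94 line), and the exact flags should be checked against your version's documentation.]

```shell
# 1) Take a snapshot of the table on the source cluster (HBase shell).
echo "snapshot 'table_name', 'table_name-migration'" | hbase shell

# 2) Export the snapshot's HFiles at the HDFS level to the target
#    cluster, bypassing the regionserver read path entirely.
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot table_name-migration \
  -copy-to hdfs://target-cluster:8020/hbase \
  -mappers 16

# 3) On the target cluster, materialize the snapshot as a live table.
echo "clone_snapshot 'table_name-migration', 'table_name'" | hbase shell
```

Because ExportSnapshot is a plain MapReduce copy of immutable files, its parallelism scales with `-mappers` rather than with regionserver capacity, which is the speed advantage over CopyTable discussed above.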
