Cluster migration best practices
--------------------------------
Key: HBASE-3451
URL: https://issues.apache.org/jira/browse/HBASE-3451
Project: HBase
Issue Type: Brainstorming
Affects Versions: 0.89.20100924, 0.20.6
Reporter: Daniel Einspanjer
Priority: Critical
Mozilla is in the process of migrating our HBase cluster to a new datacenter.
We have our existing 25-node cluster in our SJC datacenter. It is serving
production traffic 24/7. We can take downtime, but it is very costly and
difficult to take it for more than a few hours in the evening.
We have two new 30-node clusters in our PHX datacenter. We want to cut
production over to one of these this week.
The old cluster is running 0.20.6. The new clusters are running CDH3b3 with
HBase 0.89.
We have tried running a pull distcp using hftp URLs. If HBase is running, this
causes SAX XML Parsing exceptions when a directory is removed during the scan.
If HBase is stopped, it takes hours for the directory compare to finish before
it even begins copying data.
We have tried a custom backup MR job. This job uses the map phase to evaluate
and copy changed files. It can run while HBase is live, but that results in a
dirty copy of the data.
We have tried running the custom backup job while HBase is shut down as well.
Even then, the second of two back-to-back runs still copies over some data, so
the result does not seem to be an entirely clean copy.
When we got what we thought was an entire copy onto the new cluster, we ran
add_table on it, but the resulting HBase table had holes. Investigating the
holes revealed there were directories that were not transferred.
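
One way to catch such holes before running add_table is to diff the recursive
file listings of the two clusters. The following is only a minimal sketch: it
assumes both listings have already been collected as plain lists of HDFS paths,
and the function name missing_paths and the /hbase root are illustrative
assumptions, not part of any HBase tool.

```python
def missing_paths(source_listing, dest_listing, root="/hbase"):
    """Return paths found under `root` on the source but not on the destination.

    Both arguments are iterables of absolute HDFS paths (assumed to be
    gathered separately on each cluster). Paths are compared relative to
    `root`, so differing NameNode addresses do not matter.
    """
    prefix = root.rstrip("/") + "/"

    def relative(paths):
        # Keep only paths under the root and strip the common prefix.
        return {p[len(prefix):] for p in paths if p.startswith(prefix)}

    return sorted(relative(source_listing) - relative(dest_listing))


if __name__ == "__main__":
    sjc = ["/hbase/t1/r1/f/a", "/hbase/t1/r2/f/b", "/hbase/t1/r3/f/c"]
    phx = ["/hbase/t1/r1/f/a", "/hbase/t1/r3/f/c"]
    for path in missing_paths(sjc, phx):
        print("not transferred:", path)
```

An empty result only shows that the same names exist on both sides; it says
nothing about file sizes or contents.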
We had a meeting to brainstorm ideas and three further suggestions that came up
were:
1. Build a file list of files to transfer on the SJC side, transfer that file
list to PHX and then run distcp on it.
2. Try a full copy instead of incremental, skipping the expensive file compare
step.
3. Evaluate copying from SJC to S3 then from S3 to PHX.
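
As a rough illustration of suggestion 1, the file list could be built from the
text output of a recursive listing (hadoop fs -lsr) taken on the SJC side. This
is a sketch under assumptions: the eight-column listing layout (permissions,
replication, owner, group, size, date, time, path) is what we expect from that
command, and the NameNode host name and the function name build_file_list are
hypothetical.

```python
def build_file_list(lsr_output, namenode="hftp://sjc-namenode.example.com:50070"):
    """Turn recursive-listing text into a list of hftp source URLs.

    Assumes each line has eight whitespace-separated columns ending in the
    absolute path. Directory entries (permissions starting with 'd') are
    skipped so that only files are listed.
    """
    urls = []
    for line in lsr_output.splitlines():
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # blank or unexpected line; ignore it
        permissions, path = fields[0], fields[7]
        if permissions.startswith("-"):  # regular file
            urls.append(namenode + path)
    return urls


if __name__ == "__main__":
    sample = (
        "drwxr-xr-x   - hbase hbase          0 2011-01-15 10:00 /hbase/t1\n"
        "-rw-r--r--   3 hbase hbase       1024 2011-01-15 10:01 /hbase/t1/r1/f/a\n"
    )
    print("\n".join(build_file_list(sample)))
```

The resulting list, one URL per line, could then be copied to PHX and passed to
distcp with its -f option, which reads the source list from a file.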