The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/HowToMigrate

------------------------------------------------------------------------------
  
  You can only migrate to 0.20.x from 0.19.x.  If you have an earlier hbase, 
you will need to install 0.19, migrate your old instance, and then install 
0.20.x.
  
+ This migration rewrites all data.  It will take a while.  TODO: MR version 
and single-threaded version. How-to.
+ 
+ ==== Preparing for Migration ====
+ 
+ You need to do a few things before you can begin migration.
+ 
+ ===== Can you back up your data? =====
+ Migration has been tested, but if you have sufficient space in hdfs to make a 
copy of your hbase rootdir, do so, just in case.  Use hadoop distcp.
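+ 
+ As a rough sketch of the backup step, the helper below only prints the distcp 
command so you can review it before running it.  The function name 
backup_rootdir_cmd is hypothetical, and it assumes the usual 
"hadoop distcp <source> <destination>" invocation.

```shell
# Hypothetical helper: print (do not run) the distcp command that would copy
# the hbase rootdir to a backup location.
backup_rootdir_cmd() {
  src="$1"
  dest="$2"
  # Refuse to copy a directory onto itself.
  if [ "$src" = "$dest" ]; then
    echo "source and destination must differ" >&2
    return 1
  fi
  printf 'hadoop distcp %s %s\n' "$src" "$dest"
}
```

+ For example, backup_rootdir_cmd hdfs://namenode:9000/hbase 
hdfs://namenode:9000/hbase-backup prints the command to run; both URIs here 
are placeholders, not paths from this page.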
+ 
+ ===== Major Compacting all Tables =====
+ Before you begin, run a major compaction on all tables, including the .META. 
and -ROOT- catalog tables.  To major compact from the shell, hbase must be 
running.  For example, the cluster below has only one user table, named 'a'.  
See how we run major_compact on each table in turn:
+ 
+ {{{st...@connelly:~/checkouts/hbase/branches/0.19$ ./bin/hbase shell
+ HBase Shell; enter 'help<RETURN>' for list of supported commands.
+ Version: 0.19.4, r781868, Tue Jul 14 11:27:58 PDT 2009
+ hbase(main):001:0> list
+ a
+ 2 row(s) in 0.1251 seconds
+ hbase(main):002:0> major_compact 'a'
+ 0 row(s) in 0.0400 seconds
+ hbase(main):003:0> major_compact '.META.'
+ 0 row(s) in 0.0245 seconds
+ hbase(main):004:0> major_compact '-ROOT-'
+ 0 row(s) in 0.0173 seconds}}}
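+ 
+ With many tables, typing each command gets tedious.  The sketch below 
generates the same major_compact commands as the session above for a list of 
table names and then adds the two catalog tables.  The function name 
major_compact_commands is hypothetical, and feeding its output to the hbase 
shell assumes the shell reads commands from standard input.

```shell
# Hypothetical helper: print one major_compact command per user table given
# as an argument, followed by the .META. and -ROOT- catalog tables.
major_compact_commands() {
  for table in "$@" .META. -ROOT-; do
    printf "major_compact '%s'\n" "$table"
  done
}
```

+ For the one-table cluster above, you would run 
major_compact_commands a | ./bin/hbase shell (again assuming the shell accepts 
piped input).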
+ 
+ In the above, the compactions took no time.  This will likely be different 
for you if you have big tables.  To confirm that the major compaction 
completed, list the hbase rootdir in hdfs.  If major compaction succeeded, 
each store of each region on the filesystem should have only one mapfile.  
For example, below we list what's under the 'a' table directory under the 
hbase rootdir:
+ 
+ {{{/tmp/hbase-stack/hbase/a
+ /tmp/hbase-stack/hbase/a/1833721875
+ /tmp/hbase-stack/hbase/a/1833721875/a
+ /tmp/hbase-stack/hbase/a/1833721875/a/info
+ /tmp/hbase-stack/hbase/a/1833721875/a/info/8167759949199600085
+ /tmp/hbase-stack/hbase/a/1833721875/a/info/.8167759949199600085.crc
+ /tmp/hbase-stack/hbase/a/1833721875/a/mapfiles
+ /tmp/hbase-stack/hbase/a/1833721875/a/mapfiles/8167759949199600085
+ /tmp/hbase-stack/hbase/a/1833721875/a/mapfiles/8167759949199600085/data
+ /tmp/hbase-stack/hbase/a/1833721875/a/mapfiles/8167759949199600085/.data.crc
+ /tmp/hbase-stack/hbase/a/1833721875/a/mapfiles/8167759949199600085/.index.crc
+ /tmp/hbase-stack/hbase/a/1833721875/a/mapfiles/8167759949199600085/index}}}
+ 
+ There is one column family in this table named 'a' (unfortunately, the same 
name as the table).  The table has one region whose encoded name is 
1833721875.  Under the column family directory inside this region directory 
are the info directory (for metadata) and the mapfiles directory.  
There is only one mapfile in our case above, named 8167759949199600085 
(MapFiles are made of data and index files).
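+ 
+ Checking a large listing by eye is error-prone, so here is a minimal sketch 
that counts mapfiles per store.  It assumes the input is bare paths, one per 
line, in the layout shown above (.../<family>/mapfiles/<mapfile>); the function 
name check_one_mapfile_per_store is hypothetical.

```shell
# Hypothetical helper: read paths on stdin and report any store whose
# mapfiles directory holds more than one mapfile.
# Exit status is 0 if every store seen has exactly one mapfile, 1 otherwise.
# Stores with an empty mapfiles directory are not counted or reported.
check_one_mapfile_per_store() {
  awk -F/ '
    # A mapfile directory is the entry directly under ".../mapfiles".
    NF >= 2 && $(NF-1) == "mapfiles" && $NF != "" {
      # Drop the trailing "/<mapfile>" to get the store path.
      store = substr($0, 1, length($0) - length($NF) - 1)
      count[store]++
    }
    END {
      bad = 0
      for (s in count) {
        if (count[s] != 1) {
          printf "%s has %d mapfiles\n", s, count[s]
          bad = 1
        }
      }
      exit bad
    }
  '
}
```

+ Pipe a path listing of the hbase rootdir into the function; no output and a 
zero exit status mean every store it saw has a single mapfile.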
+ 
+ You cannot migrate unless all tables have been major compacted first.  Major 
compaction is necessary because the way deletes work changed in hbase 0.20.
+ 
+ -ROOT- and .META. flush frequently, which could mess up your nice and tidy 
single-file-per-store major-compacted hbase layout.  They won't flush if there 
have been no edits.  So make sure your cluster is not taking writes, and has 
not been for a good while, before starting the major compaction process.
+ 
+ TODO: Command-line tool to major compact an offline cluster.
+ 
+ ==== Migrating ====
  
  
  === From 0.1.x to 0.2.x or 0.18.x ===
