We've been working on a backup/restore program, Hbacker 
(https://github.com/rberger/hbacker), that uses the HBase Hadoop Export/Import 
jobs.

It's pretty tuned to our use case: we only ever append, never delete, which 
lets us do very simple incremental backups. It's still really rough and only 
appropriate for folks who might want to help us make it work. (Yeah, I've been 
talking about this for a long time, but I finally got some help to get it 
finished.)
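In case it's not obvious how append-only makes incrementals easy: since 
nothing is ever deleted, each incremental run only needs to export cells whose 
timestamps fall in the window since the previous run. A rough Java sketch of 
driving the stock Export job that way (table name, S3 path, and window bounds 
are made-up placeholders, and the optional version/time-range arguments may 
differ between HBase versions):

    import org.apache.hadoop.hbase.mapreduce.Export;

    public class IncrementalExport {
      public static void main(String[] unused) throws Exception {
        // Hypothetical window: everything written since the last backup run.
        long lastBackupMs = 1314900000000L;          // assumption: recorded by the previous run
        long nowMs = System.currentTimeMillis();
        // Same argument order as the CLI usage:
        //   Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
        Export.main(new String[] {
            "mytable",                                   // hypothetical table name
            "s3n://my-bucket/backups/mytable/" + nowMs,  // hypothetical S3 destination
            "1",                                         // max versions per cell
            String.valueOf(lastBackupMs),                // start of the window
            String.valueOf(nowMs)                        // end of the window
        });
      }
    }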

The Export phase writes the tables to s3n, which we thought would be an 
HBase-version-independent format. It also stores each table's column 
descriptors in a MySQL DB.
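The descriptor capture is conceptually just walking each table's column 
families and recording their attributes. A minimal sketch of the idea against 
the 0.90 client API (the table name is a placeholder, and the real tool writes 
these rows to MySQL rather than printing them):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DescriptorDump {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        HTableDescriptor htd = admin.getTableDescriptor(Bytes.toBytes("mytable")); // hypothetical
        for (HColumnDescriptor hcd : htd.getFamilies()) {
          // Record each family's name and attributes (COMPRESSION, BLOOMFILTER,
          // TTL, etc.); in the real tool these go into MySQL.
          System.out.println(hcd.getNameAsString() + " => " + hcd.toString());
        }
      }
    }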

The Import phase uses the column descriptors stored in MySQL to create the new 
table on the destination HBase cluster.
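The recreate step is conceptually the reverse: build an HTableDescriptor from 
the stored attributes and hand it to HBaseAdmin. A hypothetical sketch (family 
name and attribute values are placeholders, not our actual schema):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RecreateTable {
      public static void main(String[] args) throws Exception {
        HTableDescriptor htd = new HTableDescriptor("mytable");            // hypothetical table
        HColumnDescriptor hcd = new HColumnDescriptor(Bytes.toBytes("cf")); // hypothetical family
        // Replay the attributes captured in MySQL during the Export phase.
        hcd.setValue("COMPRESSION", "NONE");
        hcd.setValue("VERSIONS", "1");
        htd.addFamily(hcd);
        new HBaseAdmin(HBaseConfiguration.create()).createTable(htd);
      }
    }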

One of our goals is to be able to back up our now-ancient production HBase 
cluster running 0.20.3 and use that backup to populate a shiny new 0.90 
cluster. The Export is run on the 0.20.3 cluster and saves to S3; then the 
Import is run on the 0.90 HBase cluster to load the S3 files into HBase.

But in recent testing we hit a problem with the Import on 0.90 (CDH3 in 
pseudo-distributed mode for testing): the Import gets through 50% or so and 
then we get the following error on the CDH3 machine (error log also at 
https://gist.github.com/1189897). Do we need to make some changes to the HBase 
0.20.3 column descriptors before we use them to create the new table on 0.90? 
Or is it something completely different?

2011-09-02 19:47:43,605 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for furtive_production_frylock_merchant_consumer_summary_e52c36e1-7851-08c1-bbdf-3fc2a84a1cb6,,1314992665740.8307caeae1de46237c1b85f80a92a0d0., current region memstore size 64.9m
2011-09-02 19:47:43,606 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting, commencing flushing stores
2011-09-02 19:47:43,907 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=10.2.119.218,60020,1314992502715, load=(requests=11461, regions=3, usedHeap=86, maxHeap=998): Replay of HLog required. Forcing server shutdown
org.apache.hadoop.hbase.DroppedSnapshotException: region: furtive_production_frylock_merchant_consumer_summary_e52c36e1-7851-08c1-bbdf-3fc2a84a1cb6,,1314992665740.8307caeae1de46237c1b85f80a92a0d0.
       at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:995)
       at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:900)
       at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:852)
       at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:392)
       at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:366)
       at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:240)
Caused by: java.lang.IllegalArgumentException: No enum const class org.apache.hadoop.hbase.regionserver.StoreFile$BloomType.0
       at java.lang.Enum.valueOf(Enum.java:196)
       at org.apache.hadoop.hbase.regionserver.StoreFile$BloomType.valueOf(StoreFile.java:90)
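Reading the trace, our best guess is that 0.90 is trying to parse the 
BLOOMFILTER value it finds in the descriptor ("0", presumably a leftover from 
the 0.20.3 descriptor or from how we serialize it, back when the bloom filter 
setting was effectively a boolean) as a StoreFile.BloomType enum, which only 
knows NONE, ROW, and ROWCOL. If that's right, one workaround we're considering 
is sanitizing that attribute when we recreate the table; a hypothetical sketch:

    import org.apache.hadoop.hbase.HColumnDescriptor;

    public class BloomFilterFixup {
      // Hypothetical helper: map a 0.20.3-era BLOOMFILTER value ("0"/"false"/"true")
      // to something StoreFile.BloomType.valueOf() on 0.90 can actually parse.
      static void sanitizeBloomFilter(HColumnDescriptor hcd) {
        String v = hcd.getValue("BLOOMFILTER");
        if (v == null || !(v.equals("NONE") || v.equals("ROW") || v.equals("ROWCOL"))) {
          hcd.setValue("BLOOMFILTER", "NONE"); // assumption: old boolean-style values map to NONE
        }
      }
    }

Of course, if the real cause is something else entirely, we'd love to hear it.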

__________________
Robert J Berger - CTO
Runa Inc.
+1 408-838-8896
http://blog.ibd.com


