An update in case somebody else also stumbles on this issue. The problem is fixed by applying the patch from HBASE-8413: https://issues.apache.org/jira/browse/HBASE-8413
On Sun, May 17, 2015 at 12:53 PM, lars hofhansl <[email protected]> wrote:

> The latest version of 0.94 is 0.94.27. I doubt you'll get much help for
> 0.94.7 here (it's two years and 20! releases ago)
> Note that you can upgrade from 0.94.7 to 0.94.27 without down time (with a
> rolling upgrade), but you'll have to build it from source yourself.
>
> -- Lars
> From: Neutron sharc <[email protected]>
> To: [email protected]
> Sent: Friday, May 15, 2015 3:40 PM
> Subject: hbase 0.94.7 snapshot problem
>
> Hi HBase community,
>
> I'm seeing a problem with hbase snapshot with 0.94.7 (CDH 4.2.0)
>
> When I manually run "snapshot <table name>, <snapshot name>" to take a
> snapshot, I keep getting error about "Failed taking snapshot {
> ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH } due to
> exception:No region directory found for region {xyz...}".
>
> I tried move around the region at problem, but another region will see same
> issue the next time.
>
> I tried a workaround (setting hbase.regionserver.ipc.address to 0.0.0.0)
> suggested somewhere, but that doesn't work. (here is the link
> https://groups.google.com/a/cloudera.org/forum/#!topic/scm-users/B3fSsY6BgWI
> ).
>
> Below is an excerpt from master log:
>
> 2015-05-15 22:17:18,807 INFO
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Running
> SKIPFLUSH table snapshot ss_rich_pin_data_v1 C_M_SNAPSHOT_TABLE on table
> rich_pin_data_v1
> 2015-05-15 22:17:19,308 INFO org.apache.hadoop.hbase.procedure.Procedure:
> Starting procedure 'ss_rich_pin_data_v1'
> 2015-05-15 22:17:54,346 ERROR org.apache.hadoop.hbase.procedure.Procedure:
> Procedure 'ss_rich_pin_data_v1' execution failed!
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via
> timer-java.util.Timer@14004920
> :org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
> org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed!
> Source:Timeout caused Foreign Exception Start:1431728239316,
> End:1431728274317, diff:35001, max:35000 ms
> at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:85)
> at org.apache.hadoop.hbase.procedure.Procedure.waitForLatch(Procedure.java:369)
> at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:208)
> at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:68)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by:
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
> org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed!
> Source:Timeout caused Foreign Exception Start:1431728239316,
> End:1431728274317, diff:35001, max:35000 ms
> at org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:71)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> 2015-05-15 22:17:54,347 INFO
> org.apache.hadoop.hbase.procedure.ZKProcedureUtil: Clearing all znodes for
> procedure ss_rich_pin_data_v1including nodes
> /hbase/online-snapshot/acquired /hbase/online-snapshot/reached
> /hbase/online-snapshot/abort
> 2015-05-15 22:17:54,383 INFO
> org.apache.hadoop.hbase.master.snapshot.EnabledTableSnapshotHandler: Done
> waiting - snapshot for ss_rich_pin_data_v1 finished!
> 2015-05-15 22:17:54,841 ERROR
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Failed taking
> snapshot { ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH }
> due to exception:No region directory found for region:{NAME =>
> 'rich_pin_data_v1,,1389326617112.081c4e6d88c46ff9be61b231b8ed2aca.',
> STARTKEY => '', ENDKEY => '0030a5c15b50587297a8fa0bd585a12b', ENCODED =>
> 081c4e6d88c46ff9be61b231b8ed2aca,}
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: No region
> directory found for region:{NAME =>
> 'rich_pin_data_v1,,1389326617112.081c4e6d88c46ff9be61b231b8ed2aca.',
> STARTKEY => '', ENDKEY => '0030a5c15b50587297a8fa0bd585a12b', ENCODED =>
> 081c4e6d88c46ff9be61b231b8ed2aca,}
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegion(MasterSnapshotVerifier.java:167)
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:152)
> at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:115)
> at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:156)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 2015-05-15 22:17:54,841 INFO
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler: Stop taking
> snapshot={ ss=ss_rich_pin_data_v1 table=rich_pin_data_v1 type=SKIPFLUSH }
> because: Failed to take snapshot '{ ss=ss_rich_pin_data_v1
> table=rich_pin_data_v1 type=SKIPFLUSH }' due to exception
>
> Appreciate any help!
>
> -Neutronsharc
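
For anyone reading the quoted master log: the Start/End values in the TimeoutException message look like epoch milliseconds, and subtracting them reproduces the reported diff, one millisecond over the max. Below is a minimal Python sketch that pulls those fields out of such a line and checks the arithmetic. The names parse_timeout, PATTERN and LOG_LINE are made up for illustration and are not part of HBase; the regex only assumes the message format seen in this one log.

```python
import re

# Message text copied from the master log quoted above (joined onto one line).
LOG_LINE = (
    "org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! "
    "Source:Timeout caused Foreign Exception Start:1431728239316, "
    "End:1431728274317, diff:35001, max:35000 ms"
)

# Hypothetical pattern: assumes the "Start:…, End:…, diff:…, max:… ms" layout
# shown in this thread; other HBase versions may format it differently.
PATTERN = re.compile(r"Start:(\d+), End:(\d+), diff:(\d+), max:(\d+) ms")


def parse_timeout(line):
    """Return the timeout fields of a log line as a dict, or None if absent."""
    match = PATTERN.search(line)
    if match is None:
        return None
    start, end, diff, limit = (int(group) for group in match.groups())
    elapsed = end - start  # recomputed from the two timestamps
    return {
        "start_ms": start,
        "end_ms": end,
        "reported_diff_ms": diff,
        "max_ms": limit,
        "elapsed_ms": elapsed,
        "exceeded": elapsed > limit,
    }


info = parse_timeout(LOG_LINE)
print(info)
```

In this log the snapshot procedure ran into its time limit first, and the "No region directory found" verification error was logged right after it.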
