[
https://issues.apache.org/jira/browse/HBASE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131089#comment-14131089
]
Matteo Bertozzi commented on HBASE-11954:
-----------------------------------------
Please post questions like this on the user list.
The problem here is that the region flush/communication is taking longer than
60000 ms (60 seconds), which is the default timeout; the trace below reports
"diff:60000, max:60000 ms". You can raise the timeouts by setting the
"hbase.snapshot.master.timeoutMillis" and "hbase.snapshot.region.timeout"
properties.
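As a rough sketch, the two properties could go into hbase-site.xml like this.
The 300000 ms (5 minute) value is only an illustrative assumption, not a
recommendation; pick something that fits how long your flushes actually take.
I am also assuming the file is updated on both the master and the region
servers and that the services are restarted so the new values are picked up.
{code:xml}
<!-- hbase-site.xml fragment; the values are illustrative assumptions -->
<property>
  <!-- how long the master waits for the snapshot procedure, in milliseconds -->
  <name>hbase.snapshot.master.timeoutMillis</name>
  <value>300000</value>
</property>
<property>
  <!-- how long a region server waits on its part of the snapshot, in milliseconds -->
  <name>hbase.snapshot.region.timeout</name>
  <value>300000</value>
</property>
{code}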
> create snapshot error
> ---------------------
>
> Key: HBASE-11954
> URL: https://issues.apache.org/jira/browse/HBASE-11954
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 0.98.2
> Reporter: aaron.shan
>
> When I try to create a snapshot of a table, I get an exception like this:
> {code}
> hbase(main):004:0> snapshot 'booking', 'booking-snapshot-20140912'
> ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot {
> ss=booking-snapshot-20140912 table=booking type=FLUSH } had an error.
> Procedure booking-snapshot-20140912 {
> waiting=[hbase1.data.cn,60020,1407930968832,
> hbase45.data.cn,60020,1408609189376, hbase23.data.cn,60020,1407930978740,
> hbase37.data.cn,60020,1408608587411, hbase46.data.cn,60020,1408609190515,
> hbase6.data.cn,60020,1407930958926, hbase44.data.cn,60020,1408609188252,
> hbase7.data.cn,60020,1407930960021, hbase49.data.cn,60020,1408609193897,
> hbase47.data.cn,60020,1408609191647, hbase21.data.cn,60020,1407930976874,
> hbase39.data.cn,60020,1408608669063, hbase13.data.cn,60020,1407930966976,
> hbase15.data.cn,60020,1407930969235, hbase19.data.cn,60020,1407930973863,
> hbase16.data.cn,60020,1407930971152, hbase18.data.cn,60020,1407930972762,
> hbase43.data.cn,60020,1408609187126, hbase12.data.cn,60020,1407930966365,
> hbase10.data.cn,60020,1407930963512, hbase3.data.cn,60020,1407930955378,
> hbase11.data.cn,60020,1407930965112, hbase24.data.cn,60020,1407930979654,
> hbase2.data.cn,60020,1407930954308, hbase9.data.cn,60020,1407930962354,
> hbase38.data.cn,60020,1408608663894, hbase40.data.cn,60020,1408608674240,
> hbase41.data.cn,60020,1408609184867, hbase4.data.cn,60020,1407930956670,
> hbase36.data.cn,60020,1408608406292, hbase17.data.cn,60020,1407930972505,
> hbase35.data.cn,60020,1408607982898, hbase20.data.cn,60020,1407930974993,
> hbase48.data.cn,60020,1408609192763, hbase22.data.cn,60020,1407930978159,
> hbase8.data.cn,60020,1407930961333] done=[] }
> at
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:342)
> at
> org.apache.hadoop.hbase.master.HMaster.isSnapshotDone(HMaster.java:2905)
> at
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40494)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
> at
> org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by:
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via
> timer-java.util.Timer@69db0cb4:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
> org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed!
> Source:Timeout caused Foreign Exception Start:1410453067992,
> End:1410453127992, diff:60000, max:60000 ms
> at
> org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
> at
> org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:320)
> at
> org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:332)
> ... 10 more
> Caused by:
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
> org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed!
> Source:Timeout caused Foreign Exception Start:1410453067992,
> End:1410453127992, diff:60000, max:60000 ms
> at
> org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:70)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> {code}
> I searched Google for a solution, and someone suggested it may be caused by
> the flush snapshot attempting to take a region lock. See
> [HBASE-7703|https://issues.apache.org/jira/browse/HBASE-7703]. But this
> exception has different characteristics.
> After I flushed the table, the snapshot was created successfully.
> {code}
> hbase(main):005:0> flush 'booking'
> 0 row(s) in 4.5220 seconds
> hbase(main):006:0> snapshot 'booking', 'booking-snapshot-20140912'
> 0 row(s) in 4.1270 seconds
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)