Re: snapshot timeouts

Brian Jeltema Wed, 08 Oct 2014 12:28:19 -0700

Sorry, I usually include that info. HBase version is 0.98. hbase.rpc.timeout is 
the default.


When the ‘ERROR: Call id….’ occurred, there was no stack trace. That was the 
entire error output.

Before I increased the snapshot timeout parameters, the timeout I was seeing 
looked like:

ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { 
ss=Host-bdj table=Host type=FLUSH } had an error.  Procedure Host-bdj { 
waiting=[] done=[host-22.hdfs.foo.net,60020,1410543068459, 
host-24.hdfs.foo.net,60020,1412603246174, 
host-17.hdfs.foo.net,60020,1410543059186, 
host-19.hdfs.foo.net,60020,1412419924491, 
host-20.hdfs.foo.net,60020,1412419942143, 
host-16.hdfs.foo.net,60020,1403178964733, 
host-15.hdfs.foo.net,60020,1403178962029, 
host-21.hdfs.foo.net,60020,1403178959748, 
host-23.hdfs.foo.net,60020,1410543079248, 
host-18.hdfs.foo.net,60020,1410543061865] }
        at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:366)
        at org.apache.hadoop.hbase.master.HMaster.isSnapshotDone(HMaster.java:2993)
        at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38245)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
        at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.hbase.errorhandling.TimeoutException via timer-java.util.Timer@3097c4e1:org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms
        at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
        at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:318)
        at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:356)
        ... 10 more
Caused by: org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms
        at org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:67)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
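
As a quick sanity check on the numbers in that TimeoutException message, here is a minimal Python sketch that pulls out the Start, End, diff and max fields and confirms the snapshot ran for exactly the configured limit. The helper name parse_timeout is made up for illustration and is not part of HBase; the message text is copied from the trace above.

```python
import re

# Message text copied from the TimeoutException in the trace above.
MESSAGE = (
    "Timeout elapsed! Source:Timeout caused Foreign Exception "
    "Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms"
)


def parse_timeout(message):
    """Hypothetical helper: return the Start, End, diff and max fields as ints (ms)."""
    fields = dict(re.findall(r"(Start|End|diff|max):(\d+)", message))
    return {name: int(value) for name, value in fields.items()}


info = parse_timeout(MESSAGE)

# Elapsed time computed from the two epoch-millisecond timestamps.
elapsed_ms = info["End"] - info["Start"]

print("elapsed ms:", elapsed_ms)
print("limit ms:  ", info["max"])
print("hit limit: ", elapsed_ms >= info["max"])
```

The elapsed time equals the max value, so the procedure was cut off by the timer rather than failing for some other reason.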

On Oct 8, 2014, at 3:18 PM, Ted Yu <[email protected]> wrote:

> Can you give a bit more information :
> 
> the release of hbase you're using
> value for hbase.rpc.timeout (looks like you leave it @ default)
> more of the error (please include stack trace if possible)
> 
> Cheers
> 
> On Wed, Oct 8, 2014 at 12:09 PM, Brian Jeltema <
> [email protected]> wrote:
> 
>> I’m trying to snapshot a moderately large table (3 billion rows, but not a
>> huge amount of data per row).
>> Those snapshots have been timing out, so I set the following parameters to
>> relatively large values:
>> 
>>     hbase.snapshot.master.timeoutMillis
>>     hbase.snapshot.region.timeout
>>     hbase.snapshot.master.timeout.millis
>> 
>> A snapshot attempt then resulted in the terse result:
>> 
>>     ERROR: Call id=13, waitTime=60060, rpcTimeout=60000
>> 
>> A brief review of some of the hbase log files didn’t reveal anything (but
>> there are many).
>> How should I pursue getting these snapshots to work?
>> 
>> Brian
