Sorry, I usually include that info. The HBase version is 0.98, and
hbase.rpc.timeout is left at the default.
When the ‘ERROR: Call id….’ occurred, there was no stack trace. That was the
entire error output.
Before I increased the snapshot timeout parameters, the timeout error I was
seeing looked like this:
ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { ss=Host-bdj table=Host type=FLUSH } had an error. Procedure Host-bdj {
waiting=[] done=[host-22.hdfs.foo.net,60020,1410543068459,
host-24.hdfs.foo.net,60020,1412603246174,
host-17.hdfs.foo.net,60020,1410543059186,
host-19.hdfs.foo.net,60020,1412419924491,
host-20.hdfs.foo.net,60020,1412419942143,
host-16.hdfs.foo.net,60020,1403178964733,
host-15.hdfs.foo.net,60020,1403178962029,
host-21.hdfs.foo.net,60020,1403178959748,
host-23.hdfs.foo.net,60020,1410543079248,
host-18.hdfs.foo.net,60020,1410543061865] }
    at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:366)
    at org.apache.hadoop.hbase.master.HMaster.isSnapshotDone(HMaster.java:2993)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38245)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2008)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
    at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.hadoop.hbase.errorhandling.TimeoutException via timer-java.util.Timer@3097c4e1:org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms
    at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
    at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:318)
    at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:356)
    ... 10 more
Caused by: org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms
    at org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:67)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)
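
For what it's worth, the numbers in that last exception can be checked with a
small Python sketch. parse_timeout is a hypothetical helper of my own, not
anything that ships with HBase. It only confirms that the snapshot was given up
on after exactly the 60-second limit reported in the message; it does not tell
us which setting that limit comes from.

```python
import re

def parse_timeout(message):
    """Pull the Start/End/diff/max fields (all in ms) out of the exception text.

    Hypothetical helper for reading the log line above; not an HBase API.
    """
    fields = dict(re.findall(r"(Start|End|diff|max):(\d+)", message))
    return {name: int(value) for name, value in fields.items()}

# Message text copied from the TimeoutException in the stack trace above.
msg = ("Timeout elapsed! Source:Timeout caused Foreign Exception "
       "Start:1412792382137, End:1412792442137, diff:60000, max:60000 ms")

t = parse_timeout(msg)
elapsed_ms = t["End"] - t["Start"]

# True when the wait ran all the way up to the configured maximum.
hit_limit = elapsed_ms >= t["max"]
print(elapsed_ms, t["max"], hit_limit)
```

The elapsed time and the maximum agree, so this looks like a hard limit being
reached rather than a failure partway through.
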
On Oct 8, 2014, at 3:18 PM, Ted Yu <[email protected]> wrote:
> Can you give a bit more information :
>
> the release of hbase you're using
> value for hbase.rpc.timeout (looks like you leave it @ default)
> more of the error (please include stack trace if possible)
>
> Cheers
>
> On Wed, Oct 8, 2014 at 12:09 PM, Brian Jeltema <
> [email protected]> wrote:
>
>> I’m trying to snapshot a moderately large table (3 billion rows, but not a
>> huge amount of data per row).
>> Those snapshots have been timing out, so I set the following parameters to
>> relatively large values:
>>
>> hbase.snapshot.master.timeoutMillis
>> hbase.snapshot.region.timeout
>> hbase.snapshot.master.timeout.millis
>>
>> A snapshot attempt then resulted in the terse result:
>>
>> ERROR: Call id=13, waitTime=60060, rpcTimeout=60000
>>
>> A brief review of some of the hbase log files didn’t reveal anything (but
>> there are many).
>> How should I pursue getting these snapshots to work?
>>
>> Brian