Re: Re: Strange HBase rpc operation timeout error

Billy Liu Sun, 17 Dec 2017 01:22:56 -0800

Actually, your questions involve two separate HBase timeouts. One concerns the
cube build, the other concerns metadata access.
For the first issue, please check this article:
http://kylin.apache.org/docs21/install/kylin_aws_emr.html  It describes
how to increase the HBase RPC timeout.
For the second issue, as explained in the previous discussion, we should keep
the fixed values.
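
A side note on the second issue: in the buildMetadataUrl snippet quoted
below, the fixed values are put into the map first and
url.getAllParameters() is applied afterwards, so a parameter carried on the
metadata URL would replace the default with the same key. Here is a minimal,
self-contained sketch of just that merge order. The class name
MetadataTimeoutDefaults, the helper mergeParams, and the 60000 override are
made up for illustration and are not Kylin code. Whether your Kylin version
accepts such parameters on the metadata URL is an assumption you would need
to verify.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataTimeoutDefaults {

    // Hypothetical helper: reproduces only the merge order of the quoted
    // buildMetadataUrl (fixed defaults first, URL parameters second).
    static Map<String, String> mergeParams(Map<String, String> urlParams) {
        Map<String, String> merged = new LinkedHashMap<>();
        merged.put("hbase.client.scanner.timeout.period", "10000");
        merged.put("hbase.rpc.timeout", "5000");
        merged.put("hbase.client.retries.number", "1");
        // putAll replaces the value of any key that is already present,
        // so URL parameters take precedence over the defaults above.
        merged.putAll(urlParams);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> urlParams = new LinkedHashMap<>();
        // Assumed override, chosen to match the 60000 set on the HBase side.
        urlParams.put("hbase.rpc.timeout", "60000");
        System.out.println(mergeParams(urlParams));
    }
}
```

With no URL parameters the three defaults apply unchanged, which would match
the operationTimeout=5000 in the reported exception.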


2017-12-17 10:37 GMT+08:00 jxs <[email protected]>:

> Hi Billy,
> Thank you for pointing me to the previous discussion. For now, though, we
> are running a very small HBase cluster to keep costs low; it has only one
> slave node. So an unsteady response time (within a range that is not too
> bad, e.g. within 1 minute) is acceptable to us.
> The previous timeout error interrupted the cube building procedure, and
> we don't want that.
> What is your suggestion for this use case?
>
>
>
> On 2017-12-16 at 11:48, "Billy Liu"<[email protected]> wrote:
>
>
> Check this: http://apache-kylin.74782.x6.nabble.com/hbase-configed-with-fixed-value-td9241.html
>
> 2017-12-15 18:03 GMT+08:00 jxs <[email protected]>:
>
>> Hi,
>>
>> Finally, I found this in
>> org.apache.kylin.storage.hbase.HBaseResourceStore:
>>
>> ```
>>     private StorageURL buildMetadataUrl(KylinConfig kylinConfig) throws IOException {
>>         StorageURL url = kylinConfig.getMetadataUrl();
>>         if (!url.getScheme().equals("hbase"))
>>             throw new IOException("Cannot create HBaseResourceStore. Url not match. Url: " + url);
>>
>>         // control timeout for prompt error report
>>         Map<String, String> newParams = new LinkedHashMap<>();
>>         newParams.put("hbase.client.scanner.timeout.period", "10000");
>>         newParams.put("hbase.rpc.timeout", "5000");
>>         newParams.put("hbase.client.retries.number", "1");
>>         newParams.putAll(url.getAllParameters());
>>
>>         return url.copy(newParams);
>>     }
>> ```
>> Is this related to the timeout error? Why are these params hard-coded
>> instead of being read from configuration? Is there any workaround for
>> this timeout error?
>>
>>
>> On 2017-12-15 at 16:03, "jxs"<[email protected]> wrote:
>>
>>
>> Hi, kylin users,
>>
>> I encountered a strange timeout error today when building a cube.
>>
>> By "strange", I mean the "hbase.rpc.timeout" configuration is set to
>> 60000 in hbase, but I get "org.apache.hadoop.hbase.ipc.CallTimeoutException:
>> Call id=8099904, waitTime=5001, operationTimeout=5000 expired" errors.
>>
>> We run Kylin 2.2.0 on EMR. It ran without error for about half a month,
>> then suddenly stopped working. The cube currently being built is not the
>> biggest one.
>> I am wondering where I should look; any help is appreciated.
>>
>> The traceback from log:
>>
>> ```
>> 2017-12-15 06:46:57,892 ERROR [Scheduler 2090031901 Job c9067736-eac7-48ad-88f3-dbd6f4e870ae-167] execution.ExecutableManager:149 : fail to get job output:c9067736-eac7-48ad-88f3-dbd6f4e870ae-14
>> org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
>> Fri Dec 15 14:46:57 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1513320412890, pause=100, retries=1}, java.io.IOException: Call to ip-172-31-5-71.cn-north-1.compute.internal/172.31.5.71:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=8099904, waitTime=5001, operationTimeout=5000 expired.
>>
>>         at org.apache.kylin.job.dao.ExecutableDao.getJobOutput(ExecutableDao.java:202)
>>         at org.apache.kylin.job.execution.ExecutableManager.getOutput(ExecutableManager.java:145)
>>         at org.apache.kylin.job.execution.AbstractExecutable.getOutput(AbstractExecutable.java:312)
>>         at org.apache.kylin.job.execution.AbstractExecutable.isDiscarded(AbstractExecutable.java:392)
>>         at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:149)
>>         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
>>         at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
>>         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
>>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:144)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
>> Fri Dec 15 14:46:57 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1513320412890, pause=100, retries=1}, java.io.IOException: Call to ip-172-31-5-71.cn-north-1.compute.internal/172.31.5.71:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=8099904, waitTime=5001, operationTimeout=5000 expired.
>> ```
>>
>>
>>
>
