Oh, I forgot to mention that you need to set a larger timeout for HBase.

An HBase bulk load is a move/rename operation on HDFS, but on S3 it is a copy.
When the cube is huge, this step may take much longer than it does on HDFS,
and HBase may report a timeout error in this case.

On our side, we set "hbase.rpc.timeout=3600000" (one hour, in milliseconds)
when starting the EMR cluster. I learned this from community user Alexander:

http://apache-kylin.74782.x6.nabble.com/Re-HFile-is-empty-if-kylin-hbase-cluster-fs-is-set-to-s3-tc8983.html#none
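
For reference, here is a minimal sketch of one way to pass that setting at cluster start. It assumes the EMR "hbase-site" configuration classification is used. The function name build_emr_configurations is only illustrative, and this is not necessarily how our own cluster was configured:

```python
import json

# One hour expressed in milliseconds, matching the value quoted above.
HBASE_RPC_TIMEOUT_MS = 60 * 60 * 1000


def build_emr_configurations(timeout_ms=HBASE_RPC_TIMEOUT_MS):
    """Return an EMR configuration list that overrides hbase.rpc.timeout.

    Assumption: the override is applied through the "hbase-site"
    classification; the function name is hypothetical.
    """
    return [
        {
            "Classification": "hbase-site",
            # EMR expects property values as strings.
            "Properties": {"hbase.rpc.timeout": str(timeout_ms)},
        }
    ]


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file of your choice.
    print(json.dumps(build_emr_configurations(), indent=2))
```

The printed JSON can be saved to a file and supplied through the --configurations option of "aws emr create-cluster", or pasted into the software settings box in the EMR console.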



2017-11-16 10:25 GMT+08:00 jxs <[email protected]>:

> Hi, ShaoFeng,
>
> Following the "Install Kylin on AWS EMR" doc and the previous discussion
> between you and Roberto Tardío,
> I set up a cluster using S3 as the HBase storage and Kylin working directory.
>
> Yesterday morning, I built a small cube without error, but the build
> job from 70M million records
> with 8 dimensions failed at the step "Load HFile to HBase Table" at about
> 08:00 GMT+8; the job was launched in the afternoon.
> From the Ganglia monitor, I can see the cluster is still busy with network
> I/O, at 30-35 MBytes/sec in and out.
>
> The log says "Thu Nov 16 07:37:31 GMT+08:00 2017, 
> RpcRetryingCaller{globalStartTime=1510788493020,
> pause=100, retries=35}, org.apache.hadoop.hbase.RegionTooBusyException:
> org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in
> 60000 ms. regionName=KYLIN_FGVR6DWDPO,\x00\x09,1510751512681.
> 61ac967159fb48cde19f82f14c615de0., server=ip-172-31-7-10.cn-
> north-1.compute.internal,16020,1510712998368".
>
> Can you help point out where the problem may be? Is it the cluster capacity,
> the S3 speed, something in Kylin, or something else?
> What can I do to make this work?
>
> Below is the cluster's network load and last one hour cpu stacked load.
>
> Best regards.


-- 
Best regards,

Shaofeng Shi 史少锋
