I have updated the document, please refresh: https://kylin.apache.org/docs21/install/kylin_aws_emr.html

I'm not sure whether it addresses your case, but it is worth a try. (You can re-create the EMR cluster with the same bucket, setting the configuration when starting it.)
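For illustration, here is a minimal sketch of how that configuration could be generated for cluster startup. It assumes the standard EMR configuration format (a list of "Classification"/"Properties" objects) and the HBase-on-S3 settings described in the EMR documentation. The bucket name "my-kylin-bucket" and the "/hbase" root path are hypothetical placeholders; the timeout value is the one mentioned in the quoted message below.

```python
import json

# Timeout from the quoted message: 3600000 ms (one hour).
HBASE_RPC_TIMEOUT_MS = 3600000


def build_emr_configurations(bucket, rpc_timeout_ms=HBASE_RPC_TIMEOUT_MS):
    """Build EMR configuration objects for HBase on S3 with a long RPC timeout.

    `bucket` and the "/hbase" suffix are illustrative assumptions;
    substitute the bucket and path your cluster actually uses.
    """
    return [
        {
            "Classification": "hbase-site",
            "Properties": {
                # EMR expects property values as strings.
                "hbase.rootdir": f"s3://{bucket}/hbase",
                "hbase.rpc.timeout": str(rpc_timeout_ms),
            },
        },
        {
            "Classification": "hbase",
            "Properties": {"hbase.emr.storageMode": "s3"},
        },
    ]


if __name__ == "__main__":
    # Hypothetical bucket name, for demonstration only.
    print(json.dumps(build_emr_configurations("my-kylin-bucket"), indent=2))
```

The printed JSON can be saved to a file and supplied when creating the cluster, for example through the `--configurations` option of `aws emr create-cluster`. Whether one hour is the right timeout depends on the cube size and on how long the S3 copy takes.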
2017-11-16 11:07 GMT+08:00 ShaoFeng Shi <[email protected]>:

> Oh, I forgot to mention that you need to set a larger timeout for HBase.
>
> The HBase bulk load operation is a move/rename operation; on S3 it is a copy.
> When the cube is huge, this step may take much longer than on HDFS.
> HBase may report a timeout error in this case.
>
> On our side, we set "hbase.rpc.timeout=3600000" when starting the EMR
> cluster. I learned this from community user Alexander:
>
> http://apache-kylin.74782.x6.nabble.com/Re-HFile-is-empty-if-kylin-hbase-cluster-fs-is-set-to-s3-tc8983.html#none
>
> 2017-11-16 10:25 GMT+08:00 jxs <[email protected]>:
>
>> Hi, ShaoFeng,
>>
>> Following the "Install Kylin on AWS EMR" doc and the previous discussion
>> between you and Roberto Tardío,
>> I set up a cluster using S3 as HBase storage and Kylin working directory.
>>
>> Yesterday morning I built a small cube without error, but the build
>> job from 70M million records
>> with 8 dimensions failed at the step "Load HFile to HBase Table" at about
>> 08:00 GMT+8; the job was launched in the afternoon.
>> From the Ganglia monitor, I can see the cluster is still busy with network
>> IO at 30-35 MBytes / sec in and out.
>>
>> The log says "Thu Nov 16 07:37:31 GMT+08:00 2017,
>> RpcRetryingCaller{globalStartTime=1510788493020, pause=100, retries=35},
>> org.apache.hadoop.hbase.RegionTooBusyException:
>> org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in
>> 60000 ms.
>> regionName=KYLIN_FGVR6DWDPO,\x00\x09,1510751512681.61ac967159fb48cde19f82f14c615de0.,
>> server=ip-172-31-7-10.cn-north-1.compute.internal,16020,1510712998368".
>>
>> Can you help point out where the problem may be: the cluster capacity,
>> S3 speed, Kylin, or something else?
>> What can I do to make this work?
>>
>> Below is the cluster's network load and last one hour CPU stacked load.
>>
>> Best regards.
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋

--
Best regards,

Shaofeng Shi 史少锋
