Hi Alexander,
I encountered a problem when using HDFS for cube building and S3 for HBase
on EMR. In the "Load HFile to HBase Table" step, Kylin failed with a
timeout error:
Thu Sep 07 15:33:27 GMT+08:00 2017,
RpcRetryingCaller{globalStartTime=1504769048975, pause=100, retries=35},
java.io.IOException: Call to ip-10-0-0-28.ec2.internal/10.0.0.28:16020
failed on local exception:
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=41,
waitTime=60001, operationTimeout=60000
In the HBase region server log, I saw that HBase uploads the HFile to S3.
Since the cube is fairly big (13 GB), this takes much longer than usual, and
the Kylin client closed the connection because it considered the call timed
out:
2017-09-07 08:01:12,275 INFO
[RpcServer.FifoWFPBQ.default.handler=16,queue=1,port=16020]
regionserver.HRegionFileSystem: Bulk-load file
hdfs://ip-10-0-0-118.ec2.internal:8020/kylin/kylin_default_instance/kylin-cdcb5f57-2ea9-47d9-85db-7a6c7490cc55/test/hfile/F1/a897b4d33ed648e6a5d0bfb05cffdfd6
is on different filesystem than the destination store. Copying file over to
destination filesystem.
2017-09-07 08:01:23,919 INFO
[RpcServer.FifoWFPBQ.default.handler=22,queue=1,port=16020]
s3.MultipartUploadManager: completed multipart upload of 8 parts 965420145
bytes
2017-09-07 08:26:33,838 WARN
[RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow):
{"call":"BulkLoadHFile(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFileRequest)","starttimems":1504770958916,"responsesize":2,"method":"BulkLoadHFile","param":"TODO:
class
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$BulkLoadHFileRequest","processingtimems":1834922,"client":"
10.0.0.243:49152","queuetimems":0,"class":"HRegionServer"}
2017-09-07 08:26:33,838 WARN
[RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020] ipc.RpcServer:
RpcServer.FifoWFPBQ.default.handler=20,queue=2,port=16020: caught a
ClosedChannelException, this means that the server /10.0.0.28:16020 was
processing a request but the client went away. The error message was: null
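To put numbers on the mismatch, here is a rough sketch that pulls the two
durations out of the messages above and compares them. The two strings are
fragments copied from the logs in this mail; the helper name extract_ms is
only illustrative and not part of Kylin or HBase.

```python
import re

# Fragments copied from the client error and the region server warning above.
client_error = (
    "org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=41, "
    "waitTime=60001, operationTimeout=60000"
)
server_warning = '"processingtimems":1834922,"client":"10.0.0.243:49152","queuetimems":0'


def extract_ms(pattern: str, text: str) -> int:
    """Return the first captured integer (milliseconds) for `pattern` in `text`."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern {pattern!r} not found")
    return int(match.group(1))


operation_timeout_ms = extract_ms(r"operationTimeout=(\d+)", client_error)
processing_ms = extract_ms(r'"processingtimems":(\d+)', server_warning)

# How many times longer the server worked than the client was willing to wait.
ratio = processing_ms / operation_timeout_ms

print(f"client gave up after {operation_timeout_ms / 1000:.0f} s")
print(f"server spent {processing_ms / 60000:.1f} min on BulkLoadHFile")
print(f"ratio: {ratio:.1f}x")
```

So the region server spent roughly half an hour on the call, while the
client stopped waiting after one minute.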
So I wonder how you worked around this problem: did you set a very large
timeout value for HBase, or is your cube not that big? Thanks.
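For concreteness, this is the kind of override I have in mind. It is only a
sketch: I am assuming the relevant setting is hbase.rpc.timeout (a standard
HBase property, in milliseconds), and the 3-minute value is just the figure
you mentioned earlier in this thread. Whether it has to be set on the Kylin
(client) side, on the region servers, or both is not something the logs
above settle. The helper name hbase_property is illustrative.

```python
import xml.etree.ElementTree as ET


def hbase_property(name: str, value_ms: int) -> str:
    """Render one <property> element in the hbase-site.xml style."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value_ms)
    return ET.tostring(prop, encoding="unicode")


# Assumption: 3 minutes, expressed in milliseconds.
three_minutes_ms = 3 * 60 * 1000
snippet = hbase_property("hbase.rpc.timeout", three_minutes_ms)
print(snippet)
```

The printed element would go inside the <configuration> block of
hbase-site.xml.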
2017-08-14 14:19 GMT+08:00 Alexander Sterligov <[email protected]>:
> Here is the ticket for the hfile on s3 issue:
> https://issues.apache.org/jira/browse/KYLIN-2788
>
> On Mon, Aug 14, 2017 at 9:17 AM, Alexander Sterligov <[email protected]>
> wrote:
>
>> I forgot there was one more issue with s3:
>> https://issues.apache.org/jira/browse/KYLIN-2740.
>>
>> Global dictionary in 2.0 doesn't work out of the box. I patched Kylin as
>> described in the ticket.
>>
>> On Sun, Aug 13, 2017 at 4:24 AM, ShaoFeng Shi <[email protected]>
>> wrote:
>>
>>> Nice. The issue with writing HFiles to S3 needs more investigation.
>>> Please open a Kylin JIRA to track it. We will update it there if we
>>> find anything.
>>>
>>> 2017-08-12 23:52 GMT+08:00 Alexander Sterligov <[email protected]>:
>>>
>>>> Query performance is pretty much the same as on the Kylin slides. I have a
>>>> high bucket cache hit ratio (>90%), so data is almost always read from
>>>> local disk. For some other use cases it might be different.
>>>>
>>>> On Aug 12, 2017, at 17:59, "ShaoFeng Shi" <
>>>> [email protected]> wrote:
>>>>
>>>> Cool; how about the query performance with data on s3?
>>>>
>>>> 2017-08-11 23:27 GMT+08:00 Alexander Sterligov <[email protected]>:
>>>>
>>>>> Yes, that's the only one for now.
>>>>>
>>>>> On Fri, Aug 11, 2017 at 6:23 PM, ShaoFeng Shi <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> No need to add them, I think, because I see they are already in the
>>>>>> configuration of that step.
>>>>>>
>>>>>> Is this the only issue you see with Kylin on EMR+S3?
>>>>>>
>>>>>> [image: inline image 1]
>>>>>>
>>>>>> 2017-08-11 20:26 GMT+08:00 Alexander Sterligov <[email protected]>:
>>>>>>
>>>>>>> What if we add the direct output settings to kylin_job_conf.xml
>>>>>>> and kylin_job_conf_inmem.xml?
>>>>>>>
>>>>>>> hbase.zookeeper.quorum, for example, doesn't work unless it is specified
>>>>>>> in these configs.
>>>>>>>
>>>>>>> On Fri, Aug 11, 2017 at 3:13 PM, ShaoFeng Shi <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> EMR enables direct output in mapred-site.xml, but in this step these
>>>>>>>> settings don't seem to take effect (although the job's configuration
>>>>>>>> shows they are there). I disabled direct output, but the behavior did
>>>>>>>> not change. I searched around but found nothing. I need to drop the EMR
>>>>>>>> cluster now and may get back to it later.
>>>>>>>>
>>>>>>>> If you have any ideas or findings, please share them. We'd like Kylin
>>>>>>>> to have better support for the cloud.
>>>>>>>>
>>>>>>>> Thanks for your feedback!
>>>>>>>>
>>>>>>>> 2017-08-11 19:19 GMT+08:00 Alexander Sterligov <[email protected]
>>>>>>>> >:
>>>>>>>>
>>>>>>>>> Any ideas how to fix that?
>>>>>>>>>
>>>>>>>>> On Fri, Aug 11, 2017 at 2:16 PM, ShaoFeng Shi <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I got the same problem as you:
>>>>>>>>>>
>>>>>>>>>> 2017-08-11 08:44:16,342 WARN [Job
>>>>>>>>>> 2c86b4b6-7639-4a97-ba63-63c9dca095f6-2255]
>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did
>>>>>>>>>> not find any files to load in directory
>>>>>>>>>> s3://privatekeybucket-anac5h41523l/kylin/kylin_default_instance/kylin-2c86b4b6-7639-4a97-ba63-63c9dca095f6/kylin_sales_cube_clone3/hfile.
>>>>>>>>>> Does it contain files in subdirectories that correspond to column
>>>>>>>>>> family names?
>>>>>>>>>>
>>>>>>>>>> In the S3 view, I see that the files exist in the "_temporary" folder;
>>>>>>>>>> it seems they were not moved to the target folder on completion. EMR
>>>>>>>>>> apparently tries to write directly to the output path, but does not
>>>>>>>>>> actually do so.
>>>>>>>>>>
>>>>>>>>>> 2017-08-11 16:34 GMT+08:00 Alexander Sterligov <
>>>>>>>>>> [email protected]>:
>>>>>>>>>>
>>>>>>>>>>> No, defaultFs is hdfs.
>>>>>>>>>>>
>>>>>>>>>>> I’ve seen such behavior when I set the working dir to s3 but didn’t
>>>>>>>>>>> set cluster-fs at all. Maybe you have a typo in the name of the
>>>>>>>>>>> property. I used the old one, "kylin.hbase.cluster.fs".
>>>>>>>>>>>
>>>>>>>>>>> When both working-dir and cluster-fs were set to s3, I got the
>>>>>>>>>>> _temporary dir of the convert job on s3, but no hfiles. I also saw
>>>>>>>>>>> the correct output path for the job in the log. I didn’t check
>>>>>>>>>>> whether the job creates temporary files in s3 and then copies the
>>>>>>>>>>> results to hdfs, but I doubt that happens.
>>>>>>>>>>>
>>>>>>>>>>> Do you see proper arguments for the step in the log?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Aug 11, 2017, at 11:17, ShaoFeng Shi <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Alexander,
>>>>>>>>>>>
>>>>>>>>>>> That makes sense. Using S3 for cube build and storage is
>>>>>>>>>>> required for a cloud Hadoop environment.
>>>>>>>>>>>
>>>>>>>>>>> I tried to reproduce this problem. I created an EMR cluster with S3
>>>>>>>>>>> as HBase storage, and in kylin.properties I set
>>>>>>>>>>> "kylin.env.hdfs-working-dir" and "kylin.storage.hbase.cluster-fs"
>>>>>>>>>>> to the S3 bucket. But in the "Convert Cuboid Data to HFile" step,
>>>>>>>>>>> Kylin still writes to local HDFS. Did you modify core-site.xml to
>>>>>>>>>>> make S3 the default FS?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2017-08-10 22:53 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>> [email protected]>:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, I worked around the problem that way and it works.
>>>>>>>>>>>>
>>>>>>>>>>>> One problem with this solution is that I have to use a pretty large
>>>>>>>>>>>> hdfs, and it's expensive. I also have to garbage collect it
>>>>>>>>>>>> manually, because the data is not moved to s3 but copied. The Kylin
>>>>>>>>>>>> cleanup job doesn't work for it, because the main metadata folder
>>>>>>>>>>>> is on s3. So it would be really nice to put everything on s3.
>>>>>>>>>>>>
>>>>>>>>>>>> Another problem is that I had to raise the hbase rpc timeout,
>>>>>>>>>>>> because bulk loading from hdfs takes a long time. That was not
>>>>>>>>>>>> trivial. 3 minutes works well, but with the drawback that queries
>>>>>>>>>>>> or metadata writes hang for 3 minutes if something bad happens.
>>>>>>>>>>>> But that's a rare event.
>>>>>>>>>>>>
>>>>>>>>>>>> On Aug 10, 2017, at 17:42, "ShaoFeng Shi" <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> How about leaving "kylin.hbase.cluster.fs" empty? This
>>>>>>>>>>>>> property is for two-cluster deployments (one Hadoop cluster for
>>>>>>>>>>>>> cube build, the other for query).
>>>>>>>>>>>>>
>>>>>>>>>>>>> When it is empty, the HFile will be written to the default FS
>>>>>>>>>>>>> (HDFS in EMR) and then loaded into HBase. I'm not sure whether
>>>>>>>>>>>>> EMR HBase (using S3 as storage) can bulk load files from HDFS or
>>>>>>>>>>>>> not. If it can, that would be great, as the write performance of
>>>>>>>>>>>>> HDFS would be better than S3.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-08-10 22:29 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>> [email protected]>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also thought about that, but no, it's not a consistency problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Consistent view is enabled. I use the same s3 for my own
>>>>>>>>>>>>>> map-reduce jobs and it's ok.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also checked whether it lost consistency (emrfs diff). No
>>>>>>>>>>>>>> problems.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When s3 is inconsistent, files disappear right after they are
>>>>>>>>>>>>>> written and reappear some time later. The hfiles didn't appear
>>>>>>>>>>>>>> after a day, but _temporary is there.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's 100% reproducible. I think I'll investigate this problem
>>>>>>>>>>>>>> by running the conversion job manually.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 10, 2017, at 17:18, "ShaoFeng Shi" <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Did you enable the Consistent View? This article explains the
>>>>>>>>>>>>>>> challenges of using S3 directly for ETL processes:
>>>>>>>>>>>>>>> https://aws.amazon.com/cn/blogs/big-data/ensuring-consistency-when-using-amazon-s3-and-amazon-elastic-mapreduce-for-etl-workflows/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2017-08-09 18:19 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>> [email protected]>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, it's empty. Also I see this message in the log:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2017-08-09 09:02:35,947 WARN [Job
>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:234 : Skipping
>>>>>>>>>>>>>>>> non-directory
>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_SUCCESS
>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,009 WARN [Job
>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:252 : Skipping non-file
>>>>>>>>>>>>>>>> FileStatusExt{path=s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_temporary/1;
>>>>>>>>>>>>>>>> isDirectory=true; modification_time=0; access_time=0; owner=;
>>>>>>>>>>>>>>>> group=; permission=rwxrwxrwx; isSymlink=false}
>>>>>>>>>>>>>>>> 2017-08-09 09:02:36,014 WARN [Job
>>>>>>>>>>>>>>>> 1e436685-7102-4621-a4cb-6472b866126d-7608]
>>>>>>>>>>>>>>>> mapreduce.LoadIncrementalHFiles:422 : Bulk load operation
>>>>>>>>>>>>>>>> did not find any files to load in directory
>>>>>>>>>>>>>>>> s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile.
>>>>>>>>>>>>>>>> Does it contain files in subdirectories that correspond to
>>>>>>>>>>>>>>>> column family names?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Aug 9, 2017 at 1:15 PM, ShaoFeng Shi <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The HFile will be moved to the HBase data folder when the bulk
>>>>>>>>>>>>>>>>> load finishes. Did you check whether the HTable has data?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-08-09 17:54 GMT+08:00 Alexander Sterligov <
>>>>>>>>>>>>>>>>> [email protected]>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I set kylin.hbase.cluster.fs to the s3 bucket where hbase
>>>>>>>>>>>>>>>>>> lives.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The "Convert Cuboid Data to HFile" step finished without
>>>>>>>>>>>>>>>>>> errors. Statistics at the end of the job said that it had
>>>>>>>>>>>>>>>>>> written lots of data to s3.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> But there are no hfiles in the kylin_metadata folder
>>>>>>>>>>>>>>>>>> (kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/<table
>>>>>>>>>>>>>>>>>> name>/hfile), only a _temporary folder and a _SUCCESS file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _temporary contains hfiles inside attempt folders. It looks
>>>>>>>>>>>>>>>>>> like they were not copied from _temporary to the result dir.
>>>>>>>>>>>>>>>>>> But there are no errors in either the kylin log or the
>>>>>>>>>>>>>>>>>> reducers' logs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Loading the empty hfiles then produces empty segments.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is that a bug, or am I doing something wrong?
--
Best regards,
Shaofeng Shi 史少锋