Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-11 Thread Alexander Sterligov
What if we add the direct output settings to kylin_job_conf.xml and kylin_job_conf_inmem.xml? hbase.zookeeper.quorum, for example, doesn't work unless it is specified in these configs. On Fri, Aug 11, 2017 at 3:13 PM, ShaoFeng Shi wrote: > EMR enables the direct output in

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-11 Thread ShaoFeng Shi
EMR enables the direct output in mapred-site.xml, but in this step it seems these settings don't take effect (although the job's configuration shows they are there). I disabled the direct output, but the behavior did not change. I searched around but found nothing. I need to drop the EMR now, and may get back

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-11 Thread Alexander Sterligov
Any ideas how to fix that? On Fri, Aug 11, 2017 at 2:16 PM, ShaoFeng Shi wrote: > I got the same problem as you: > > 2017-08-11 08:44:16,342 WARN [Job 2c86b4b6-7639-4a97-ba63-63c9dca095f6-2255] > mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did not find >

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-11 Thread ShaoFeng Shi
I got the same problem as you: 2017-08-11 08:44:16,342 WARN [Job 2c86b4b6-7639-4a97-ba63-63c9dca095f6-2255] mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did not find any files to load in directory
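
The two LoadIncrementalHFiles warnings quoted in this thread ("Bulk load operation did not find any files to load" and "Skipping non-directory") are the visible symptom of the empty HFile output. A minimal sketch for scanning a Kylin log for them follows. The line layout is inferred only from the two quoted warnings, and the function name is hypothetical, not part of Kylin or HBase.

```python
import re

# Layout assumed from the warnings quoted in this thread:
# "<date> <time> WARN [Job <id>] mapreduce.LoadIncrementalHFiles:<line> : <message>"
WARNING_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+WARN\s+"
    r"\[Job (?P<job>[^\]]+)\]\s+"
    r"mapreduce\.LoadIncrementalHFiles:\d+\s+:\s+(?P<msg>.*)$"
)

# Message prefixes taken verbatim from the thread.
SYMPTOMS = (
    "Bulk load operation did not find any files to load",
    "Skipping non-directory",
)


def find_empty_bulk_load_warnings(lines):
    """Return (timestamp, job id, message) for each matching warning line."""
    hits = []
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match and match.group("msg").startswith(SYMPTOMS):
            hits.append((match.group("ts"), match.group("job"), match.group("msg")))
    return hits
```

Passing the lines of a log file (for example `open(path)`, where `path` is whatever log location your deployment uses) returns one tuple per warning, which makes it easy to see which job ids hit the empty bulk load.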

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-11 Thread Alexander Sterligov
No, defaultFs is hdfs. I’ve seen such behavior when I set the working dir to s3 but didn’t set cluster-fs at all. Maybe you have a typo in the name of the property. I used the old one, «kylin.hbase.cluster.fs». When both working-dir and cluster-fs were set to s3 I got _temporary dir of convert job

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-11 Thread ShaoFeng Shi
Hi Alexander, That makes sense. Using S3 for Cube build and storage is required for a cloud Hadoop environment. I tried to reproduce this problem. I created an EMR with S3 as HBase storage; in kylin.properties, I set "kylin.env.hdfs-working-dir" and "kylin.storage.hbase.cluster-fs" to the S3

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-10 Thread Alexander Sterligov
Yes, I worked around the problem that way and it works. One problem with this solution is that I have to use a pretty large HDFS, and it's expensive. I also have to garbage-collect it manually, because the data is not moved to s3, but copied. Kylin cleanup job doesn't work for it, because main metadata

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-10 Thread ShaoFeng Shi
How about leaving "kylin.hbase.cluster.fs" empty? This property is for a two-cluster deployment (one Hadoop for cube build, the other for query). When it is empty, the HFile will be written to the default fs (HDFS in EMR) and then loaded into HBase. I'm not sure whether EMR HBase (using S3 as storage)

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-10 Thread Alexander Sterligov
I also thought about it, but no, it's not consistency. Consistent view is enabled. I use the same s3 for my own map-reduce jobs and it's ok. I also checked whether it lost consistency (emrfs diff). No problems. In case of inconsistency, s3 files disappear right after they were written and appear some

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-10 Thread ShaoFeng Shi
Did you enable the Consistent View? This article explains the challenges of using S3 directly for ETL processes: https://aws.amazon.com/cn/blogs/big-data/ensuring-consistency-when-using-amazon-s3-and-amazon-elastic-mapreduce-for-etl-workflows/ 2017-08-09 18:19 GMT+08:00 Alexander Sterligov

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-09 Thread Alexander Sterligov
Yes, it's empty. Also I see this message in the log: 2017-08-09 09:02:35,947 WARN [Job 1e436685-7102-4621-a4cb-6472b866126d-7608] mapreduce.LoadIncrementalHFiles:234 : Skipping non-directory s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d

Re: HFile is empty if kylin.hbase.cluster.fs is set to s3

2017-08-09 Thread ShaoFeng Shi
The HFile will be moved to the HBase data folder when the bulk load finishes. Did you check whether the HTable has data? 2017-08-09 17:54 GMT+08:00 Alexander Sterligov : > Hi! > > I set kylin.hbase.cluster.fs to s3 bucket where hbase lives. > > Step "Convert Cuboid Data to HFile"