Did you enable Consistent View? This article explains the challenges of using S3 directly for ETL processes: https://aws.amazon.com/cn/blogs/big-data/ensuring-consistency-when-using-amazon-s3-and-amazon-elastic-mapreduce-for-etl-workflows/
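
To confirm that the reducer output was never promoted out of _temporary, it may help to classify a listing of the hfile directory. Below is a minimal sketch, not Kylin or HBase code: the function name, the input format (file paths relative to the hfile directory) and the sample attempt folder name are all assumptions made for illustration. It treats a file as "committed" only when it sits directly inside a column-family subdirectory, which is the layout the bulk-load warning asks about.

```python
from collections import defaultdict


def classify_hfile_listing(paths):
    """Split file paths (relative to the hfile output dir) into two groups.

    committed:   {column_family: [file names]} for files exactly one level
                 below a directory whose name does not start with "_".
    uncommitted: paths still located under _temporary.
    Marker files such as _SUCCESS are ignored.
    """
    committed = defaultdict(list)
    uncommitted = []
    for path in paths:
        parts = [p for p in path.split("/") if p]
        if not parts:
            continue
        if parts[0] == "_temporary":
            # Task-attempt output that was never moved to the result dir.
            uncommitted.append(path)
        elif len(parts) == 2 and not parts[0].startswith("_"):
            # <column family>/<hfile>: what the bulk load expects to find.
            committed[parts[0]].append(parts[1])
    return dict(committed), uncommitted


# Hypothetical listing resembling the situation described in this thread.
listing = [
    "_SUCCESS",
    "_temporary/1/_temporary/attempt_example/F1/hfile_example",
]
committed, uncommitted = classify_hfile_listing(listing)
print("committed:", committed)
print("uncommitted:", uncommitted)
```

If the committed group comes back empty while the uncommitted group does not, the files were written but the job commit did not move them, which is consistent with the bulk load finding nothing to load.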

2017-08-09 18:19 GMT+08:00 Alexander Sterligov <[email protected]>:

> Yes, it's empty. Also I see this message in the log:
>
> 2017-08-09 09:02:35,947 WARN [Job 1e436685-7102-4621-a4cb-6472b866126d-7608] mapreduce.LoadIncrementalHFiles:234 : Skipping non-directory s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_SUCCESS
> 2017-08-09 09:02:36,009 WARN [Job 1e436685-7102-4621-a4cb-6472b866126d-7608] mapreduce.LoadIncrementalHFiles:252 : Skipping non-file FileStatusExt{path=s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile/_temporary/1; isDirectory=true; modification_time=0; access_time=0; owner=; group=; permission=rwxrwxrwx; isSymlink=false}
> 2017-08-09 09:02:36,014 WARN [Job 1e436685-7102-4621-a4cb-6472b866126d-7608] mapreduce.LoadIncrementalHFiles:422 : Bulk load operation did not find any files to load in directory s3://joom.emr.fs/home/production/bi/kylin/kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/main_event_1_main/hfile. Does it contain files in subdirectories that correspond to column family names?
>
> On Wed, Aug 9, 2017 at 1:15 PM, ShaoFeng Shi <[email protected]> wrote:
>
>> The HFile will be moved to the HBase data folder when the bulk load finishes. Did you check whether the HTable has data?
>>
>> 2017-08-09 17:54 GMT+08:00 Alexander Sterligov <[email protected]>:
>>
>>> Hi!
>>>
>>> I set kylin.hbase.cluster.fs to the s3 bucket where hbase lives.
>>>
>>> Step "Convert Cuboid Data to HFile" finished without errors. Statistics at the end of the job said that it has written lots of data to s3.
>>>
>>> But there are no hfiles in the kylin_metadata folder (kylin_metadata/kylin-1e436685-7102-4621-a4cb-6472b866126d/<table name>/hfile), only a _temporary folder and a _SUCCESS file.
>>>
>>> _temporary contains hfiles inside attempt folders. It looks like they were not copied from _temporary to the result dir. But there are no errors in either the kylin log or the reducers' logs.
>>>
>>> Then loading empty hfiles produces empty segments.
>>>
>>> Is that a bug, or am I doing something wrong?
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋

--
Best regards,

Shaofeng Shi 史少锋
