No. The output of "hdfs dfs -ls" shows that the file does exist on HDFS. In Hive's context the default file system is HDFS, so Hive works as expected. Kylin's behavior is the wrong one: it goes to "RawLocalFileSystem", i.e. it resolves the path against the local file system. I think this usually happens because core-site.xml is misconfigured or missing from Kylin's classpath. A quick workaround is to find the right core-site.xml manually, copy it to KYLIN_HOME/conf, restart Kylin, and then resume the job. Please give it a try and let us know whether it solves the problem. (We haven't tried HDP 2.5.)
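If it helps, here is a rough way to check which core-site.xml Kylin would pick up and what default file system it declares. This is only a sketch: the helper names are made up for illustration, and /etc/hadoop/conf/core-site.xml is an assumed location for the cluster's client config, so adjust the paths to your environment. When "fs.defaultFS" is not set at all, Hadoop falls back to file:///, which would match the RawLocalFileSystem in your stack trace.

```python
import os
import xml.etree.ElementTree as ET


def read_default_fs(core_site_path):
    """Return fs.defaultFS (or the legacy fs.default.name) from a core-site.xml, or None."""
    root = ET.parse(core_site_path).getroot()
    values = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            values[name.strip()] = (prop.findtext("value") or "").strip()
    return values.get("fs.defaultFS") or values.get("fs.default.name")


def points_to_local_fs(default_fs):
    """True when the setting is missing or uses the file: scheme (the local file system)."""
    return not default_fs or default_fs.startswith("file:")


if __name__ == "__main__":
    # Candidate locations are assumptions; adjust them for your installation.
    candidates = ["/etc/hadoop/conf/core-site.xml"]
    kylin_home = os.environ.get("KYLIN_HOME")
    if kylin_home:
        candidates.insert(0, os.path.join(kylin_home, "conf", "core-site.xml"))

    for path in candidates:
        if not os.path.isfile(path):
            print(f"{path}: not found")
            continue
        default_fs = read_default_fs(path)
        status = "LOCAL or unset, suspicious" if points_to_local_fs(default_fs) else "ok"
        print(f"{path}: fs.defaultFS={default_fs!r} ({status})")
```

If the copy under KYLIN_HOME/conf is missing or reports a file: URI while the cluster's own copy reports hdfs://sandbox.hortonworks.com:8020, that would be consistent with the problem described above.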

2017-02-05 21:22 GMT+08:00 磊 王 <[email protected]>:

> Thanks for your reply.
> I see the log shows it firstly moved that file from hdfs to local, and then read from local. But the issue is the file to move in hdfs does not exist at the step of move, so the local file is not available to read at the step of read.
> Is my understanding correct?
>
> Sent from my iPhone
>
> > On Feb 5, 2017, at 20:54, ShaoFeng Shi <[email protected]> wrote:
> >
> > From the error it tried to seek the file from local file system, instead of HDFS. Please check whether "fs.defaultFS" in your environment was set to local file by mistake.
> >
> >
> > 2017-02-04 22:58 GMT+08:00 ? ? <[email protected]>:
> >
> >> Hi Sir,
> >>
> >> When I built the sample cube (and my own cube), I met the error as below. It seems the issue is at the moving step, because I confirmed there was ‘hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/000000_0’, but what Kylin was trying to move is hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/.hive-staging_hive_2017-02-04_09-38-21_841_1217104325959051054-5/-ext-10000.
> >>
> >> [root@ip-10-9-255-49 ec2-user]# hdfs dfs -ls hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count
> >> Found 1 items
> >> -rwxr-xr-x 1 hive hdfs 3 2017-02-04 09:38 hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/000000_0
> >>
> >> Environment:
> >> Kylin 1.6.0 + HDP 2.5
> >> I am not sure if HDP 2.5 is a too high version, because I only see HDP 2.4 is referred in Kylin doc.
> >>
> >>
> >> Build error:
> >> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] manager.ExecutableManager:292 : job id:6a392cfd-a903-4763-89cf-1ce44302c394-01 from READY to RUNNING
> >> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] hive.HiveCmdBuilder:81 : The statements to execute in beeline:
> >> USE default;
> >> SET hive.exec.compress.output=true;
> >> SET hive.auto.convert.join.noconditionaltask=true;
> >> SET hive.auto.convert.join.noconditionaltask.size=100000000;
> >> SET mapreduce.output.fileoutputformat.compress.type=BLOCK;
> >> SET mapreduce.job.split.metainfo.maxsize=-1;
> >>
> >> set hive.exec.compress.output=false;
> >>
> >> set hive.exec.compress.output=false;
> >> INSERT OVERWRITE DIRECTORY '/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count' SELECT count(*) FROM kylin_intermediate_kylin_sales_cube_desc_0eb9faec_6a22_4b49_889f_c283d82d72dd;
> >>
> >>
> >> 2017-02-04 09:38:17,947 DEBUG [pool-5-thread-2] hive.HiveCmdBuilder:83 : The SQL to execute in beeline:
> >>
> >> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Compute row count of flat hive table, cmd:
> >> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : beeline -n root -u 'jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2' -f /root/apache-kylin-1.6.0-hbase1.x-bin/bin/../tomcat/temp/beeline_3013188987573907903.hql;rm -f /root/apache-kylin-1.6.0-hbase1.x-bin/bin/../tomcat/temp/beeline_3013188987573907903.hql
> >> 2017-02-04 09:38:19,832 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : WARNING: Use "yarn jar" to launch YARN applications.
> >> 2017-02-04 09:38:20,342 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Connecting to jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> >> 2017-02-04 09:38:21,629 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1245)
> >> 2017-02-04 09:38:21,630 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Driver: Hive JDBC (version 1.2.1.2.3.2.0-2950)
> >> 2017-02-04 09:38:21,630 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Transaction isolation: TRANSACTION_REPEATABLE_READ
> >> 2017-02-04 09:38:21,679 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> USE default;
> >> 2017-02-04 09:38:21,732 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.051 seconds)
> >> 2017-02-04 09:38:21,746 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET hive.exec.compress.output=true;
> >> 2017-02-04 09:38:21,757 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.011 seconds)
> >> 2017-02-04 09:38:21,766 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET hive.auto.convert.join.noconditionaltask=true;
> >> 2017-02-04 09:38:21,768 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.002 seconds)
> >> 2017-02-04 09:38:21,777 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET hive.auto.convert.join.noconditionaltask.size=100000000;
> >> 2017-02-04 09:38:21,779 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.002 seconds)
> >> 2017-02-04 09:38:21,787 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET mapreduce.output.fileoutputformat.compress.type=BLOCK;
> >> 2017-02-04 09:38:21,790 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.003 seconds)
> >> 2017-02-04 09:38:21,796 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET mapreduce.job.split.metainfo.maxsize=-1;
> >> 2017-02-04 09:38:21,801 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.005 seconds)
> >> 2017-02-04 09:38:21,804 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
> >> 2017-02-04 09:38:21,808 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> set hive.exec.compress.output=false;
> >> 2017-02-04 09:38:21,811 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.003 seconds)
> >> 2017-02-04 09:38:21,812 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
> >> 2017-02-04 09:38:21,816 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> set hive.exec.compress.output=false;
> >> 2017-02-04 09:38:21,818 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.002 seconds)
> >> 2017-02-04 09:38:21,833 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> INSERT OVERWRITE DIRECTORY '/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count' SELECT count(*) FROM k
> >> 2017-02-04 09:38:21,840 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : ylin_intermediate_kylin_sales_cube_desc_0eb9faec_6a22_4b49_889f_c283d82d72dd;
> >> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Tez session hasn't been created yet. Opening session
> >> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Dag name: INSERT OVERWRITE DIRE...49_889f_c283d82d72dd(Stage-1)
> >> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO :
> >> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 :
> >> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Status: Running (Executing on YARN cluster with App id application_1486198519344_0004)
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 :
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: -/- Reducer 2: 0/1
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 0/1 Reducer 2: 0/1
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 0(+1)/1 Reducer 2: 0/1
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 1/1 Reducer 2: 0/1
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 1/1 Reducer 2: 0(+1)/1
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 1/1 Reducer 2: 1/1
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Moving data to directory /kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count from hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/.hive-staging_hive_2017-02-04_09-38-21_841_1217104325959051054-5/-ext-10000
> >> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (8.613 seconds)
> >> 2017-02-04 09:38:30,457 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
> >> 2017-02-04 09:38:30,457 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
> >> 2017-02-04 09:38:30,458 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Closing: 0: jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> >> 2017-02-04 09:38:30,573 ERROR [pool-5-thread-2] execution.AbstractExecutable:370 : job:6a392cfd-a903-4763-89cf-1ce44302c394-01 execute finished with exception
> >> java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count does not exist
> >>     at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
> >>     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
> >>     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
> >>     at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
> >>     at org.apache.kylin.source.hive.HiveMRInput$RedistributeFlatHiveTableStep.doWork(HiveMRInput.java:338)
> >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
> >>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
> >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
> >>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
> >>
> >>
> >>
> >> Thx
> >> Lei Wang
> >>
> >>
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋
>

--
Best regards,

Shaofeng Shi 史少锋
