Thanks again for your reply. I will try your suggestion. One point for discussion: I did not touch core-site.xml under the Kylin conf folder, so it should be the default. I do not think customization is always needed to make Kylin work, so why did this issue occur in my environment?
Where can I get the correct core-site.xml? Can I copy it from my Hadoop environment? Or which specific configuration items do you check to judge its correctness, for example 'fs.defaultFS'?

Sent from my iPhone

> On Feb 5, 2017, at 21:38, ShaoFeng Shi <[email protected]> wrote:
>
> No, from the output of "hdfs dfs -ls" the file does exist on HDFS. In Hive's
> context the default fs is HDFS, so it works as expected. The behavior of
> Kylin is wrong, as it goes to "RawLocalFileSystem". I think this is usually
> because core-site.xml is wrongly configured or is absent from Kylin's
> classpath. A quick workaround is to manually find the right core-site.xml,
> copy it to KYLIN_HOME/conf, restart, and then resume the job. Please give it
> a try and let us know whether it solves the problem. (We haven't tried
> HDP 2.5.)
>
> 2017-02-05 21:22 GMT+08:00 磊 王 <[email protected]>:
>
>> Thanks for your reply.
>> I see the log shows it first moved that file from HDFS to local, and
>> then read from local. But the issue is that the file to move in HDFS does
>> not exist at the move step, so the local file is not available at the
>> read step.
>> Is my understanding correct?
>>
>> Sent from my iPhone
>>
>>> On Feb 5, 2017, at 20:54, ShaoFeng Shi <[email protected]> wrote:
>>>
>>> From the error, it tried to look for the file on the local file system
>>> instead of HDFS. Please check whether "fs.defaultFS" in your environment
>>> was set to the local file system by mistake.
>>>
>>> 2017-02-04 22:58 GMT+08:00 磊 王 <[email protected]>:
>>>
>>>> Hi Sir,
>>>>
>>>> When I built the sample cube (and my own cube), I got the error below.
>>>> It seems the issue is at the moving step, because I confirmed that
>>>> ‘hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/000000_0’
>>>> exists, but what Kylin was trying to move is
>>>> hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/.hive-staging_hive_2017-02-04_09-38-21_841_1217104325959051054-5/-ext-10000.
>>>>
>>>> [root@ip-10-9-255-49 ec2-user]# hdfs dfs -ls hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count
>>>> Found 1 items
>>>> -rwxr-xr-x 1 hive hdfs 3 2017-02-04 09:38 hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/000000_0
>>>>
>>>> Environment:
>>>> Kylin 1.6.0 + HDP 2.5
>>>> I am not sure whether HDP 2.5 is too new a version, because I only see
>>>> HDP 2.4 referred to in the Kylin doc.
>>>>
>>>>
>>>> Build error:
>>>> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] manager.ExecutableManager:292 : job id:6a392cfd-a903-4763-89cf-1ce44302c394-01 from READY to RUNNING
>>>> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] hive.HiveCmdBuilder:81 : The statements to execute in beeline:
>>>> USE default;
>>>> SET hive.exec.compress.output=true;
>>>> SET hive.auto.convert.join.noconditionaltask=true;
>>>> SET hive.auto.convert.join.noconditionaltask.size=100000000;
>>>> SET mapreduce.output.fileoutputformat.compress.type=BLOCK;
>>>> SET mapreduce.job.split.metainfo.maxsize=-1;
>>>>
>>>> set hive.exec.compress.output=false;
>>>>
>>>> set hive.exec.compress.output=false;
>>>> INSERT OVERWRITE DIRECTORY '/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count' SELECT count(*) FROM kylin_intermediate_kylin_sales_cube_desc_0eb9faec_6a22_4b49_889f_c283d82d72dd;
>>>>
>>>>
>>>> 2017-02-04 09:38:17,947 DEBUG [pool-5-thread-2] hive.HiveCmdBuilder:83 : The SQL to execute in beeline:
>>>>
>>>> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Compute row count of flat hive table, cmd:
>>>> 2017-02-04 09:38:17,947 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : beeline -n root -u 'jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2' -f /root/apache-kylin-1.6.0-hbase1.x-bin/bin/../tomcat/temp/beeline_3013188987573907903.hql;rm -f /root/apache-kylin-1.6.0-hbase1.x-bin/bin/../tomcat/temp/beeline_3013188987573907903.hql
>>>> 2017-02-04 09:38:19,832 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : WARNING: Use "yarn jar" to launch YARN applications.
>>>> 2017-02-04 09:38:20,342 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Connecting to jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
>>>> 2017-02-04 09:38:21,629 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1245)
>>>> 2017-02-04 09:38:21,630 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Driver: Hive JDBC (version 1.2.1.2.3.2.0-2950)
>>>> 2017-02-04 09:38:21,630 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Transaction isolation: TRANSACTION_REPEATABLE_READ
>>>> 2017-02-04 09:38:21,679 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> USE default;
>>>> 2017-02-04 09:38:21,732 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.051 seconds)
>>>> 2017-02-04 09:38:21,746 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET hive.exec.compress.output=true;
>>>> 2017-02-04 09:38:21,757 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.011 seconds)
>>>> 2017-02-04 09:38:21,766 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET hive.auto.convert.join.noconditionaltask=true;
>>>> 2017-02-04 09:38:21,768 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.002 seconds)
>>>> 2017-02-04 09:38:21,777 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET hive.auto.convert.join.noconditionaltask.size=100000000;
>>>> 2017-02-04 09:38:21,779 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.002 seconds)
>>>> 2017-02-04 09:38:21,787 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET mapreduce.output.fileoutputformat.compress.type=BLOCK;
>>>> 2017-02-04 09:38:21,790 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.003 seconds)
>>>> 2017-02-04 09:38:21,796 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> SET mapreduce.job.split.metainfo.maxsize=-1;
>>>> 2017-02-04 09:38:21,801 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.005 seconds)
>>>> 2017-02-04 09:38:21,804 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
>>>> 2017-02-04 09:38:21,808 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> set hive.exec.compress.output=false;
>>>> 2017-02-04 09:38:21,811 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.003 seconds)
>>>> 2017-02-04 09:38:21,812 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
>>>> 2017-02-04 09:38:21,816 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> set hive.exec.compress.output=false;
>>>> 2017-02-04 09:38:21,818 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (0.002 seconds)
>>>> 2017-02-04 09:38:21,833 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/> INSERT OVERWRITE DIRECTORY '/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count' SELECT count(*) FROM k
>>>> 2017-02-04 09:38:21,840 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : ylin_intermediate_kylin_sales_cube_desc_0eb9faec_6a22_4b49_889f_c283d82d72dd;
>>>> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Tez session hasn't been created yet. Opening session
>>>> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Dag name: INSERT OVERWRITE DIRE...49_889f_c283d82d72dd(Stage-1)
>>>> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO :
>>>> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 :
>>>> 2017-02-04 09:38:30,452 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Status: Running (Executing on YARN cluster with App id application_1486198519344_0004)
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 :
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: -/- Reducer 2: 0/1
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 0/1 Reducer 2: 0/1
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 0(+1)/1 Reducer 2: 0/1
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 1/1 Reducer 2: 0/1
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 1/1 Reducer 2: 0(+1)/1
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Map 1: 1/1 Reducer 2: 1/1
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : INFO : Moving data to directory /kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count from hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count/.hive-staging_hive_2017-02-04_09-38-21_841_1217104325959051054-5/-ext-10000
>>>> 2017-02-04 09:38:30,453 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : No rows affected (8.613 seconds)
>>>> 2017-02-04 09:38:30,457 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
>>>> 2017-02-04 09:38:30,457 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : 0: jdbc:hive2://sandbox.hortonworks.com:2181/>
>>>> 2017-02-04 09:38:30,458 INFO [pool-5-thread-2] execution.AbstractExecutable:36 : Closing: 0: jdbc:hive2://sandbox.hortonworks.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
>>>> 2017-02-04 09:38:30,573 ERROR [pool-5-thread-2] execution.AbstractExecutable:370 : job:6a392cfd-a903-4763-89cf-1ce44302c394-01 execute finished with exception
>>>> java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count does not exist
>>>>     at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>>>>     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>>>>     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>>>>     at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:574)
>>>>     at org.apache.kylin.source.hive.HiveMRInput$RedistributeFlatHiveTableStep.doWork(HiveMRInput.java:338)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>>>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>>>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>>>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
>>>>
>>>> Thx
>>>> Lei Wang
>>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
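
On the question of which core-site.xml items matter: the stack trace above goes through RawLocalFileSystem, which is where a scheme-less path such as /kylin/kylin_metadata/... ends up when fs.defaultFS is not set to an hdfs:// URI (Hadoop's built-in default is file:///). Below is a minimal Python sketch, assuming a standard core-site.xml layout, for checking what a given file would yield. The helper names read_default_fs and qualify are hypothetical, made up for illustration; they are not part of Kylin or Hadoop, and qualify only approximates Hadoop's real path resolution.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Value Hadoop falls back to (from core-default.xml) when nothing overrides it.
HADOOP_DEFAULT_FS = "file:///"

# fs.default.name is the older, deprecated name for the same setting.
DEFAULT_FS_KEYS = ("fs.defaultFS", "fs.default.name")


def read_default_fs(core_site_xml: str) -> str:
    """Return the default file system declared in core-site.xml text.

    Falls back to Hadoop's built-in default when the property is absent,
    which is the situation suspected in this thread.
    """
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in DEFAULT_FS_KEYS:
            return (prop.findtext("value") or "").strip()
    return HADOOP_DEFAULT_FS


def qualify(path: str, default_fs: str) -> str:
    """Approximate how a path without a scheme is resolved (illustration only)."""
    if urlparse(path).scheme:
        return path  # already fully qualified, e.g. hdfs://host:8020/...
    return default_fs.rstrip("/") + path


if __name__ == "__main__":
    # Inline sample standing in for a real core-site.xml; the host and port
    # are the ones that appear in the log above.
    sample = """
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://sandbox.hortonworks.com:8020</value>
      </property>
    </configuration>
    """
    row_count_dir = "/kylin/kylin_metadata/kylin-6a392cfd-a903-4763-89cf-1ce44302c394/row_count"
    for label, xml_text in (("with fs.defaultFS", sample), ("without", "<configuration/>")):
        fs = read_default_fs(xml_text)
        print(label, "->", urlparse(fs).scheme, qualify(row_count_dir, fs))
```

To use it, read the contents of the core-site.xml from the Hadoop client configuration directory and of the one (if any) under KYLIN_HOME/conf, and compare the results. If the file Kylin sees yields the file scheme, or the property is missing altogether, that would be consistent with the RawLocalFileSystem frames in the exception.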
