[
https://issues.apache.org/jira/browse/KYLIN-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039828#comment-17039828
]
wangrupeng commented on KYLIN-4370:
-----------------------------------
Sorry for the late reply. I tried a lot and finally found the reason: the
Kylin Spark build engine cannot read the Hive table files that Sqoop wrote to
HDFS, because they are not sequence files. In that case it falls back to
reading the Hive table source directly through Spark SQL, but the Spark SQL
session was not initialized with the Hive configuration, so it finally throws
a "table or view not found" exception.
!image-2020-02-19-17-20-20-076.png|width=739,height=112!
I think this happens because Spark SQL cannot pick up the Hive configuration
in cluster mode, even though I did add hive-site.xml to the classpath in my
cluster. For now I can work around the problem by adding some code, but I
haven't found a way that avoids changing code. I'm still working on it.
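One way to make the Hive configuration visible to the Spark driver in
yarn-cluster mode, without changing code, is to ship hive-site.xml with the
Spark job through Kylin's Spark conf overrides. A minimal sketch (the file
path is an assumption; adjust it to your installation):

```properties
# kylin.properties -- ship the Hive client config to the YARN containers
# so the SparkSession can resolve the Hive metastore in cluster mode.
# /etc/hive/conf/hive-site.xml is an assumed path for a typical install.
kylin.engine.spark-conf.spark.yarn.dist.files=/etc/hive/conf/hive-site.xml
```

This is only a sketch of the general approach (Spark's spark.yarn.dist.files
distributes the file to the application's working directory on YARN); I have
not confirmed it resolves this particular failure.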
> Spark job failing with JDBC source on 8th step with error :
> org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Table or view
> not found: `default`.`kylin_intermediate table'
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: KYLIN-4370
> URL: https://issues.apache.org/jira/browse/KYLIN-4370
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v3.0.0
> Reporter: Sonu Singh
> Priority: Blocker
> Fix For: v3.0.0
>
> Attachments: image-2020-02-19-17-20-20-076.png
>
>
> 2020-02-03 11:18:45,899 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:45 INFO Client:54 - Application report for
> application_1580106368736_0431 (state: RUNNING)
> 2020-02-03 11:18:46,901 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46 INFO Client:54 - Application report for
> application_1580106368736_0431 (state: FINISHED)
> 2020-02-03 11:18:46,906 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46 INFO Client:54 -
> 2020-02-03 11:18:46,906 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> client token: N/A
> 2020-02-03 11:18:46,906 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> diagnostics: User class threw exception: java.lang.RuntimeException: error
> execute org.apache.kylin.engine.spark.SparkCubingByLayer. Root cause: Table
> or view not found:
> `default`.`kylin_intermediate_cube_2_30012020_spark_ded81d01_32e9_4f82_ea97_61788fdfd59b`;;
> 2020-02-03 11:18:46,906 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 'UnresolvedRelation
> `default`.`kylin_intermediate_cube_2_30012020_spark_ded81d01_32e9_4f82_ea97_61788fdfd59b`
> 2020-02-03 11:18:46,906 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46,906 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:34)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:36)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> Caused by: org.apache.spark.sql.AnalysisException: Table or view not found:
> `default`.`kylin_intermediate_cube_2_30012020_spark_ded81d01_32e9_4f82_ea97_61788fdfd59b`;;
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 'UnresolvedRelation
> `default`.`kylin_intermediate_cube_2_30012020_spark_ded81d01_32e9_4f82_ea97_61788fdfd59b`
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:86)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:84)
> 2020-02-03 11:18:46,907 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:84)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:92)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.SparkSession.table(SparkSession.scala:628)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.sql.SparkSession.table(SparkSession.scala:624)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.kylin.engine.spark.SparkUtil.getOtherFormatHiveInput(SparkUtil.java:163)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.kylin.engine.spark.SparkUtil.hiveRecordInputRDD(SparkUtil.java:144)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:161)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:29)
> 2020-02-03 11:18:46,908 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> ... 6 more
> 2020-02-03 11:18:46,909 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46,909 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> ApplicationMaster host: 172.31.43.179
> 2020-02-03 11:18:46,909 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> ApplicationMaster RPC port: 0
> 2020-02-03 11:18:46,909 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> queue: default
> 2020-02-03 11:18:46,912 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> start time: 1580728698758
> 2020-02-03 11:18:46,912 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> final status: FAILED
> 2020-02-03 11:18:46,912 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> tracking URL: http://XXXXXXXX:8088/proxy/application_1580106368736_0431/
> 2020-02-03 11:18:46,918 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> user: root
> 2020-02-03 11:18:46,918 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> Exception in thread "main" org.apache.spark.SparkException: Application
> application_1580106368736_0431 finished with failed status
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.yarn.Client.run(Client.scala:1165)
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520)
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
> 2020-02-03 11:18:46,919 INFO [pool-25-thread-1] spark.SparkExecutable:33 : at
> org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 2020-02-03 11:18:46,929 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46 INFO ShutdownHookManager:54 - Shutdown hook called
> 2020-02-03 11:18:46,931 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46 INFO ShutdownHookManager:54 - Deleting directory
> /tmp/spark-a13c48c9-967e-422e-b541-aba9d55fcd7c
> 2020-02-03 11:18:46,940 INFO [pool-25-thread-1] spark.SparkExecutable:33 :
> 2020-02-03 11:18:46 INFO ShutdownHookManager:54 - Deleting directory
--
This message was sent by Atlassian Jira
(v8.3.4#803005)