rongneng.wei created KYLIN-4206:
-----------------------------------
Summary: Build kylin on EMR 5.23. The kylin version is 2.6.4. When
building the cube, the hive table cannot be found
Key: KYLIN-4206
URL: https://issues.apache.org/jira/browse/KYLIN-4206
Project: Kylin
Issue Type: Bug
Components: Environment
Affects Versions: v2.6.4
Environment: EMR 5.23(hadoop 2.8.5\HBase 1.4.9\hive 2.3.4\Spark
2.4.0\Tez 0.9.1\HCatalog 2.3.4\Zookeeper 3.4.13)
kylin 2.6.4
Reporter: rongneng.wei
Attachments: kylin.properties, kylin_hive_conf.xml, kylin_job_conf.xml
Hi,
I built Kylin 2.6.4 on EMR 5.23. When building the cube, the Hive table
cannot be found. The detailed error information is as follows:
java.lang.RuntimeException: java.io.IOException: NoSuchObjectException(message:kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)
    at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
    at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
On EMR, Hive metadata is shared through the AWS Glue Data Catalog, and the
metastore URI is configured in hive-site.xml:
<property>
<name>hive.metastore.uris</name>
<value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>hive.metastore.client.factory.class</name>
<value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
</property>
However, when I use Hive's own metastore instead of sharing metadata through
Glue, the exception does not occur. To do that, I comment out the following
configuration:
<!--<property>
<name>hive.metastore.client.factory.class</name>
<value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
</property>
-->
But EMR relies on the shared metadata: with metadata sharing disabled, I
can't query the other Hive tables built by the cluster.
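To tell quickly which of the two setups a given hive-site.xml selects, a small check like the following could help. This is only a sketch under assumptions: the helper names `read_hive_properties` and `uses_glue_catalog` are hypothetical, the sample wraps the properties quoted above in a `<configuration>` root (an assumption about the file layout), and it does not show which configuration Kylin itself loads at build time.

```python
import xml.etree.ElementTree as ET

# Factory class name copied from the hive-site.xml fragment quoted in this report.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def read_hive_properties(xml_text):
    """Return a {name: value} dict from hive-site.xml style text.

    Properties inside XML comments are ignored, because the default
    ElementTree parser drops comments.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def uses_glue_catalog(props):
    """True if the metastore client factory is set to the Glue factory."""
    return props.get("hive.metastore.client.factory.class") == GLUE_FACTORY


# Sample input; the <configuration> root element is an assumption.
SAMPLE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
  </property>
  <property>
    <name>hive.metastore.client.factory.class</name>
    <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    props = read_hive_properties(SAMPLE)
    print("metastore uris:", props.get("hive.metastore.uris"))
    print("Glue catalog enabled:", uses_glue_catalog(props))
```

Running the same check against each hive-site.xml on the Kylin classpath would show whether they all agree on the client factory.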
The configuration files are attached. Please help me solve this problem.
Thank you.
Best regards.