[ 
https://issues.apache.org/jira/browse/KYLIN-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997152#comment-16997152
 ] 

rongneng.wei commented on KYLIN-4206:
-------------------------------------

This modification adds support for the AWS Glue catalog to Kylin. The
associated Jira issue is
[https://issues.apache.org/jira/browse/KYLIN-4206](https://issues.apache.org/jira/browse/KYLIN-4206).
1. First, modify the aws-glue-data-catalog-client source code.
The GitHub address of aws-glue-data-catalog-client-for-apache-hive-metastore is
[https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore].
For the aws-glue-client development environment, see README.MD.
I downloaded Hive 2.3.7 locally, so after following the steps in the
README.MD file, the Hive version is 2.3.7-SNAPSHOT.
1) Modify the pom.xml file in the project's root directory:
<hive2.version>2.3.7-SNAPSHOT</hive2.version>
<spark-hive.version>1.2.1.spark2</spark-hive.version>
2) Modify the class
com.amazonaws.glue.catalog.metastore.AWSCatalogMetastoreClient in
aws-glue-datacatalog-hive2-client.
 
Implement the following method:
  @Override
  public PartitionValuesResponse listPartitionValues(PartitionValuesRequest partitionValuesRequest)
      throws MetaException, TException, NoSuchObjectException {
    return null;
  }
 
3) Modify the class
com.amazonaws.glue.catalog.metastore.AWSCatalogMetastoreClient in
aws-glue-datacatalog-spark-client.
The problem (shown in a screenshot that is not preserved here) is that a
method in this class is not available in the parent class. Delete that method,
then copy the method from
com.amazonaws.glue.catalog.metastore.AWSCatalogMetastoreClient in
aws-glue-datacatalog-hive2-client.
Add the following dependency to the aws-glue-datacatalog-spark-client/pom.xml file:
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive2.version}</version>
    <scope>provided</scope>
</dependency>
4) Package. Three projects need to be packaged (the screenshots that showed
them are not preserved here).
5) Copy the three packages
aws-glue-datacatalog-client-common-1.10.0-SNAPSHOT.jar
aws-glue-datacatalog-hive2-client-1.10.0-SNAPSHOT.jar
aws-glue-datacatalog-spark-client-1.10.0-SNAPSHOT.jar
to /kylin/lib
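
The copy step can be checked with a small helper. The following is only a minimal sketch: the class name GlueJarCheck and its methods are hypothetical and are not part of Kylin or the Glue client. Only the three jar names and the /kylin/lib path come from the step above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Kylin: reports which of the three jars are missing.
public class GlueJarCheck {
    // Jar names exactly as listed in step 5.
    public static final List<String> REQUIRED_JARS = Arrays.asList(
            "aws-glue-datacatalog-client-common-1.10.0-SNAPSHOT.jar",
            "aws-glue-datacatalog-hive2-client-1.10.0-SNAPSHOT.jar",
            "aws-glue-datacatalog-spark-client-1.10.0-SNAPSHOT.jar");

    /** Returns the required jar names that are absent from the given set of file names. */
    public static List<String> missingJars(Set<String> presentFileNames) {
        List<String> missing = new ArrayList<>();
        for (String jar : REQUIRED_JARS) {
            if (!presentFileNames.contains(jar)) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the /kylin/lib path from step 5; another path may be passed as the first argument.
        File libDir = new File(args.length > 0 ? args[0] : "/kylin/lib");
        String[] names = libDir.list(); // null if the directory is missing or unreadable
        List<String> listed = (names == null) ? Collections.<String>emptyList() : Arrays.asList(names);
        List<String> missing = missingJars(new HashSet<String>(listed));
        if (missing.isEmpty()) {
            System.out.println("All three Glue client jars are present in " + libDir);
        } else {
            System.out.println("Missing jars in " + libDir + ": " + missing);
        }
    }
}
```

Running it against the lib directory before restarting Kylin gives a quick confirmation that all three jars were copied.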
 
2. *Modify the Kylin source code; see the submitted PR.*
1) Add the gluecatalog setting to kylin.properties:
## The default HiveMetastoreClient access type is hcatalog. For AWS users with
## the Glue catalog, it can be configured as gluecatalog.
##kylin.source.hive.metadata-type=hcatalog
The default is hcatalog. To use Glue, configure
kylin.source.hive.metadata-type=gluecatalog
If gluecatalog is configured, hive-site.xml must also contain the following:
  <property>
    <name>hive.metastore.client.factory.class</name>
    <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
  </property>
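
As a rough illustration of the switch described in this step, the sketch below reads kylin.source.hive.metadata-type from a Properties object and falls back to hcatalog when the key is absent. The class MetadataTypeSketch and its enum are hypothetical and are not taken from the Kylin PR. Only the property key and the two values (hcatalog, gluecatalog) come from the text above.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical illustration only; the actual change is in the submitted Kylin PR.
public class MetadataTypeSketch {
    /** The two values named in this comment; not Kylin's own type. */
    public enum MetadataType { HCATALOG, GLUECATALOG }

    public static final String KEY = "kylin.source.hive.metadata-type";

    /** Resolves the configured metadata type, defaulting to hcatalog when the key is not set. */
    public static MetadataType resolve(Properties props) {
        String raw = props.getProperty(KEY, "hcatalog").trim().toLowerCase(Locale.ROOT);
        switch (raw) {
            case "hcatalog":
                return MetadataType.HCATALOG;
            case "gluecatalog":
                return MetadataType.GLUECATALOG;
            default:
                // Fail loudly on typos instead of silently using the default.
                throw new IllegalArgumentException("Unsupported value for " + KEY + ": " + raw);
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY, "gluecatalog");
        System.out.println("Resolved metadata type: " + resolve(props));
    }
}
```

Rejecting unknown values rather than falling back to the default is a design choice of this sketch; it makes a misspelled gluecatalog visible at startup instead of at cube build time.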
3. Install on EMR

> Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, 
> the hive table cannot be found
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4206
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4206
>             Project: Kylin
>          Issue Type: Bug
>          Components: Environment 
>    Affects Versions: v2.6.4
>         Environment: EMR 5.23(hadoop 2.8.5\HBase 1.4.9\hive 2.3.4\Spark 
> 2.4.0\Tez 0.9.1\HCatalog 2.3.4\Zookeeper 3.4.13)
> kylin 2.6.4
>            Reporter: rongneng.wei
>            Priority: Major
>         Attachments: kylin.properties, kylin_hive_conf.xml, kylin_job_conf.xml
>
>
> Hi,
>    I built Kylin on EMR 5.23. The Kylin version is 2.6.4. When building the
> cube, the Hive table cannot be found. The detailed error information is as
> follows:
> java.lang.RuntimeException: java.io.IOException: NoSuchObjectException(message:kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)
>  at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
>  at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> On EMR, Hive metadata is shared through Glue, and the metastore URL is
> configured in hive-site.xml:
> <property>
>  <name>hive.metastore.uris</name>
>  <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
>  <description>JDBC connect string for a JDBC metastore</description>
> </property>
> <property>
>  <name>hive.metastore.client.factory.class</name>
>  <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
> But when I use Hive's own metadata, that is, without Glue metadata sharing,
> the above exception does not occur. To do that, I commented out the
> following configuration:
> <!--<property>
> <name>hive.metastore.client.factory.class</name>
> <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
> -->
> But since EMR uses shared metadata, if I don't use metadata sharing, I
> can't query other Hive tables built by the cluster.
> The configuration files are in the attachments. Please help me solve
> this problem. Thank you.
> Best regards.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
