[ 
https://issues.apache.org/jira/browse/KYLIN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002995#comment-17002995
 ] 

 Kaige Liu commented on KYLIN-3685:
-----------------------------------

Hi [~rjarvis], [~rongneng.wei],

There is a workaround for this issue. You can try the following steps:

1) Use beeline instead of the Hive CLI to connect to the Hive metastore.

    Change the following settings in kylin.properties:
{quote}kylin.source.hive.client=beeline

kylin.source.hive.beeline-params=-u jdbc:hive2://ip-172-31-84-101.ec2.internal:10000 -n root
{quote}
2) Copy the missing jars:
{quote}cp /usr/share/aws/hmclient/lib/aws-glue-datacatalog-hive2-client-1.11.0.jar $KYLIN_HOME/ext

cp $KYLIN_HOME/spark/jars/joda-time-2.9.3.jar $KYLIN_HOME/lib
{quote}
I have tried this on AWS EMR 5.28. It works well.

 

_*Root cause analysis*_

1. Kylin connects to the Hive metastore via HiveMetaStoreClient like this:
{code:java}
private HiveMetaStoreClient getMetaStoreClient() throws Exception {
    if (metaStoreClient == null) {
        metaStoreClient = new HiveMetaStoreClient(hiveConf);
    }
    return metaStoreClient;
}
{code}
This ignores the following setting in hive-site.xml, because the client is 
instantiated directly rather than through the configured factory:
{quote}<property>
  <name>hive.metastore.client.factory.class</name>
  <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
</property>
{quote}
After switching to beeline, Kylin no longer creates the client itself, and 
beeline handles this setting properly.
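
To illustrate the difference, here is a minimal self-contained sketch. It is not Kylin or Hive code: the MetaClient, DefaultClient, and GlueLikeClient types and the "client.factory.class" key are hypothetical stand-ins for the real metastore client, the Glue client, and the hive-site.xml property. It only shows why a direct "new" never consults the configured class, while a configuration-driven lookup does.

```java
import java.util.HashMap;
import java.util.Map;

public class FactoryLookupSketch {
    // Hypothetical stand-in for a metastore client interface.
    public interface MetaClient { String name(); }

    // Stand-in for the default client.
    public static class DefaultClient implements MetaClient {
        public String name() { return "default"; }
    }

    // Stand-in for a client that would be selected through configuration.
    public static class GlueLikeClient implements MetaClient {
        public String name() { return "glue-like"; }
    }

    // Direct instantiation: the configuration is passed in but never consulted
    // when choosing the implementation.
    public static MetaClient createDirectly(Map<String, String> conf) {
        return new DefaultClient();
    }

    // Configuration-driven creation: the class named in the configuration
    // is loaded reflectively, falling back to the default client.
    public static MetaClient createViaConfig(Map<String, String> conf) {
        String cls = conf.getOrDefault("client.factory.class",
                DefaultClient.class.getName());
        try {
            return (MetaClient) Class.forName(cls)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // A class missing from the classpath ends up here.
            throw new IllegalStateException("Unable to instantiate " + cls, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("client.factory.class", GlueLikeClient.class.getName());
        System.out.println(createDirectly(conf).name());  // prints "default"
        System.out.println(createViaConfig(conf).name()); // prints "glue-like"
    }
}
```

In this sketch, the reflective path also fails if the configured class is not on the classpath, so a configuration-driven client needs the jar that provides the class.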

 

2. We need to add 
/usr/share/aws/hmclient/lib/aws-glue-datacatalog-hive2-client-1.11.0.jar to 
the classpath to avoid the error below:
{quote}java.lang.RuntimeException: java.io.IOException: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
 at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
 at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
 at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:144)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
{quote}
3. Why do we need to copy joda-time-2.9.3.jar to $KYLIN_HOME/lib?

The AWS Java SDK uses a newer version of joda-time, while HBase pulls in an 
old version of joda-time (< 2.0) bundled in jruby-complete-1.6.8.jar. Putting 
the newer version in $KYLIN_HOME/lib makes it appear ahead of 
jruby-complete-1.6.8.jar on the classpath.

Otherwise, the following error occurs:
{quote}org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:194)
        at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kylin.job.exception.ExecuteException: com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
{quote}
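
When diagnosing this kind of conflict, it can help to check which jar a class is actually loaded from. Below is a minimal sketch that uses only standard JDK calls. The WhichJar class name is hypothetical, and the result is only meaningful if it is run with the same classpath, in the same order, as the Kylin process.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the location (jar or directory) a class is loaded from,
    // or a placeholder if the location cannot be determined.
    public static String locate(String className) {
        try {
            // Load without initializing the class; only its origin matters here.
            Class<?> cls = Class.forName(className, false,
                    WhichJar.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK core classes typically report no code source.
            return src == null ? "(bootstrap or unknown)"
                               : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the NoSuchMethodError above.
        String name = args.length > 0 ? args[0]
                : "org.joda.time.format.DateTimeFormatter";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the printed location is jruby-complete-1.6.8.jar rather than joda-time-2.9.3.jar, the old joda-time classes are still being picked up first.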
 

 

> AWS Glue Catalog Not Supported
> ------------------------------
>
>                 Key: KYLIN-3685
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3685
>             Project: Kylin
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: v2.5.0
>            Reporter: Richard Jarvis
>            Assignee:  Kaige Liu
>            Priority: Major
>
> I am trying to use Kylin on AWS (EMR 5.18.0).
> I use AWS Glue as the catalog and as a result Kylin can't find the tables. 
> I am able to see the schemas and tables in the GUI because I have set the AWS 
> glue properties in hive-site.xml:
>  
> <property>
> <name>hive.metastore.client.factory.class</name>    
> <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
> However, the job 
> org.apache.kylin.source.hive.cardinality.HiveColumnCardinalityJob fails to 
> find the tables (it's looking in the Hive metadata catalog instead of AWS 
> Glue).
> I think this is because Hive 1.2.1 is too old to support the client factory 
> class.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
