[ https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008482#comment-17008482 ]
Xing Pan commented on HUDI-376:
-------------------------------

[~xleesf] sorry for the delayed response.

I'd like to send a PR, but I think the script "run_sync_tool.sh" in the GitHub repo is different from the script shipped in EMR, and I am not sure where the source code of the EMR version of "run_sync_tool.sh" lives.

But I can certainly send a PR to add documentation for aws-configs.

> AWS Glue dependency issue for EMR 5.28.0
> ----------------------------------------
>
>                 Key: HUDI-376
>                 URL: https://issues.apache.org/jira/browse/HUDI-376
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Usability
>            Reporter: Xing Pan
>            Priority: Minor
>             Fix For: 0.5.1
>
>
> Hi hudi team, it's really encouraging that Hudi is finally an officially
> supported application on AWS EMR. Great job!
> I found a *ClassNotFound* exception when using:
> {code:java}
> /usr/lib/hudi/bin/run_sync_tool.sh
> {code}
> on the EMR master node.
> I think this is due to a missing AWS Glue SDK dependency. (I use AWS
> Glue as the Hive metadata store.)
> So I added a line to run_sync_tool.sh as a quick fix:
> {code:java}
> HIVE_JARS=$HIVE_JARS:/usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar:/usr/share/aws/emr/emr-metrics-collector/lib/aws-java-sdk-glue-1.11.475.jar{code}
> I am not sure whether any more jars are needed, but these two jars fixed my problem.
>
> It would be great if Glue were taken into consideration in the EMR scripts.
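The quick fix quoted above hard-codes two jar paths, one of which includes an SDK version number. A slightly more defensive variant might append each jar only if it actually exists on the node. This is a minimal sketch, assuming the two paths from the report (they may differ on other EMR releases); `append_existing_jars` is a hypothetical helper and is not part of Hudi or of the EMR copy of run_sync_tool.sh:

```shell
# Hypothetical helper: append every existing jar to a colon-separated classpath.
# The body runs in a subshell, so its variables do not leak into the caller.
append_existing_jars() (
  classpath="$1"
  shift
  for jar in "$@"; do
    if [ -f "$jar" ]; then
      # Add a ':' separator only when the classpath is already non-empty.
      classpath="${classpath:+$classpath:}$jar"
    else
      echo "warning: jar not found, skipping: $jar" >&2
    fi
  done
  printf '%s\n' "$classpath"
)

# Jar paths taken from the report above; assumed to match EMR 5.28.0 only.
HIVE_JARS=$(append_existing_jars "${HIVE_JARS:-}" \
  /usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar \
  /usr/share/aws/emr/emr-metrics-collector/lib/aws-java-sdk-glue-1.11.475.jar)
```

With this shape, a jar that is missing (for example because the SDK version differs on the node) produces a warning on stderr instead of a dangling classpath entry.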