[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2021-05-25 Thread Purushotham Pushpavanthar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351319#comment-17351319
 ] 

Purushotham Pushpavanthar commented on HUDI-376:


[~XingXPan] this is a good catch. Could you please let me know how to form the 
JDBC URL for the AWS Glue metastore to use with run_sync_tool.sh?

> AWS Glue dependency issue for EMR 5.28.0
>
> Key: HUDI-376
> URL: https://issues.apache.org/jira/browse/HUDI-376
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Usability
> Reporter: Xing Pan
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.5.1
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Hi Hudi team, it's really encouraging that Hudi is finally an officially 
> supported application on AWS EMR. Great job!
> I found a *ClassNotFound* exception when using:
> {code:java}
> /usr/lib/hudi/bin/run_sync_tool.sh
> {code}
> on the EMR master node.
> I think this is because the tool requires the AWS Glue SDK dependency. (I used 
> AWS Glue as the Hive metastore.)
> So I added a line to run_sync_tool.sh as a quick fix:
> {code:java}
> HIVE_JARS=$HIVE_JARS:/usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar:/usr/share/aws/emr/emr-metrics-collector/lib/aws-java-sdk-glue-1.11.475.jar
> {code}
> I am not sure whether any more jars are needed, but these two jars fixed my problem.
>
> I think it would be great to take Glue into consideration in the EMR scripts.
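
The one-line fix quoted above hardcodes both jar paths, including the SDK version number. A slightly more defensive variant is sketched below: it appends each jar to HIVE_JARS only when the file is present on the node. This is a sketch and not the actual EMR script; the helper name append_if_present and the two GLUE_* variable names are hypothetical, and the jar paths are simply the ones from the report, which may differ on other EMR releases.

```shell
#!/usr/bin/env bash
# Sketch only: append the AWS Glue jars to HIVE_JARS when they exist.
# append_if_present is a hypothetical helper, not part of run_sync_tool.sh.

# Usage: append_if_present CLASSPATH JAR
# Prints CLASSPATH with JAR appended if JAR is a readable file,
# otherwise prints CLASSPATH unchanged.
append_if_present() {
  local classpath="$1" jar="$2"
  if [ -r "$jar" ]; then
    if [ -n "$classpath" ]; then
      printf '%s:%s' "$classpath" "$jar"
    else
      printf '%s' "$jar"
    fi
  else
    printf '%s' "$classpath"
  fi
}

# Paths as reported for EMR 5.28.0; the SDK version is an assumption elsewhere.
GLUE_CLIENT_JAR=/usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar
GLUE_SDK_JAR=/usr/share/aws/emr/emr-metrics-collector/lib/aws-java-sdk-glue-1.11.475.jar

HIVE_JARS=$(append_if_present "${HIVE_JARS:-}" "$GLUE_CLIENT_JAR")
HIVE_JARS=$(append_if_present "$HIVE_JARS" "$GLUE_SDK_JAR")
```

On a node without the Glue jars, HIVE_JARS is left unchanged, so the same snippet should be harmless on clusters that do not use Glue as the metastore.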



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2020-01-05 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008534#comment-17008534
 ] 

leesf commented on HUDI-376:


[~XingXPan] Thanks, I will take a look when I get a chance.



[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2020-01-05 Thread Xing Pan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008528#comment-17008528
 ] 

Xing Pan commented on HUDI-376:

PR: [https://github.com/apache/incubator-hudi/pull/1189]
[~xleesf]



[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2020-01-05 Thread Xing Pan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008482#comment-17008482
 ] 

Xing Pan commented on HUDI-376:

[~xleesf] sorry for the delayed response.

I'd like to send a PR, but I think the "run_sync_tool.sh" script in the GitHub 
repo is different from the script on EMR.

I am not sure where the source code of the EMR version of "run_sync_tool.sh" is.

But I can certainly send a PR to add documentation to the aws-configs section.



[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2020-01-04 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008120#comment-17008120
 ] 

Vinoth Chandar commented on HUDI-376:

+1 .. we can improve the docs and it should help



[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2020-01-04 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008064#comment-17008064
 ] 

leesf commented on HUDI-376:


We could fix it quickly by documenting the dependency, just as in 
https://hudi.apache.org/s3_hoodie.html#aws-configs, and add the above two jars 
under _AWS Libs_? cc [~vinoth] WDYT?



[jira] [Commented] (HUDI-376) AWS Glue dependency issue for EMR 5.28.0

2019-12-11 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994048#comment-16994048
 ] 

leesf commented on HUDI-376:


[~XingXPan] Would you please send a PR to fix it?
