[
https://issues.apache.org/jira/browse/HUDI-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272397#comment-17272397
]
Vinoth Govindarajan commented on HUDI-824:
------------------------------------------
[~nagarwal] - All Apache project artifacts are published to Maven Central and
can be used directly with the `--packages` option. I tried it with spark-shell
and it worked:
{code:bash}
spark-shell \
  --packages org.apache.hudi:hudi-spark-bundle_2.12:0.7.0,org.apache.spark:spark-avro_2.12:3.0.1 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
{code}
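
For completeness, the same coordinates work when launching PySpark; this is a sketch assuming the same Spark 3.0.x / Scala 2.12 build as above:
{code:bash}
# Launch PySpark with the Hudi bundle pulled from Maven Central via --packages.
# Assumes Spark 3.0.x built against Scala 2.12 (matching the _2.12 artifact suffix).
pyspark \
  --packages org.apache.hudi:hudi-spark-bundle_2.12:0.7.0,org.apache.spark:spark-avro_2.12:3.0.1 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
{code}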
The same instructions have been updated in the following doc:
[https://hudi.apache.org/docs/quick-start-guide.html]
No further action is needed; let me know if it's okay to close this issue.
> Register hudi-spark package with spark packages repo for easier usage of Hudi
> -----------------------------------------------------------------------------
>
> Key: HUDI-824
> URL: https://issues.apache.org/jira/browse/HUDI-824
> Project: Apache Hudi
> Issue Type: Bug
> Components: Spark Integration
> Reporter: Nishith Agarwal
> Assignee: Vinoth Govindarajan
> Priority: Minor
> Labels: user-support-issues
>
> At the moment, to be able to use Hudi with Spark, users have to do the
> following:
>
> {{spark-2.4.4-bin-hadoop2.7/bin/spark-shell \}}
> {{--jars `ls packaging/hudi-spark-bundle/target/hudi-spark-bundle_2.11-*.*.*-SNAPSHOT.jar` \}}
> {{--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'}}
>
> Ideally, we want to be able to use Hudi as follows:
>
> {{spark-2.4.4-bin-hadoop2.7/bin/spark-shell \}}
> {{--packages org.apache.hudi:hudi-spark-bundle:<version> \}}
> {{--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'}}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)