[
https://issues.apache.org/jira/browse/HUDI-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinoth Chandar updated HUDI-516:
--------------------------------
Fix Version/s: 0.66
> Avoid need to import spark-avro package when submitting Hudi job in spark
> -------------------------------------------------------------------------
>
> Key: HUDI-516
> URL: https://issues.apache.org/jira/browse/HUDI-516
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: Usability
> Reporter: Udit Mehrotra
> Priority: Major
> Fix For: 0.66
>
>
> We are in the process of migrating Hudi to *spark 2.4.4*, switching to
> *spark-avro* from the deprecated *databricks-avro* in
> [https://github.com/apache/incubator-hudi/pull/1005/]
> After this change, users are required to explicitly pull in spark-avro
> when starting spark-shell, using:
> {code:bash}
> --packages org.apache.spark:spark-avro_2.11:2.4.4
> {code}
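> A full launch command might then look like the following (the bundle jar
> name is illustrative only; adjust jar and package versions to your
> environment):
> {code:bash}
> spark-shell \
>   --jars hudi-spark-bundle_2.11-0.5.1-incubating.jar \
>   --packages org.apache.spark:spark-avro_2.11:2.4.4
> {code}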
> This is because we no longer shade this dependency in *hudi-spark-bundle*.
> One reason for not shading it is that we are unsure of the implications of
> shading a Spark dependency in a jar that is itself submitted to Spark.
> [~vinoth] pointed out one possible concern: we would always be shading
> spark-avro 2.4.4, which could affect users running newer versions of
> Spark.
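> For reference, shading spark-avro would mean a maven-shade-plugin
> relocation in the *hudi-spark-bundle* pom along these lines (a sketch
> only, not the actual bundle configuration):
> {code:xml}
> <relocation>
>   <pattern>org.apache.spark.sql.avro</pattern>
>   <shadedPattern>org.apache.hudi.org.apache.spark.sql.avro</shadedPattern>
> </relocation>
> {code}
> This pins whatever spark-avro version is bundled at build time, which is
> exactly the version-mismatch concern raised above.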
> This Jira is to come up with a way to solve this usability issue.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)