Udit Mehrotra created HUDI-516:
----------------------------------
Summary: Avoid need to import spark-avro package when submitting
Hudi job in spark
Key: HUDI-516
URL: https://issues.apache.org/jira/browse/HUDI-516
Project: Apache Hudi (incubating)
Issue Type: Improvement
Components: Usability
Reporter: Udit Mehrotra
We are in the process of migrating Hudi to *Spark 2.4.4* and using *spark-avro*
instead of the deprecated *databricks-avro*; see
[https://github.com/apache/incubator-hudi/pull/1005/].
After this change, users will need to explicitly pull in spark-avro
when starting spark-shell, using:
{code:java}
--packages org.apache.spark:spark-avro_2.11:2.4.4
{code}
This is because we are currently not shading it in *hudi-spark-bundle*. One
reason for not shading it is that we are unsure of the implications of shading
a Spark dependency in a jar that is itself submitted to Spark. [~vinoth]
pointed out a possible concern: we would always be shading spark-avro 2.4.4,
which could affect users running higher versions of Spark.
This Jira is to come up with a way to solve this usability issue.
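For reference, a complete spark-shell invocation under the current behavior
might look like the sketch below. Only the --packages coordinate comes from
this issue; the bundle jar path is a hypothetical placeholder, and the
serializer setting is an assumption based on typical Hudi usage.
{code}
# Sketch only: /path/to/hudi-spark-bundle.jar is a placeholder for a locally
# built bundle; the Kryo serializer setting is assumed, not stated in this issue.
spark-shell \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --jars /path/to/hudi-spark-bundle.jar \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer'
{code}
The _2.11 suffix and the 2.4.4 version in the coordinate have to match the
Scala and Spark versions of the cluster, which is the coupling that makes
shading a fixed spark-avro version risky.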