Ah, lovely question.

tl;dr version:
Spark depends on Hive, so Hive should be upgraded first.
Spark depends on two versions of Hive: a fork of Hive 1.x maintained by
Spark, and upstream Hive 2.x.
Upgrading the first is not even being discussed at the moment. For the
second, I added a patch that passes all tests if you run it against Spark
2.4/master, but Hive uses a forked version of Spark 2.3 to run its tests
(YES, CIRCULAR DEPENDENCY!!!)

One extra point pushing things in the right direction is that Parquet and
Iceberg have already moved to Avro 1.9.x, so pressure is growing for things
to move. It is still a mess, but we want to put up a fight. One thing is
sure: it won't be for Spark 3.0.0. Best case is 3.1.x, and that also
depends on the goodwill of the Hive contributors, who have ignored my
emails and patches for some time:
https://lists.apache.org/thread.html/rc6c672ad4a5e255957d54d80ff83bf48eacece2828a86bc6cedd9c4c%40%3Cdev.hive.apache.org%3E

For the full details of the saga:
https://issues.apache.org/jira/browse/SPARK-27733
https://issues.apache.org/jira/browse/HIVE-21737


On Fri, Feb 14, 2020 at 5:04 PM Michael Heuer <heue...@gmail.com> wrote:

> Hello,
>
> I wonder if any Avro devs might be willing to help push a PR for Apache
> Spark to update the Avro dependency from 1.8.2 to 1.9.2?
>
> I foresee some trouble with binary incompatible code changes and
> dependency version conflicts, and could use some additional support.
>
> Thank you in advance,
>
>    michael
