Ah, lovely question.

tl;dr version:

- Spark depends on Hive, so Hive should be upgraded first.
- Spark depends on two versions of Hive: a fork by Spark of 1.x, and upstream Hive 2.x.
- Upgrading the first is not even being discussed at the moment. For the second I added a patch that passes all tests if you run it against Spark 2.4/master, but Hive uses a forked version of Spark 2.3 to run its tests (YES, A CIRCULAR DEPENDENCY!!!).

One extra point that is pushing things in the right direction is that Parquet and Iceberg have already moved to Avro 1.9.x, so pressure is growing for things to move. It is still a mess, but we want to put up a fight. One thing is sure: it won't be for Spark 3.0.0. Best case is 3.1.x, and that also depends on the goodwill of the Hive contributors, who have ignored my emails + patches for some time.

https://lists.apache.org/thread.html/rc6c672ad4a5e255957d54d80ff83bf48eacece2828a86bc6cedd9c4c%40%3Cdev.hive.apache.org%3E

For the full details on the saga:

https://issues.apache.org/jira/browse/SPARK-27733
https://issues.apache.org/jira/browse/HIVE-21737

On Fri, Feb 14, 2020 at 5:04 PM Michael Heuer <heue...@gmail.com> wrote:

> Hello,
>
> I wonder if any Avro devs might be willing to help push a PR for Apache
> Spark to update the Avro dependency from 1.8.2 to 1.9.2?
>
> I foresee some trouble with binary incompatible code changes and
> dependency version conflicts, and could use some additional support.
>
> Thank you in advance,
>
> michael