I had not tried to upgrade Spark since I assumed it would fail
because of the transitive dependencies of Hive.

But I decided to give it a shot today. Luckily the Spark code base is quite
Avro-friendly, so code-wise it was 'easy'.

Of course it is still failing, but you can use it as a reference for the
other PR.
And if you can find fixes for the pending issues, that would be great.

https://github.com/apache/spark/pull/27609

Regards,
Ismaël


On Fri, Feb 14, 2020 at 5:42 PM Michael Heuer <heue...@gmail.com> wrote:

> Hello Ismaël,
>
> Might you be able to share a link to your patch for Spark?  I would like
> to try to apply it on top of
>
> https://github.com/apache/spark/pull/26804
>
> which attempts to upgrade the Parquet dependency for Spark to 1.11.0.
>
> Thank you,
>
>    michael
>
>
> > On Feb 14, 2020, at 10:30 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
> >
> > Ah lovely question.
> >
> > tl;dr version:
> > Spark depends on Hive, so Hive should be upgraded first.
> > Spark depends on two versions of Hive: a fork by Spark of 1.x and upstream
> > Hive 2.x.
> > Upgrading the first is not even being discussed at the moment. For the
> > second I added a patch that passes all tests if you run it against Spark
> > 2.4/master, but Hive uses a forked version of Spark 2.3 to run its tests
> > (YES, CIRCULAR DEPENDENCY!!!)
> >
> > One extra point pushing things in the right direction is that Parquet and
> > Iceberg have already moved to Avro 1.9.x, so pressure is growing for things
> > to move. It is still a mess, but we want to put up a fight. One thing is
> > sure: it won't be for Spark 3.0.0, best case 3.1.x, and that also depends
> > on the goodwill of the Hive contributors, who have ignored my emails and
> > patches for some time.
> >
> > https://lists.apache.org/thread.html/rc6c672ad4a5e255957d54d80ff83bf48eacece2828a86bc6cedd9c4c%40%3Cdev.hive.apache.org%3E
> >
> > For the full details on the saga:
> > https://issues.apache.org/jira/browse/SPARK-27733
> > https://issues.apache.org/jira/browse/HIVE-21737
> >
> >
> > On Fri, Feb 14, 2020 at 5:04 PM Michael Heuer <heue...@gmail.com> wrote:
> >
> >> Hello,
> >>
> >> I wonder if any Avro devs might be willing to help push a PR for Apache
> >> Spark to update the Avro dependency from 1.8.2 to 1.9.2?
> >>
> >> I foresee some trouble with binary incompatible code changes and
> >> dependency version conflicts, and could use some additional support.
> >>
> >> Thank you in advance,
> >>
> >>   michael
>
>
