[
https://issues.apache.org/jira/browse/SPARK-20958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034950#comment-16034950
]
Ryan Blue commented on SPARK-20958:
-----------------------------------
I don't think it is a good idea to roll back. Spark doesn't depend on
parquet-avro, where the update to Avro 1.8.1 was made, except in tests, where
that is fine. The backports for Spark in 1.8.2 are worth keeping, since there are
reasonable work-arounds in user projects.
The problem that I've seen on the dev list is when users add parquet-avro to
their dependencies and its version gets managed to 1.8.2. That version requires
Avro 1.8.1 because parquet-avro calls {{getSchema}} on avro-specific objects.
But there are a couple of reasonable ways to deal with this:
1. Specify a dependency on parquet-avro 1.8.1 that still uses Avro 1.7.x.
Parquet is backward-compatible with older binaries, so parquet-avro 1.8.1 works
fine with parquet-hadoop 1.8.2. (This is the recommended work-around.)
2. Shade and relocate Avro 1.8.1 in application Jars, so that Spark can use
1.7.x and parquet-avro can use 1.8.1.
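For illustration, the two work-arounds above might look roughly like this in an sbt build. This is only a sketch under assumptions that are not part of the original discussion: the project is built with sbt, the sbt-assembly plugin is enabled, and the {{shaded.}} package prefix is a made-up name. Maven users would do the equivalent with a pinned dependency version or a maven-shade-plugin relocation.

```scala
// build.sbt (sketch only; assumes sbt with the sbt-assembly plugin enabled)

// Work-around 1: depend on parquet-avro 1.8.1, which still uses Avro 1.7.x,
// and keep dependency management from moving it to 1.8.2.
libraryDependencies += "org.apache.parquet" % "parquet-avro" % "1.8.1"
dependencyOverrides += "org.apache.parquet" % "parquet-avro" % "1.8.1"

// Work-around 2 (an alternative to 1, normally not combined with it):
// keep parquet-avro 1.8.2 and Avro 1.8.1 in the application jar, but relocate
// the Avro classes so they cannot clash with the Avro 1.7.x that Spark uses.
// "shaded." is a hypothetical prefix; any unused package name works.
// On older sbt versions the key is written `assemblyShadeRules in assembly`.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("org.apache.avro.**" -> "shaded.org.apache.avro.@1").inAll
)
```

Only one of the two blocks should be needed in a given project. The first is the smaller change and matches the recommended work-around.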
This was brought up on the dev list, but the user dismissed these work-arounds
without trying them.
Long-term, we can do a 1.8.3 release to solve this problem, though I think the
best solution there would be to stop using {{getSchema}} instead of downgrading
the dependency.
> Roll back parquet-mr 1.8.2 to parquet-1.8.1
> -------------------------------------------
>
> Key: SPARK-20958
> URL: https://issues.apache.org/jira/browse/SPARK-20958
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Cheng Lian
> Assignee: Cheng Lian
>
> We recently realized that parquet-mr 1.8.2 used by Spark 2.2.0-rc2 depends on
> avro 1.8.1, which is incompatible with avro 1.7.6 used by parquet-mr 1.8.1
> and avro 1.7.7 used by spark-core 2.2.0-rc2.
> Basically, Spark 2.2.0-rc2 introduced two incompatible versions of avro
> (1.7.7 and 1.8.1). Upgrading avro 1.7.7 to 1.8.1 is not preferable due to the
> reasons mentioned in [PR
> #17163|https://github.com/apache/spark/pull/17163#issuecomment-286563131].
> Therefore, we don't really have many choices here and have to roll back
> parquet-mr 1.8.2 to 1.8.1 to resolve this dependency conflict.