[
https://issues.apache.org/jira/browse/SPARK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-32385:
----------------------------------
Affects Version/s: 3.1.0 (was: 2.4.6, 3.0.0)
> Publish a "bill of materials" (BOM) descriptor for Spark with correct
> versions of various dependencies
> ------------------------------------------------------------------------------------------------------
>
> Key: SPARK-32385
> URL: https://issues.apache.org/jira/browse/SPARK-32385
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 3.1.0
> Reporter: Vladimir Matveev
> Priority: Major
>
> Spark has a lot of dependencies, many of them very common (e.g. Guava,
> Jackson). The versions of these dependencies are not updated as frequently
> as they are released upstream, which is understandable, but it means that
> Spark often depends on an older version of a library that is incompatible
> with a more recent version of the same library. This incompatibility can
> manifest in different ways, e.g. as classpath errors or as runtime check
> errors (as with Jackson).
>
> Spark does attempt to "fix" the versions of its dependencies by declaring
> them explicitly in its {{pom.xml}} file. However, this approach, while
> somewhat workable if the Spark-using project itself uses Maven, breaks down
> if another build system such as Gradle is used. The reason is that Maven uses
> an unconventional "nearest first" version conflict resolution strategy,
> while many other tools, Gradle included, use a "highest first" strategy that
> resolves to the highest version number found anywhere in the dependency
> graph. This means that other dependencies of the project can pull in a
> higher version of some dependency, one that is incompatible with Spark.
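
To make the difference between the two strategies concrete, here is a minimal
sketch in Python. It is a toy model, not how Maven or Gradle are actually
implemented: the graph, the artifact names "other-lib" and "helper", and all
version numbers are made up for illustration.

```python
from collections import deque

# Hypothetical toy graph: (artifact, version) -> declared dependencies.
GRAPH = {
    ("app", "1.0"): [("spark-core", "3.0.0"), ("other-lib", "1.0")],
    ("spark-core", "3.0.0"): [("jackson-databind", "2.10.0")],
    ("other-lib", "1.0"): [("helper", "1.0")],
    ("helper", "1.0"): [("jackson-databind", "2.11.0")],
}


def version_key(version):
    """Turn "2.10.0" into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def nearest_first(root):
    """Breadth-first walk: the shallowest declaration of an artifact wins."""
    chosen = {}
    queue = deque([root])
    while queue:
        name, version = queue.popleft()
        if name in chosen:
            continue  # a nearer declaration already fixed this artifact
        chosen[name] = version
        queue.extend(GRAPH.get((name, version), []))
    return chosen


def highest_first(root):
    """Visit every declaration and keep the highest version per artifact.

    Simplification: real tools do not follow the dependencies of versions
    that lose the conflict; this toy walk visits everything.
    """
    chosen = {}
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        name, version = node
        if name not in chosen or version_key(version) > version_key(chosen[name]):
            chosen[name] = version
        stack.extend(GRAPH.get(node, []))
    return chosen


root = ("app", "1.0")
print(nearest_first(root)["jackson-databind"])
print(highest_first(root)["jackson-databind"])
```

In this toy graph the Spark-declared version sits closer to the root, so the
nearest-first walk keeps it, while the highest-first walk picks the newer
version declared deeper in the graph by an unrelated library.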
>
> One example would be an explicit or a transitive dependency on a higher
> version of Jackson in the project. Spark itself depends on several Jackson
> modules; if only one of them gets a higher version and the others remain on
> the lower version, this will result in runtime exceptions due to an internal
> version check in Jackson.
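
A build could in principle catch this situation before runtime. The following
Python sketch flags a resolved dependency set in which Jackson modules ended up
on different minor release lines. It is only an illustration: it is not
Jackson's own check, the function name find_mixed_versions is hypothetical,
and the version numbers are examples rather than Spark's actual pins.

```python
def minor_line(version):
    """Reduce "2.10.0" to its (major, minor) release line, e.g. (2, 10)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def find_mixed_versions(resolved, prefix="jackson-"):
    """Group modules with the given prefix by release line.

    Returns an empty dict when all matching modules share one release line,
    otherwise a dict mapping each release line to the modules on it.
    """
    lines = {}
    for name, version in sorted(resolved.items()):
        if name.startswith(prefix):
            lines.setdefault(minor_line(version), []).append(name)
    return lines if len(lines) > 1 else {}


# Example versions only: one module was bumped by some other dependency.
resolved = {
    "jackson-databind": "2.11.0",
    "jackson-core": "2.10.0",
    "jackson-module-scala": "2.10.0",
    "guava": "14.0.1",
}

for line, modules in find_mixed_versions(resolved).items():
    print(line, modules)
```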
>
> A widely used solution for this kind of version issue is to publish a
> "bill of materials" descriptor (see
> [https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html]
> and
> [https://docs.gradle.org/current/userguide/platforms.html#sub:bom_import]).
> This descriptor would list the versions of all of Spark's dependencies;
> downstream projects would then be able to use their build system's BOM
> support to enforce the version constraints required for Spark to function
> correctly.
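
Conceptually, enforcing a BOM amounts to overriding whatever the resolver
picked with the versions the BOM lists. A minimal Python sketch of that idea
follows, assuming strict pinning; how strictly a BOM is applied depends on the
build tool and on how the BOM is imported. The names SPARK_BOM and apply_bom
and all versions here are hypothetical, since no such descriptor exists yet.

```python
# Hypothetical BOM contents: artifact -> version Spark was built against.
SPARK_BOM = {
    "jackson-databind": "2.10.0",
    "jackson-core": "2.10.0",
}


def apply_bom(resolved, bom):
    """Return a copy of `resolved` with every BOM-listed artifact pinned.

    Artifacts the BOM does not mention keep their resolved version.
    """
    return {name: bom.get(name, version) for name, version in resolved.items()}


# What a highest-first resolver might have picked without the BOM.
resolved = {
    "jackson-databind": "2.11.0",
    "jackson-core": "2.10.0",
    "other-lib": "1.0",
}

pinned = apply_bom(resolved, SPARK_BOM)
for name, version in sorted(pinned.items()):
    print(name, version)
```

The point of the design is that the pinning is applied to the whole set at
once, so related modules cannot drift apart one at a time.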
>
> One example of a successful implementation of the BOM-based approach is Spring:
> [https://www.baeldung.com/spring-maven-bom#spring-bom]. For different Spring
> projects, e.g. Spring Boot, BOM descriptors are published that downstream
> projects can use to fix the versions of Spring components and their
> dependencies, significantly reducing confusion around proper version
> numbers.