This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 76b1c122cb7d [SPARK-44115][BUILD] Upgrade Apache ORC to 2.0.0
76b1c122cb7d is described below
commit 76b1c122cb7d77e8f175b25b935b9296a669d5d8
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Fri Mar 8 13:31:10 2024 -0800
[SPARK-44115][BUILD] Upgrade Apache ORC to 2.0.0
### What changes were proposed in this pull request?
This PR aims to upgrade Apache ORC to 2.0.0 for Apache Spark 4.0.0.
The Apache ORC community has a 3-year support policy, which is longer than
Apache Spark's. The release lines are aligned as follows.
- Apache ORC 2.0.x <-> Apache Spark 4.0.x
- Apache ORC 1.9.x <-> Apache Spark 3.5.x
- Apache ORC 1.8.x <-> Apache Spark 3.4.x
- Apache ORC 1.7.x (Supported) <-> Apache Spark 3.3.x (End-Of-Support)
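
As an illustration of the pairing above, a downstream Maven build that manages its own ORC dependency could pin the same release line. This is only a minimal sketch: the surrounding project is hypothetical, and the property names simply mirror the ones Spark's root `pom.xml` uses in the diff below.

```xml
<!-- Hypothetical downstream pom.xml fragment (not part of this commit). -->
<!-- Assumption: the project declares ORC itself instead of inheriting it from Spark. -->
<properties>
  <!-- ORC 2.0.x pairs with Spark 4.0.x per the list above. -->
  <orc.version>2.0.0</orc.version>
  <!-- Same classifier Spark uses for its ORC artifacts. -->
  <orc.classifier>shaded-protobuf</orc.classifier>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.orc</groupId>
    <artifactId>orc-core</artifactId>
    <version>${orc.version}</version>
    <classifier>${orc.classifier}</classifier>
  </dependency>
</dependencies>
```

Projects that only consume ORC transitively through Spark would not need this; the fragment matters only when the ORC version is declared explicitly.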
### Why are the changes needed?
**Release Note**
- https://github.com/apache/orc/releases/tag/v2.0.0
**Milestone**
- https://github.com/apache/orc/milestone/20?closed=1
- https://github.com/apache/orc/pull/1728
- https://github.com/apache/orc/issues/1801
- https://github.com/apache/orc/issues/1498
- https://github.com/apache/orc/pull/1627
- https://github.com/apache/orc/issues/1497
- https://github.com/apache/orc/pull/1509
- https://github.com/apache/orc/pull/1554
- https://github.com/apache/orc/pull/1708
- https://github.com/apache/orc/pull/1733
- https://github.com/apache/orc/pull/1760
- https://github.com/apache/orc/pull/1743
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #45443 from dongjoon-hyun/SPARK-44115.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
dev/deps/spark-deps-hadoop-3-hive-2.3 | 7 ++++---
pom.xml | 17 ++++++++++++++++-
sql/core/pom.xml | 5 +++++
3 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 7e56e8914435..6b357b3e4b70 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -227,9 +227,10 @@ opencsv/2.3//opencsv-2.3.jar
opentracing-api/0.33.0//opentracing-api-0.33.0.jar
opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.9.2/shaded-protobuf/orc-core-1.9.2-shaded-protobuf.jar
-orc-mapreduce/1.9.2/shaded-protobuf/orc-mapreduce-1.9.2-shaded-protobuf.jar
-orc-shims/1.9.2//orc-shims-1.9.2.jar
+orc-core/2.0.0/shaded-protobuf/orc-core-2.0.0-shaded-protobuf.jar
+orc-format/1.0.0/shaded-protobuf/orc-format-1.0.0-shaded-protobuf.jar
+orc-mapreduce/2.0.0/shaded-protobuf/orc-mapreduce-2.0.0-shaded-protobuf.jar
+orc-shims/2.0.0//orc-shims-2.0.0.jar
oro/2.0.8//oro-2.0.8.jar
osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 9f1c9ed13f23..404f37be1b5a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
<!-- After 10.17.1.0, the minimum required version is JDK19 -->
<derby.version>10.16.1.1</derby.version>
<parquet.version>1.13.1</parquet.version>
- <orc.version>1.9.2</orc.version>
+ <orc.version>2.0.0</orc.version>
<orc.classifier>shaded-protobuf</orc.classifier>
<jetty.version>11.0.20</jetty.version>
<jakartaservlet.version>5.0.0</jakartaservlet.version>
@@ -2593,6 +2593,13 @@
</exclusions>
</dependency>
+ <dependency>
+ <groupId>org.apache.orc</groupId>
+ <artifactId>orc-format</artifactId>
+ <version>1.0.0</version>
+ <classifier>${orc.classifier}</classifier>
+ <scope>${orc.deps.scope}</scope>
+ </dependency>
<dependency>
<groupId>org.apache.orc</groupId>
<artifactId>orc-core</artifactId>
@@ -2600,6 +2607,14 @@
<classifier>${orc.classifier}</classifier>
<scope>${orc.deps.scope}</scope>
<exclusions>
+ <exclusion>
+ <groupId>org.apache.orc</groupId>
+ <artifactId>orc-format</artifactId>
+ </exclusion>
+ <exclusion>
+ <groupId>com.aayushatharva.brotli4j</groupId>
+ <artifactId>brotli4j</artifactId>
+ </exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
diff --git a/sql/core/pom.xml b/sql/core/pom.xml
index 0ad9e0f690c7..05f906206e5e 100644
--- a/sql/core/pom.xml
+++ b/sql/core/pom.xml
@@ -93,6 +93,11 @@
<groupId>org.scala-lang.modules</groupId>
<artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
</dependency>
+ <dependency>
+ <groupId>org.apache.orc</groupId>
+ <artifactId>orc-format</artifactId>
+ <classifier>${orc.classifier}</classifier>
+ </dependency>
<dependency>
<groupId>org.apache.orc</groupId>
<artifactId>orc-core</artifactId>