GitHub user dongjoon-hyun opened a pull request:
https://github.com/apache/spark/pull/21582
[SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
## What changes were proposed in this pull request?
This PR upgrades the Apache ORC library from 1.4.4 to 1.5.1 in order
to bring the following benefits to Apache Spark.
- [ORC-91](https://issues.apache.org/jira/browse/ORC-91) Support for
variable-length blocks in HDFS (the space ORC currently wastes on padding is
known to be 5%)
- [ORC-344](https://issues.apache.org/jira/browse/ORC-344) Support for
using Decimal64ColumnVector
In addition, Apache Hive 3.1.0 will use ORC 1.5.1
([HIVE-19669](https://issues.apache.org/jira/browse/HIVE-19669)). This will
improve the compatibility between Apache Spark and Apache Hive by sharing the
common library.
## How was this patch tested?
Pass Jenkins with all existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dongjoon-hyun/spark SPARK-24576
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21582.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21582
----
commit 60e461ee78e0b601e3f7bf7927730e0dabc234ef
Author: Dongjoon Hyun <dongjoon@...>
Date: 2018-06-13T04:19:56Z
[SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
----