[
https://issues.apache.org/jira/browse/HUDI-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-4496:
-----------------------------
Status: Patch Available (was: In Progress)
> ORC fails w/ Spark 3.1
> ----------------------
>
> Key: HUDI-4496
> URL: https://issues.apache.org/jira/browse/HUDI-4496
> Project: Apache Hudi
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Alexey Kudinkin
> Assignee: Alexey Kudinkin
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.0
>
>
> After running the TestHoodieSparkSqlWriter tests against different Spark versions,
> we discovered that the ORC dependency was incorrectly placed on the classpath as a
> compile-time dependency, breaking ORC writing in Hudi on Spark 3.1:
> [https://github.com/apache/hudi/runs/7567326789?check_suite_focus=true]
>
> *--- UPDATE ---*
> Unfortunately, it turned out that between Spark 2.4 and Spark 3.0, Spark moved its
> ORC dependency from the "nohive" classifier (which depends on its own cloned
> versions of the Hive interfaces) to the standard one (which depends on Hive's
> interfaces), and that makes ORC compatibility with both Spark 2.x and Spark 3.x
> very complicated.
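> To illustrate the difference, the two Spark lines pull in different ORC artifacts
> under the same coordinates; the snippet below is only a sketch (the version numbers
> are placeholders, not taken from Hudi's or Spark's actual poms):
> {code:xml}
> <!-- Spark 2.4.x: "nohive" classifier, shipping its own shaded Hive interfaces -->
> <dependency>
>   <groupId>org.apache.orc</groupId>
>   <artifactId>orc-core</artifactId>
>   <version>1.5.5</version> <!-- illustrative version -->
>   <classifier>nohive</classifier>
> </dependency>
>
> <!-- Spark 3.x: standard artifact, which depends on Hive's own interfaces -->
> <dependency>
>   <groupId>org.apache.orc</groupId>
>   <artifactId>orc-core</artifactId>
>   <version>1.6.12</version> <!-- illustrative version -->
> </dependency>
> {code}
> Because the two artifacts expose incompatible shading of the Hive interfaces, a
> single Hudi module cannot satisfy both at once without per-Spark-version profiles.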
> After extensive deliberation, and after gauging the interest in ORC support on
> Hudi's Spark 2.4.x branch, we took the hard decision to drop ORC support for
> Spark 2.x in Hudi's 0.13 release, and instead fix it to work in the Spark 3.x
> module.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)