[ https://issues.apache.org/jira/browse/HUDI-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Kudinkin updated HUDI-4496:
----------------------------------
    Description: 
After running the TestHoodieSparkSqlWriter test against different Spark
versions, we discovered that an Orc version was incorrectly placed on the
classpath as a compile-time dependency, breaking Orc writing in Hudi on
Spark 3.1:

[https://github.com/apache/hudi/runs/7567326789?check_suite_focus=true]

 

*--- UPDATE ---*

Unfortunately, it turned out that between Spark 2.4 and Spark 3.0, Spark
switched its Orc dependency from the "nohive" classifier (which depends on its
own cloned versions of the interfaces) to the standard one (which depends on
Hive's interfaces), which makes supporting Orc on both Spark 2 and Spark 3.x
very complicated.

After extensive deliberation, and after gauging the interest in Orc support in
the Spark 2.4.x branch of Hudi, we made the hard decision to drop Orc support
for Spark 2.x in Hudi's 0.13 release and instead fix it to work in the Spark
3.x module.

  was:
After running TestHoodieSparkSqlWriter test for different Spark versions, 
discovered that Orc version was incorrectly put as compile time dep on the 
classpath, breaking Orc writing in Hudi in Spark 3.1:

https://github.com/apache/hudi/runs/7567326789?check_suite_focus=true


> ORC fails w/ Spark 3.1
> ----------------------
>
>                 Key: HUDI-4496
>                 URL: https://issues.apache.org/jira/browse/HUDI-4496
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>
> After running the TestHoodieSparkSqlWriter test against different Spark
> versions, we discovered that an Orc version was incorrectly placed on the
> classpath as a compile-time dependency, breaking Orc writing in Hudi on
> Spark 3.1:
> [https://github.com/apache/hudi/runs/7567326789?check_suite_focus=true]
>  
> *--- UPDATE ---*
> Unfortunately, it turned out that between Spark 2.4 and Spark 3.0, Spark
> switched its Orc dependency from the "nohive" classifier (which depends on
> its own cloned versions of the interfaces) to the standard one (which depends
> on Hive's interfaces), which makes supporting Orc on both Spark 2 and Spark
> 3.x very complicated.
> After extensive deliberation, and after gauging the interest in Orc support
> in the Spark 2.4.x branch of Hudi, we made the hard decision to drop Orc
> support for Spark 2.x in Hudi's 0.13 release and instead fix it to work in
> the Spark 3.x module.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
