We found that an ORC table created by Spark 2.4 cannot be read by Hive 2.1.1.





spark-sql -e 'CREATE TABLE tmp.orcTable2 USING orc AS SELECT * FROM tmp.orcTable1 LIMIT 10;'

hive -e 'select * from tmp.orcTable2'



The error message from Hive:



Failed with exception java.io.IOException:java.lang.RuntimeException: ORC split 
generation failed with exception: java.lang.ArrayIndexOutOfBoundsException: 6



Spark 2.3.2 (and earlier) works fine.



I think we should revert the commit "[SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.2" 
by Dongjoon Hyun.





---- On Tue, 12 Feb 2019 16:56:09 +0800 Dongjin Lee <dong...@apache.org> wrote ----




> SPARK-23539 is a non-trivial improvement, so probably would not be 
> back-ported to 2.4.x.



Got it. It seems reasonable.



Committers:



Please don't omit SPARK-23539 from 2.5.0. The Kafka community needs this feature.



Thanks,

Dongjin





On Tue, Feb 12, 2019 at 1:50 PM Takeshi Yamamuro <mailto:linguin....@gmail.com> 
wrote:








-- 

Dongjin Lee




A hitchhiker in the mathematical world.




github: https://github.com/dongjinleekr

linkedin: https://kr.linkedin.com/in/dongjinleekr


speakerdeck: https://speakerdeck.com/dongjin










+1, too.

branch-2.4 has accumulated too many commits:

https://github.com/apache/spark/compare/0a4c03f7d084f1d2aa48673b99f3b9496893ce8d...af3c7111efd22907976fc8bbd7810fe3cfd92092





On Tue, Feb 12, 2019 at 12:36 PM Dongjoon Hyun <mailto:dongj...@apache.org> 
wrote:

Thank you, DB.

 

 +1. Yes, it's time to prepare the 2.4.1 release.

 

 Bests,

 Dongjoon.

 

 On 2019/02/12 03:16:05, Sean Owen <mailto:sro...@gmail.com> wrote: 

 > I support a 2.4.1 release now, yes.

 > 

 > SPARK-23539 is a non-trivial improvement, so probably would not be

 > back-ported to 2.4.x. SPARK-26154 does look like a bug whose fix could

 > be back-ported, but that's a big change. I wouldn't hold up 2.4.1 for

 > it, but it could go in if otherwise ready.

 > 

 > 

 > On Mon, Feb 11, 2019 at 5:20 PM Dongjin Lee <mailto:dong...@apache.org> 
 > wrote:

 > >

 > > Hi DB,

 > >

 > > Could you add SPARK-23539[^1] to 2.4.1? I opened the PR[^2] a while 
 > > ago, but it was not included in 2.3.0 and has not received enough review.

 > >

 > > Thanks,

 > > Dongjin

 > >

 > > [^1]: https://issues.apache.org/jira/browse/SPARK-23539

 > > [^2]: https://github.com/apache/spark/pull/22282

 > >

 > > On Tue, Feb 12, 2019 at 6:28 AM Jungtaek Lim <mailto:kabh...@gmail.com> 
 > > wrote:

 > >>

 > >> Given that SPARK-26154 [1] is a correctness issue and a PR [2] has been 
 > >> submitted, I hope it can be reviewed and included in Spark 2.4.1. 
 > >> Otherwise it will be a long-lived correctness issue.

 > >>

 > >> Thanks,

 > >> Jungtaek Lim (HeartSaVioR)

 > >>

 > >> 1. https://issues.apache.org/jira/browse/SPARK-26154

 > >> 2. https://github.com/apache/spark/pull/23634

 > >>

 > >>

 > >> On Tue, Feb 12, 2019 at 6:17 AM, DB Tsai <mailto:d_t...@apple.com.invalid> wrote:

 > >>>

 > >>> Hello all,

 > >>>

 > >>> I am preparing to cut a new Apache Spark 2.4.1 release, as many bugs 
 > >>> and correctness issues have been fixed in branch-2.4.

 > >>>

 > >>> The list of addressed issues is at 
 > >>> https://issues.apache.org/jira/browse/SPARK-26583?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.4.1%20order%20by%20updated%20DESC

 > >>>

 > >>> Let me know if you have any concerns or any PRs you would like to get in.

 > >>>

 > >>> Thanks!

 > >>>

 > >>> ---------------------------------------------------------------------

 > >>> To unsubscribe e-mail: mailto:dev-unsubscr...@spark.apache.org

 > >>>

 > >

 > >


 > 


 > 

 > 

 


 







-- 

---

Takeshi Yamamuro
