[
https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16544398#comment-16544398
]
Hyukjin Kwon commented on SPARK-20202:
--------------------------------------
I think we are still unclear about how to deal with this, and it has been
left open for a while.
[~rxin], do you have a preference among the options in [my comment
above|https://issues.apache.org/jira/browse/SPARK-20202?focusedCommentId=16541034&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16541034]?
1. Go with Saisai's patch in HIVE-16391
- Publishing Hive 1.2.x could be easier for us but adds some overhead on the
Hive side (e.g., maintaining the old branches and backports).
- If I understood correctly, we would have fewer problems (e.g., policy
concerns) if we publish Hive 1.2.x via HIVE-16391.
2. Target the upgrade with [~q79969786]'s fix, and add fixes to our current
fork only when there are strong reasons
- It is difficult, but [~q79969786] has completed an initial attempt at the
upgrade. It still needs further investigation (e.g., see
[SPARK-23710|https://issues.apache.org/jira/browse/SPARK-23710]), but the
attempt at least made the regression tests pass. She is willing to finish this.
- If we miss the Hive upgrade to 2.3.x in Spark 3.0.0, we would probably have
to target Spark 4.0.0 with an even higher Hive version, which I suspect would
make this upgrade even harder.
- It looks like we implicitly agree that this should be the long-term goal.
See also [[email protected]]'s [comment
above|https://issues.apache.org/jira/browse/SPARK-20202?focusedCommentId=16500560&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16500560].
I am re-raising this and giving a refresh here because, personally:
- Few new facts have arrived since this JIRA was opened, so it seemed better
to reconsider the possible options.
- It looks to me like we are quite unclear about how we should get through this.
- I am sure we need to reach a shared understanding on this JIRA, and I need
more support from you all before we go ahead, because these changes would not
be easily revertible.
- branch-2.4 will be cut soon and, if I am not mistaken, we will then move on
to Spark 3.0.0.
I know there are many sensitive things going on here; however, please kindly
consider this and give some input. I am sure we all feel that we should
resolve this.
Lastly, FWIW, I am doing this on my own, as an individual, in case that
matters to anyone.
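For context, the fork being discussed is published under the `org.spark-project.hive` group ID. Spark's build references it roughly as in the sketch below (property names recalled from the Spark 2.x `pom.xml`; the exact version string and layout may differ):

```xml
<!-- Sketch only: how Spark's pom.xml points at the forked Hive artifacts. -->
<!-- Property names/versions are from memory and may not match exactly.    -->
<properties>
  <hive.group>org.spark-project.hive</hive.group>
  <hive.version>1.2.1.spark2</hive.version>
</properties>
...
<dependency>
  <groupId>${hive.group}</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive.version}</version>
</dependency>
```

Resolving this JIRA would mean switching that group back to `org.apache.hive`, either by consuming a properly published Hive 1.2.x (option 1) or by upgrading to Hive 2.3.x (option 2).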
> Remove references to org.spark-project.hive
> -------------------------------------------
>
> Key: SPARK-20202
> URL: https://issues.apache.org/jira/browse/SPARK-20202
> Project: Spark
> Issue Type: Bug
> Components: Build, SQL
> Affects Versions: 1.6.4, 2.0.3, 2.1.1
> Reporter: Owen O'Malley
> Priority: Major
>
> Spark can't continue to depend on their fork of Hive and must move to
> standard Hive versions.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)