[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209970#comment-17209970 ] Apache Spark commented on SPARK-20202: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/29973 > Remove references to org.spark-project.hive > --- > > Key: SPARK-20202 > URL: https://issues.apache.org/jira/browse/SPARK-20202 > Project: Spark > Issue Type: Bug > Components: Build, SQL >Affects Versions: 1.6.4, 2.0.3, 2.1.1, 2.2.3, 2.3.4, 2.4.4, 3.0.0, 3.1.0 >Reporter: Owen O'Malley >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.1.0 > > > Spark can't continue to depend on their fork of Hive and must move to > standard Hive versions. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17206870#comment-17206870 ] Apache Spark commented on SPARK-20202: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/29936
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977904#comment-16977904 ] Dongjoon Hyun commented on SPARK-20202: --- Hi, All. I set the target version to `3.1.0`. Please join the discussion if you have any concerns. - https://lists.apache.org/thread.html/eca4e55c717f35f41c029e227fa9be0a7ee2c8a6f378fcce8f9fd4ff@%3Cdev.spark.apache.org%3E
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769767#comment-16769767 ] t oo commented on SPARK-20202: -- gentle ping
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652440#comment-16652440 ] t oo commented on SPARK-20202: -- bump
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544398#comment-16544398 ] Hyukjin Kwon commented on SPARK-20202: -- I think we are unclear about how we are going to deal with this, and it has been left open for a while. [~rxin], do you have a preference between the options in [my comment above|https://issues.apache.org/jira/browse/SPARK-20202?focusedCommentId=16541034=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16541034]?

1. Go with Saisai's patch in HIVE-16391.
   - Publishing Hive 1.2.x could be easier, but it adds some overhead on the Hive side (e.g., maintaining the old branches and backports).
   - If I understood correctly, we have fewer problems (e.g., policy issues) if we go with publishing Hive 1.2.x via HIVE-16391.
2. Target the upgrade with [~q79969786]'s fix, and add fixes to our current fork when there are strong reasons.
   - It is difficult, but [~q79969786] completed an initial attempt at the upgrade. It still needs further investigation (e.g., see [SPARK-23710|https://issues.apache.org/jira/browse/SPARK-23710]), but the attempt at least made the regression tests pass. She's willing to finish this.
   - If we miss the Hive upgrade to 2.3.x in Spark 3.0.0, we should probably target 4.0.0 with a later version of Hive, which I guess would make this upgrade even harder.
   - It looks like we implicitly agree that this should be the final goal in the long term. See also [~ste...@apache.org]'s [comment above|https://issues.apache.org/jira/browse/SPARK-20202?focusedCommentId=16500560=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16500560].

I am re-raising this and giving a refresher here because I personally see the following:
- A few new facts have arrived since the JIRA was opened, so it looked to me like it might be better to consider the possible options again.
- It looks to me like we are quite unclear about how we should get through this.
- I am sure we need to be on the same page for this JIRA, and it looks like I need more support from you all before we go ahead, because this would not be an easily revertible change.
- Branch-2.4 will be cut soon and we will go for Spark 3.0.0, if I am not mistaken.

I know there are many sensitive things going on here; however, please kindly consider this and give some input. I am sure we all feel that we should resolve this. Lastly, FWIW, I am doing this on my own as an individual, if it matters to anyone.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541191#comment-16541191 ] Felix Cheung commented on SPARK-20202: -- How likely is it that there will be a Hive release? Is HIVE-16391 still open? Staying with Hive 1.2 will slowly become a big problem for us within a few months...
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541034#comment-16541034 ] Hyukjin Kwon commented on SPARK-20202: -- I am asking this to set the goal for this JIRA given the current status and to make progress on it. I left some comments here because it looked to me like [~q79969786]'s attempt is a new fact to be considered. If publishing Hive is still preferred for any reason, I will help go with Saisai's patch in HIVE-16391. If keeping the fork and upgrading could be set as the goal for now, I will try to help go with Yuming's way and make a fix to the fork. Which one do you prefer?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540483#comment-16540483 ] Hyukjin Kwon commented on SPARK-20202: -- [~rxin], there was already an initial attempt above, which at least made the regression tests we have written so far pass. I talked with [~q79969786] before and she's willing to finish this. For this, I need more support from you and others to go this way. On the other hand, I get your point too. So, do you think we should not explicitly target it, since it's pretty difficult, and instead let Hive publish 1.2.x first rather than keep the fork, since it's unclear whether we can make it in 3.0.0?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540468#comment-16540468 ] Reynold Xin commented on SPARK-20202: - Yea you can try and see how difficult it is.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540467#comment-16540467 ] Hyukjin Kwon commented on SPARK-20202: -- I was thinking we would target it for 3.0.0 (otherwise 4.0.0 might make sense). It might indeed be a lot of work, but I believe this is the final goal and something we should do at some point anyway. Until then, I wanted to propose keeping the fork as a temporary solution, with the upgrade to Hive 2.3.x in 3.0.0 as the goal and target version. Branch-2.4 will be cut soon and I think we will go for 3.0.0 as the next release, if I am not mistaken.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540454#comment-16540454 ] Reynold Xin commented on SPARK-20202: - If you want to try and put together a PR that actually does it, that could work too. But note that it's a lot of work to upgrade execution Hive. Probably 10X more work than Hive publishing the exec jar.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539473#comment-16539473 ] Hyukjin Kwon commented on SPARK-20202: -- Hey [~owen.omalley] and [~rxin], frankly, I know there are many sensitive things here, for example the policy issues; however, this one needs some input from you before we proceed further.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531706#comment-16531706 ] Hyukjin Kwon commented on SPARK-20202: -- Kindly pinging [~owen.omalley] and [~rxin]. I would like to make further progress on this, since it has been blocked for a while and it's pretty important to sort this matter out.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527291#comment-16527291 ] Hyukjin Kwon commented on SPARK-20202: -- Would you please give this some thought when you are available?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521823#comment-16521823 ] Hyukjin Kwon commented on SPARK-20202: -- [~owen.omalley] and [~rxin], what do you think about the suggestion above? I tried my best to check all the other context, and the above was my current conclusion on how to get through this issue as smoothly and easily as possible.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521203#comment-16521203 ] Hyukjin Kwon commented on SPARK-20202: -- Hi all, what do you think about replacing it with Hive 2.3.x in the near future (e.g., Spark 3.0.0) given SPARK-23710, and keeping the fork for now? It looks like [~q79969786] completed the initial attempt at SPARK-23710, and it now sounds pretty feasible as an option, although some investigation still seems to be needed; I believe we can focus on getting through it if we have an explicit plan here. I think we are mostly all positive about this option as a final goal anyway, but I felt we need to make sure of this. If the above can be set as the goal for this JIRA to get rid of the fork completely, *I personally think* the Hive side can also focus on landing other fixes in the more recent versions without diverting effort to maintaining an old branch. Until then, I think we could consider keeping the fork for now and landing minor fixes if there are strong reasons for them. For example, Hadoop 3 support is blocked by a one-line fix in the fork. *I personally think* landing this fix in the fork is the easiest way. I believe this is pretty reasonable. What do you think?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501225#comment-16501225 ] Saisai Shao commented on SPARK-20202: - OK, for the first option, I've already started working on it locally. It looks like it is not a big change; a few POM changes are enough. I will submit a patch to the Hive community.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500560#comment-16500560 ] Steve Loughran commented on SPARK-20202: I think you could split things into two:

1. A modified Hive 1.2.1.x for Hadoop 3, with a new package name in the Maven builds (joy, profiles!), and some work with the Hive team to get this officially published by them. Strength: easy for people to backport into shipping 2.2, 2.3 builds just by changing the POM.
2. The bigger move to Hive 2. This will be the best for the future, but is bound to have more surprises. There's even the possibility that the Hive team might have to make some changes too, which isn't impossible if the timelines line up.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16499707#comment-16499707 ] Saisai Shao commented on SPARK-20202: - What is our plan to fix this issue? Are we going to use a new Hive version, or are we still sticking to 1.2? If we're still sticking to 1.2, [~ste...@apache.org] and I will take this issue and get the ball rolling in the Hive community.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498213#comment-16498213 ] Felix Cheung commented on SPARK-20202: -- Prefer newer Hive also
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386125#comment-16386125 ] Yuming Wang commented on SPARK-20202: - How about upgrading Hive directly to 2.3.2? In fact, I've completed the initial work and have been running it for a few days. [https://github.com/apache/spark/pull/20659]
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15966201#comment-15966201 ] holdenk commented on SPARK-20202: - Oh right, sorry I was misreading the intent of Affects Version/s.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965220#comment-15965220 ] Reynold Xin commented on SPARK-20202: - There is no currently targeted version, is there?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965218#comment-15965218 ] holdenk commented on SPARK-20202: - Would it possibly make sense to untarget this from the maintenance releases (1.6.X, 2.0.X, 2.1.X) and instead focus on the future versions?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962903#comment-15962903 ] Steve Loughran commented on SPARK-20202: One thing I do recall as trouble here was that Ivy resolution was different from Maven's, and fixing up all the transitive dependencies was a trouble spot. Patches here need to be tested against both SBT and Maven; since Jenkins only runs SBT, the Maven builds will have to be run manually. I don't remember which specific dependency was the problem.
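As a rough illustration of the Ivy-versus-Maven concern in the comment above, here is a minimal Python sketch that compares two lists of resolved dependencies and reports where they disagree. The `group:artifact:version` input format, the function names, and the sample coordinates are all assumptions made for this example; they are not the actual output of Maven or SBT, which would need to be converted into this shape first.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_coords(lines: Iterable[str]) -> Dict[str, str]:
    """Map 'group:artifact' to its version.

    Assumes each non-empty line looks like 'group:artifact:version'
    (a hypothetical, simplified format chosen for this sketch).
    """
    resolved: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group, artifact, version = line.split(":")
        resolved[f"{group}:{artifact}"] = version
    return resolved


def resolution_diff(
    maven_lines: Iterable[str], sbt_lines: Iterable[str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {coordinate: (maven_version, sbt_version)} for every mismatch.

    A version of None means that resolver did not pull in the artifact at all.
    """
    maven = parse_coords(maven_lines)
    sbt = parse_coords(sbt_lines)
    diff: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for key in sorted(set(maven) | set(sbt)):
        if maven.get(key) != sbt.get(key):
            diff[key] = (maven.get(key), sbt.get(key))
    return diff


# Hypothetical sample data, not taken from any real Spark build.
MAVEN_SAMPLE = ["org.apache.hive:hive-exec:1.2.1", "com.example:lib-a:1.0"]
SBT_SAMPLE = [
    "org.apache.hive:hive-exec:1.2.1",
    "com.example:lib-a:1.1",
    "com.example:lib-b:2.0",
]

if __name__ == "__main__":
    for coord, (mvn, sbt) in resolution_diff(MAVEN_SAMPLE, SBT_SAMPLE).items():
        print(f"{coord}: maven={mvn} sbt={sbt}")
```

A check like this only surfaces version or presence mismatches between the two builds; it says nothing about which resolver is right, so each reported coordinate would still need a manual look.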
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957843#comment-15957843 ] Reynold Xin commented on SPARK-20202: I've created a ticket on the Hive side to publish 1.2.x: https://issues.apache.org/jira/browse/HIVE-16391 Until that is resolved, I also wonder if there are other things we should do. For example, vote on the current fork to rectify it?
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957811#comment-15957811 ] Reynold Xin commented on SPARK-20202: Yes, this is really important. The proper way to do this is to publish a proper version of Hive with the right dependencies declared (rather than including all the dependencies in an uber jar). It looks like there is broad support for doing this. I'm going to create a JIRA ticket on the Hive side, and this ticket will depend on it.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957377#comment-15957377 ] Ryan Blue commented on SPARK-20202: +1 for a release of the Spark fork from the Hive community. While the reasons for the fork appear to be fixed in the latest version, there's a lot of work to do to get Spark on a newer Hive version. And for patch releases like 2.0.3 and 2.1.1, I don't think updating Hive is an option. I'm also all for getting master on a real Hive release. A release of alternate Hive binaries was inappropriate. I think that if a third-party organization had done the same, it would be entirely reasonable to treat it as a trademark violation and ask them to stop. http://www.apache.org/foundation/marks/faq/#products > Remove references to org.spark-project.hive > --- > > Key: SPARK-20202 > URL: https://issues.apache.org/jira/browse/SPARK-20202 > Project: Spark > Issue Type: Bug > Components: Build, SQL >Affects Versions: 1.6.4, 2.0.3, 2.1.1 >Reporter: Owen O'Malley >Priority: Blocker > > Spark can't continue to depend on their fork of Hive and must move to > standard Hive versions.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956879#comment-15956879 ] Steve Loughran commented on SPARK-20202:
# The ugly need to insert the Spark Thrift code under the Hive Thrift code is obsolete and can be cut entirely.
# With the shading of Kryo no longer needed, an unshaded Hive *may* work. I forget which trouble spots there were last time; probably the usual suspects: Jackson, Guava, etc.
# Hive 1.2.x refuses to work with Hadoop 3; it considers that an unsupported version. For basic client-side testing, you can build Hadoop 3 with a fake version (e.g. {{mvn install -DskipShade -Ddeclared.hadoop.version=2.11}}), but as the Hadoop version is something the NN/DNs care about, that is not really going to work in real systems. Presumably later Hive versions will address that.
If Hive takes over ownership of Spark's 1.2.1-spark branch, this could be done first simply by pulling the Spark branch into the Hive repo as a branch, defining the artifact naming properly, and releasing it. If that is done, there are a couple of outstanding PRs to pull in before any release of that 1.2.x branch (Groovy version for security reasons, ... ). A quick import and re-release would be the fast way to get this out as an ASF-approved binary.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955188#comment-15955188 ] Sean Owen commented on SPARK-20202: [~marmbrus] as release manager of the moment, I suggest we formally vote on releasing the org.spark-project.hive artifact, as I'm not sure we ever did so formally. That much seems like a must-have. I don't know that it requires re-releasing the artifacts, but at least having a meaningful review of what it is, and agreeing (or not) that it's what the PMC wants to release, would, I believe, resolve doubts about the legitimacy of that artifact. There's no doubt still more to do to get rid of the fork. This might include seeing if Hive 1.2.x can provide an un-uberized artifact.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953489#comment-15953489 ] Sean Owen commented on SPARK-20202: Alrighty, you can leave the status for now, but generally committers set Blocker. I'm not entirely sure this blocks a release, at least not yet. You're absolutely right, but the Hive fork, with binaries and source, is part of this project. At least, that's the idea. For example, it is notionally voted on and released with each Spark release, but the binary and source of this fork aren't separately and explicitly voted on and released. I think that should occur, for the avoidance of doubt that this is a blessed artifact of the Spark project. Would this answer your process and policy concerns about the release? It's not pretty, but I think that's within the law. Of course, it's no answer in the long term. The goal is to not have to use the fork at all. If Hive packaging changes are already in place to make it unnecessary, great (is that all there is to it, everyone?). I don't know if that presents a solution for earlier versions of Hive. This fork may persist in existing branches, but it has to at least be released and used in a proper way. This may need fixes right now.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953315#comment-15953315 ] Owen O'Malley commented on SPARK-20202: I should also say here that the Hive community is willing to help. We are in the process of rolling releases so if Spark needs a change, we can work together to get this done. > Remove references to org.spark-project.hive > --- > > Key: SPARK-20202 > URL: https://issues.apache.org/jira/browse/SPARK-20202 > Project: Spark > Issue Type: Bug > Components: Build, SQL >Affects Versions: 1.6.4, 2.0.3, 2.1.1 >Reporter: Owen O'Malley >Priority: Critical > > Spark can't continue to depend on their fork of Hive and must move to > standard Hive versions.
[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953298#comment-15953298 ] Owen O'Malley commented on SPARK-20202: Speaking as an Apache member: the Spark project can't release binary artifacts that aren't made from its Apache code base. So the Spark project either needs to use Hive's release artifacts, or it needs to formally fork Hive, move the fork into its git repository at Apache, and rename it from org.apache.hive to org.apache.spark. The current path is not allowed. Hive is in the middle of rolling releases, and thus this is a good time to make requests. The old uber jar (hive-exec) is already released separately with the classifier "core." It looks like we are using the same protobuf (2.5.0) and kryo (3.0.3) versions.