Re: Time to Remove Hive-on-Spark
+1 from my side too.

I have created a PR against the current branch. It still needs some work and as many reviews as possible, because it is quite big and I might have made some mistakes:

https://issues.apache.org/jira/browse/HIVE-26134
https://github.com/apache/hive/pull/3201

Thanks,
Peter

On Thu, 10 Feb 2022 at 17:43, Zoltan Haindrich wrote:

> Hey,
>
> I think there is no real interest in this feature; we don't have users or contributors backing it. The last development was around October 2018, and there have been only ~2 bugfix commits since then. We should stop carrying dead weight. Another 2 weeks have gone by since Stamatis reminded us that after 1.5 years(!) nothing has changed.
>
> +1 on removing it
>
> cheers,
> Zoltan
>
> You may inspect some of the recent changes with:
>
>     git log -c `find . -type f -path '**/spark/**'|grep -v xml|grep -v properties|grep -v q.out`
Re: Time to Remove Hive-on-Spark
Hey,

I think there is no real interest in this feature; we don't have users or contributors backing it. The last development was around October 2018, and there have been only ~2 bugfix commits since then. We should stop carrying dead weight. Another 2 weeks have gone by since Stamatis reminded us that after 1.5 years(!) nothing has changed.

+1 on removing it

cheers,
Zoltan

You may inspect some of the recent changes with:

    git log -c `find . -type f -path '**/spark/**'|grep -v xml|grep -v properties|grep -v q.out`

On 1/28/22 2:32 PM, Stamatis Zampetakis wrote:

> Hi team,
>
> Almost one year has passed since the last exchange in this discussion and, if I am not wrong, there has been no effort to revive Hive-on-Spark. To be more precise, I don't think I have seen any Spark-related JIRA for quite some time now and, although I don't want to rush into conclusions, there does not seem to be any community member involved in maintaining or adding new features in this part of the code.
>
> Keeping dead code in the repository does not do any good to the project and puts a non-negligible burden on future maintainers.
>
> Clearly, we cannot make a new Hive release where a major feature is completely untested, so either someone commits to re-enable/fix the respective tests soon or we move forward the work started by David and drop support for Hive-on-Spark.
>
> I would like to ask the community if there is anyone who can take up this maintenance task and enable/fix Spark-related tests in the next month or so?
>
> Best,
> Stamatis
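
A possible, slightly more defensive variant of the `git log` one-liner in Zoltan's message above, offered only as a sketch. The function names `list_spark_files` and `spark_history` are made up for illustration and are not part of Hive. The filters here match on file-name suffix (`.xml`, `.properties`, `.q.out`), which is an assumption and is slightly narrower than the original `grep -v` patterns, which drop any path containing those strings.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Hive.

# Print every regular file that lives under a directory named "spark",
# skipping XML files, properties files and .q.out test outputs.
# $1: directory to search (defaults to the current directory).
list_spark_files() {
    find "${1:-.}" -type f -path '*/spark/*' \
        ! -name '*.xml' ! -name '*.properties' ! -name '*.q.out'
}

# Show the commit history (with combined diffs for merges) of those files.
# The command substitution is left unquoted on purpose so that each path
# becomes a separate argument; "--" separates options from paths.
spark_history() {
    git log -c -- $(list_spark_files .)
}
```

Run `spark_history` from the root of a Hive checkout. Like the original command, this does not handle paths that contain whitespace.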
Re: Time to Remove Hive-on-Spark
Hi team,

Almost one year has passed since the last exchange in this discussion and, if I am not wrong, there has been no effort to revive Hive-on-Spark. To be more precise, I don't think I have seen any Spark-related JIRA for quite some time now and, although I don't want to rush into conclusions, there does not seem to be any community member involved in maintaining or adding new features in this part of the code.

Keeping dead code in the repository does not do any good to the project and puts a non-negligible burden on future maintainers.

Clearly, we cannot make a new Hive release where a major feature is completely untested, so either someone commits to re-enable/fix the respective tests soon or we move forward the work started by David and drop support for Hive-on-Spark.

I would like to ask the community if there is anyone who can take up this maintenance task and enable/fix Spark-related tests in the next month or so?

Best,
Stamatis

On Sat, Feb 27, 2021 at 4:17 AM Edward Capriolo wrote:

> I do not know how it works for most of the world. But in Cloudera, where the Tez options were never popular, hive-on-spark represents a solid way to get things done for small datasets at lower latency.
>
> As for the Spark adoption: you know, a while ago I came up with some ways to make Hive more Spark-like. One of them was that I found a way to make "compile" a Hive keyword so folks could build UDFs on the fly. It was such an uphill climb. Folks found a way to make it disabled by default for security. Then later, when things moved from CLI to beeline, it was like the ONLY thing that I found not ported. It was extremely frustrating.
Re: Time to Remove Hive-on-Spark
I do not know how it works for most of the world. But in Cloudera, where the Tez options were never popular, hive-on-spark represents a solid way to get things done for small datasets at lower latency.

As for the Spark adoption: you know, a while ago I came up with some ways to make Hive more Spark-like. One of them was that I found a way to make "compile" a Hive keyword so folks could build UDFs on the fly. It was such an uphill climb. Folks found a way to make it disabled by default for security. Then later, when things moved from CLI to beeline, it was like the ONLY thing that I found not ported. It was extremely frustrating.

On Mon, Jul 27, 2020 at 3:19 PM David wrote:

> Hello Xuefu,
>
> I am not part of the Cloudera Hive product team, though I volunteer to work on small projects from time to time. Perhaps someone from that team can chime in with some of their thoughts, but personally, I think that in the long run, there will be more of a merge between Hive-on-Spark and other Spark-native offerings. I'm not sure what the differentiation will be going forward. With that said, are there any developers on this mailing list who are willing to take on the maintenance effort of keeping HoS moving forward?
>
> http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/
> https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html
>
> Thanks.
Re: Time to Remove Hive-on-Spark
Hello Xuefu,

I am not part of the Cloudera Hive product team, though I volunteer to work on small projects from time to time. Perhaps someone from that team can chime in with some of their thoughts, but personally, I think that in the long run, there will be more of a merge between Hive-on-Spark and other Spark-native offerings. I'm not sure what the differentiation will be going forward. With that said, are there any developers on this mailing list who are willing to take on the maintenance effort of keeping HoS moving forward?

http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html

Thanks.

On Thu, Jul 23, 2020 at 12:35 PM Xuefu Zhang wrote:

> Previous reasoning seemed to suggest a lack of user adoption. Now we are concerned about ongoing maintenance effort. Both are valid considerations. However, I think we should have ways to find out the answers. Therefore, I suggest the following be carried out:
>
> 1. Send out the proposal (removing Hive on Spark) to users including u...@hive.apache.org and get their feedback.
> 2. Ask if any developers on this mailing list are willing to take on the maintenance effort.
>
> I'm concerned about user impact because I can still see issues being reported on HoS from time to time. I'm more concerned about the future of Hive if we narrow Hive's neutrality on execution engines, which will possibly force more Hive users to migrate to other alternatives such as Spark SQL, which is already eroding Hive's user base.
>
> Being open and neutral used to be Hive's most admired strengths.
>
> Thanks,
> Xuefu
Re: Time to Remove Hive-on-Spark
Previous reasoning seemed to suggest a lack of user adoption. Now we are concerned about ongoing maintenance effort. Both are valid considerations. However, I think we should have ways to find out the answers. Therefore, I suggest the following be carried out:

1. Send out the proposal (removing Hive on Spark) to users including u...@hive.apache.org and get their feedback.
2. Ask if any developers on this mailing list are willing to take on the maintenance effort.

I'm concerned about user impact because I can still see issues being reported on HoS from time to time. I'm more concerned about the future of Hive if we narrow Hive's neutrality on execution engines, which will possibly force more Hive users to migrate to other alternatives such as Spark SQL, which is already eroding Hive's user base.

Being open and neutral used to be Hive's most admired strengths.

Thanks,
Xuefu

On Wed, Jul 22, 2020 at 8:46 AM Alan Gates wrote:

> An important point here is I don't believe David is proposing to remove Hive on Spark from the 2 or 3 lines, but only from trunk. Continuing to support it in existing 2 and 3 lines makes sense, but since no one has maintained it on trunk for some time and it does not work with many of the newer features it should be removed from trunk.
>
> Alan.
Re: Time to Remove Hive-on-Spark
An important point here is I don't believe David is proposing to remove Hive on Spark from the 2 or 3 lines, but only from trunk. Continuing to support it in existing 2 and 3 lines makes sense, but since no one has maintained it on trunk for some time and it does not work with many of the newer features it should be removed from trunk.

Alan.

On Tue, Jul 21, 2020 at 4:10 PM Chao Sun wrote:

> Thanks David. FWIW Uber is still running Hive on Spark (2.3.4) on a very large scale in production right now and I don't think we have any plan to change it soon.
>
> On Tue, Jul 21, 2020 at 11:28 AM David wrote:
>
> > Hello,
> >
> > Thanks for the feedback.
> >
> > Just a quick recap: I did propose this @dev and I received unanimous +1's from the community. After a couple months, I created the PR.
> >
> > Certainly open to discussion, but there hasn't been any discussion thus far because there have been no objections until this point.
> >
> > HoS has low adoption, heavy technical debt, and the manner in which its build process is set up is impeding some other work that is not even related to HoS.
> >
> > We can deprecate in Hive 3.x and remove in Hive 4.x. The plan would be to use Tez moving forward.
> >
> > My point about the vendor's move to Tez is that HoS adoption is very low, it's only going lower, and while I don't know the specifics of it, there must be some migration plan in place there (i.e., it must be possible to do it already).
> >
> > Thanks,
> > David
> >
> > On Tue, Jul 21, 2020 at 12:23 PM Xuefu Zhang wrote:
> >
> > > Hi David,
> > >
> > > While a vendor may not support a component in an open source project, removing it or not is a decision by and for the community. I certainly understand that the vendor you mentioned has contributed a great deal (including my personal effort while working there), but it's not up to the vendor to make a call like what is proposed here.
> > >
> > > As a community, we should have gone through a thorough discussion and reached a consensus before actually making such a big change, in my opinion.
> > >
> > > Thanks,
> > > Xuefu
> > >
> > > On Tue, Jul 21, 2020 at 8:49 AM David wrote:
> > >
> > > > Hey,
> > > >
> > > > Thanks for the input.
> > > >
> > > > FYI. Cloudera (Cloudera + Hortonworks) has removed HoS from their latest offering.
> > > >
> > > > "Tez is now the only supported execution engine, existing queries that change execution mode to Spark or MapReduce within a session, for example, fail."
> > > >
> > > > https://docs.cloudera.com/cdp/latest/upgrade-post/topics/ug_hive_configuration_changes.html
> > > >
> > > > So I don't know who will be supporting this feature moving forward, but there has been a lot of work done to make this change as painless as possible. Simply setting the engine to 'tez' and removing the HoS-related settings should address many use cases.
> > > >
> > > > Thanks.
> > > >
> > > > On Tue, Jul 21, 2020 at 11:36 AM Xuefu Z wrote:
> > > >
> > > > > Sorry for chiming in late. However, I don't think we should remove Hive on Spark just because of a technical problem. This is rather a big decision that we need to be careful about. There are users that will be left high and dry by this move.
> > > > >
> > > > > If the community decides to desupport and eventually remove it, I think we need to have a due process. We also need a deprecation plan if that's what we decide to do. Before that, I'm -1 on this proposal.
> > > > >
> > > > > Thanks,
> > > > > Xuefu
> > > > >
> > > > > On Tue, Jul 21, 2020 at 7:57 AM David wrote:
> > > > >
> > > > > > Hello Team,
> > > > > >
> > > > > > https://github.com/apache/hive/pull/1285
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > On Wed, Jun 3, 2020 at 11:49 PM Gopal V wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Gopal
> > > > > > >
> > > > > > > On 6/3/20 7:48 PM, Jesus Camacho Rodriguez wrote:
> > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > -Jesús
> > > > > > > >
> > > > > > > > On Wed, Jun 3, 2020 at 1:58 PM Alan Gates <alanfga...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > +1.
> > > > > > > > >
> > > > > > > > > Alan.
> > > > > > > > >
> > > > > > > > > On Wed, Jun 3, 2020 at 1:40 PM Prasanth Jayachandran wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > On Jun 3, 2020, at 1:38 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > +1
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jun 3, 2020 at 1:23 PM David Mollitor <dam6...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hello Gang,
Re: Time to Remove Hive-on-Spark
Thanks David. FWIW Uber is still running Hive on Spark (2.3.4) on a very large scale in production right now and I don't think we have any plan to change it soon.

On Tue, Jul 21, 2020 at 11:28 AM David wrote:
> Hello,
>
> Thanks for the feedback.
>
> Just a quick recap: I did propose this @dev and I received unanimous +1's
> from the community. After a couple months, I created the PR.
>
> Certainly open to discussion, but there hasn't been any discussion thus far
> because there have been no objections until this point.
>
> HoS has low adoption, heavy technical debt, and the manner in which its
> build process is set up is impeding some other work that is not even related
> to HoS.
>
> We can deprecate in Hive 3.x and remove in Hive 4.x. The plan would be to
> use Tez moving forward.
>
> My point about the vendor's move to Tez is that HoS adoption is very low,
> it's only going lower, and while I don't know the specifics of it, there
> must be some migration plan in place there (i.e., it must be possible to do
> it already).
>
> Thanks,
> David
Re: Time to Remove Hive-on-Spark
Hello,

Thanks for the feedback.

Just a quick recap: I did propose this @dev and I received unanimous +1's from the community. After a couple months, I created the PR.

Certainly open to discussion, but there hasn't been any discussion thus far because there have been no objections until this point.

HoS has low adoption, heavy technical debt, and the manner in which its build process is set up is impeding some other work that is not even related to HoS.

We can deprecate in Hive 3.x and remove in Hive 4.x. The plan would be to use Tez moving forward.

My point about the vendor's move to Tez is that HoS adoption is very low, it's only going lower, and while I don't know the specifics of it, there must be some migration plan in place there (i.e., it must be possible to do it already).

Thanks,
David

On Tue, Jul 21, 2020 at 12:23 PM Xuefu Zhang wrote:
> Hi David,
>
> While a vendor may not support a component in an open source project,
> removing it or not is a decision by and for the community. I certainly
> understand that the vendor you mentioned has contributed a great deal
> (including my personal effort while working there), but it's not up to the
> vendor to make a call like what is proposed here.
>
> As a community, we should have gone through a thorough discussion and
> reached a consensus before actually making such a big change, in my
> opinion.
>
> Thanks,
> Xuefu
Re: Time to Remove Hive-on-Spark
Hi David,

While a vendor may not support a component in an open source project, removing it or not is a decision by and for the community. I certainly understand that the vendor you mentioned has contributed a great deal (including my personal effort while working there), but it's not up to the vendor to make a call like what is proposed here.

As a community, we should have gone through a thorough discussion and reached a consensus before actually making such a big change, in my opinion.

Thanks,
Xuefu

On Tue, Jul 21, 2020 at 8:49 AM David wrote:
> Hey,
>
> Thanks for the input.
>
> FYI. Cloudera (Cloudera + Hortonworks) have removed HoS from their latest
> offering.
>
> "Tez is now the only supported execution engine, existing queries that
> change execution mode to Spark or MapReduce within a session, for example,
> fail."
>
> https://docs.cloudera.com/cdp/latest/upgrade-post/topics/ug_hive_configuration_changes.html
>
> So I don't know who will be supporting this feature moving forward, but
> there has been a lot of work done to make this change as painless as
> possible. Simply setting the engine to 'tez' and removing the HoS-related
> settings should address many use cases.
>
> Thanks.
Re: Time to Remove Hive-on-Spark
Hey,

Thanks for the input.

FYI. Cloudera (Cloudera + Hortonworks) have removed HoS from their latest offering.

"Tez is now the only supported execution engine, existing queries that change execution mode to Spark or MapReduce within a session, for example, fail."

https://docs.cloudera.com/cdp/latest/upgrade-post/topics/ug_hive_configuration_changes.html

So I don't know who will be supporting this feature moving forward, but there has been a lot of work done to make this change as painless as possible. Simply setting the engine to 'tez' and removing the HoS-related settings should address many use cases.

Thanks.

On Tue, Jul 21, 2020 at 11:36 AM Xuefu Z wrote:
> Sorry for chiming in late. However, I don't think we should remove Hive on
> Spark just because of a technical problem. This is rather a big decision
> that we need to be careful about. There are users that will be left high
> and dry by this move.
>
> If the community decides to desupport and eventually remove it, I think we
> need to have a due process. We also need a deprecation plan if that's what
> we decide to do. Before that, I'm -1 on this proposal.
>
> Thanks,
> Xuefu
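As a rough illustration of the migration suggested in the message above (set the engine to 'tez' and remove the HoS-related settings), here is a minimal Python sketch. `hive.execution.engine` is Hive's engine-selection property; the helper name `migrate_to_tez` and the key prefixes treated as HoS-related are assumptions made for this example, not an official inventory of Hive-on-Spark settings.

```python
# Hypothetical helper for illustration only. The prefix list is an assumption,
# not an official list of Hive-on-Spark configuration keys.
HOS_PREFIXES = ("spark.", "hive.spark.")


def migrate_to_tez(settings):
    """Return a copy of `settings` that selects Tez and drops HoS-related keys."""
    migrated = {
        key: value
        for key, value in settings.items()
        if not key.startswith(HOS_PREFIXES)  # str.startswith accepts a tuple
    }
    migrated["hive.execution.engine"] = "tez"
    return migrated


if __name__ == "__main__":
    # Example session settings (made up for this sketch).
    before = {
        "hive.execution.engine": "spark",
        "spark.master": "yarn",
        "spark.executor.memory": "4g",
        "hive.spark.client.connect.timeout": "1000ms",
        "hive.exec.parallel": "true",
    }
    # Print the surviving settings as HiveQL SET statements.
    for key, value in sorted(migrate_to_tez(before).items()):
        print(f"SET {key}={value};")
```

Whether this covers a given deployment depends on which settings it actually uses, which is presumably why the message says "many use cases" rather than all of them.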
Re: Time to Remove Hive-on-Spark
Sorry for chiming in late. However, I don't think we should remove Hive on Spark just because of a technical problem. This is rather a big decision that we need to be careful about. There are users that will be left high and dry by this move.

If the community decides to desupport and eventually remove it, I think we need to have a due process. We also need a deprecation plan if that's what we decide to do. Before that, I'm -1 on this proposal.

Thanks,
Xuefu

On Tue, Jul 21, 2020 at 7:57 AM David wrote:
> Hello Team,
>
> https://github.com/apache/hive/pull/1285
>
> Thanks.

--
Xuefu Zhang

"In Honey We Trust!"
Re: Time to Remove Hive-on-Spark
Hello Team,

https://github.com/apache/hive/pull/1285

Thanks.

On Wed, Jun 3, 2020 at 11:49 PM Gopal V wrote:
> +1
>
> Cheers,
> Gopal
Re: Time to Remove Hive-on-Spark
+1

Cheers,
Gopal

On 6/3/20 7:48 PM, Jesus Camacho Rodriguez wrote:
> +1
>
> -Jesús
Re: Time to Remove Hive-on-Spark
+1

-Jesús

On Wed, Jun 3, 2020 at 1:58 PM Alan Gates wrote:
> +1.
>
> Alan.
Re: Time to Remove Hive-on-Spark
+1.

Alan.

On Wed, Jun 3, 2020 at 1:40 PM Prasanth Jayachandran wrote:
> +1
Re: Time to Remove Hive-on-Spark
+1

> On Jun 3, 2020, at 1:38 PM, Ashutosh Chauhan wrote:
>
> +1
Re: Time to Remove Hive-on-Spark
+1

On Wed, Jun 3, 2020 at 1:23 PM David Mollitor wrote:
> Hello Gang,
>
> I have spent some time working on upgrading Avro (far less than others):
>
> https://issues.apache.org/jira/browse/HIVE-21737
>
> This should be a relatively easy thing to do, but is blocked by
> Hive-on-Spark. HoS has a weird thing where it downloads some
> cloud-storage-hosted file of Spark-Hadoop as part of its maven run.
>
> Since HoS is not going to receive updates from the major vendors, is it
> time to simply remove it?
>
> Tests are currently disabled:
> https://issues.apache.org/jira/browse/HIVE-23137
>
> Thanks.
Time to Remove Hive-on-Spark
Hello Gang,

I have spent some time working on upgrading Avro (far less than others):

https://issues.apache.org/jira/browse/HIVE-21737

This should be a relatively easy thing to do, but is blocked by Hive-on-Spark. HoS has a weird thing where it downloads some cloud-storage-hosted file of Spark-Hadoop as part of its maven run.

Since HoS is not going to receive updates from the major vendors, is it time to simply remove it?

Tests are currently disabled:
https://issues.apache.org/jira/browse/HIVE-23137

Thanks.