I do not know how it works for most of the world, but at Cloudera, where the Tez options were never popular, Hive-on-Spark represents a solid way to get things done for small datasets with lower latency.
As for Spark adoption: a while ago I came up with some ways to make Hive more Spark-like. One of them was making "compile" a Hive keyword so folks could build UDFs on the fly. It was such an uphill climb. Folks found a way to have it disabled by default for security. Then later, when things moved from the CLI to Beeline, it was the ONLY thing I found that was not ported. It was extremely frustrating.

On Mon, Jul 27, 2020 at 3:19 PM David <dam6...@gmail.com> wrote:

> Hello Xuefu,
>
> I am not part of the Cloudera Hive product team, though I volunteer to work on small projects from time to time. Perhaps someone from that team can chime in with some of their thoughts, but personally, I think that in the long run, there will be more of a merge between Hive-on-Spark and other Spark-native offerings. I'm not sure what the differentiation will be going forward. With that said, are there any developers on this mailing list who are willing to take on the maintenance effort of keeping HoS moving forward?
>
> http://www.russellspitzer.com/2017/05/19/Spark-Sql-Thriftserver/
>
> https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/config-sts.html
>
> Thanks.
>
> On Thu, Jul 23, 2020 at 12:35 PM Xuefu Zhang <xu...@apache.org> wrote:
>
> > Previous reasoning seemed to suggest a lack of user adoption. Now we are concerned about ongoing maintenance effort. Both are valid considerations. However, I think we should have ways to find out the answers. Therefore, I suggest the following be carried out:
> >
> > 1. Send out the proposal (removing Hive on Spark) to users including u...@hive.apache.org and get their feedback.
> > 2. Ask if any developers on this mailing list are willing to take on the maintenance effort.
> >
> > I'm concerned about user impact because I can still see issues being reported on HoS from time to time.
> > I'm more concerned about the future of Hive if we narrow Hive neutrality on execution engines, which will possibly force more Hive users to migrate to other alternatives such as Spark SQL, which is already eroding Hive's user base.
> >
> > Being open and neutral used to be Hive's most admired strengths.
> >
> > Thanks,
> > Xuefu
> >
> > On Wed, Jul 22, 2020 at 8:46 AM Alan Gates <alanfga...@gmail.com> wrote:
> >
> > > An important point here is I don't believe David is proposing to remove Hive on Spark from the 2 or 3 lines, but only from trunk. Continuing to support it in existing 2 and 3 lines makes sense, but since no one has maintained it on trunk for some time and it does not work with many of the newer features it should be removed from trunk.
> > >
> > > Alan.
> > >
> > > On Tue, Jul 21, 2020 at 4:10 PM Chao Sun <sunc...@apache.org> wrote:
> > >
> > > > Thanks David. FWIW Uber is still running Hive on Spark (2.3.4) on a very large scale in production right now and I don't think we have any plan to change it soon.
> > > >
> > > > On Tue, Jul 21, 2020 at 11:28 AM David <dam6...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > Thanks for the feedback.
> > > > >
> > > > > Just a quick recap: I did propose this @dev and I received unanimous +1's from the community. After a couple months, I created the PR.
> > > > >
> > > > > Certainly open to discussion, but there hasn't been any discussion thus far because there have been no objections until this point.
> > > > >
> > > > > HoS has low adoption, heavy technical debt, and the manner in which its build process is setup is impeding some other work that is not even related to HoS.
> > > > >
> > > > > We can deprecate in Hive 3.x and remove in Hive 4.x. The plan would be to use Tez moving forward.
> > > > >
> > > > > My point about the vendor's move to Tez is that HoS adoption is very low, it's only going lower, and while I don't know the specifics of it, there must be some migration plan in place there (i.e., it must be possible to do it already).
> > > > >
> > > > > Thanks,
> > > > > David
> > > > >
> > > > > On Tue, Jul 21, 2020 at 12:23 PM Xuefu Zhang <xu...@apache.org> wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > While a vendor may not support a component in an open source project, removing it or not is a decision by and for the community. I certainly understand that the vendor you mentioned has contributed a great deal (including my personal effort while working there), it's not up to the vendor to make a call like what is proposed here.
> > > > > >
> > > > > > As a community, we should have gone through a thorough discussion and reached a consensus before actually making such a big change, in my opinion.
> > > > > >
> > > > > > Thanks,
> > > > > > Xuefu
> > > > > >
> > > > > > On Tue, Jul 21, 2020 at 8:49 AM David <dam6...@gmail.com> wrote:
> > > > > >
> > > > > > > Hey,
> > > > > > >
> > > > > > > Thanks for the input.
> > > > > > >
> > > > > > > FYI. Cloudera (Cloudera + Hortonworks) have removed HoS from their latest offering.
> > > > > > >
> > > > > > > "Tez is now the only supported execution engine, existing queries that change execution mode to Spark or MapReduce within a session, for example, fail."
> > > > > > >
> > > > > > > https://docs.cloudera.com/cdp/latest/upgrade-post/topics/ug_hive_configuration_changes.html
> > > > > > >
> > > > > > > So I don't know who will be supporting this feature moving forward, but there has been a lot of work done to make this change as painless as possible. Simply set the engine to 'tez' and remove the HoS-related settings should address many use cases.
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > > On Tue, Jul 21, 2020 at 11:36 AM Xuefu Z <usxu...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Sorry for chiming in late. However, I don't think we should remove Hive on Spark just because of a technical problem. This is rather a big decision that we need to be careful about. There are users that will be left high and dry by this move.
> > > > > > > >
> > > > > > > > If the community decides to desupport and eventually remove it, I think we need to have a due process. We also need a deprecation plan if that's we decide to do. Before that, I'm -1 on this proposal.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Xuefu
> > > > > > > >
> > > > > > > > On Tue, Jul 21, 2020 at 7:57 AM David <dam6...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hello Team,
> > > > > > > > >
> > > > > > > > > https://github.com/apache/hive/pull/1285
> > > > > > > > >
> > > > > > > > > Thanks.
> > > > > > > > >
> > > > > > > > > On Wed, Jun 3, 2020 at 11:49 PM Gopal V <gop...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Gopal
> > > > > > > > > >
> > > > > > > > > > On 6/3/20 7:48 PM, Jesus Camacho Rodriguez wrote:
> > > > > > > > > > > +1
> > > > > > > > > > >
> > > > > > > > > > > -Jesús
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jun 3, 2020 at 1:58 PM Alan Gates <alanfga...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > >> +1.
> > > > > > > > > > >>
> > > > > > > > > > >> Alan.
> > > > > > > > > > >>
> > > > > > > > > > >> On Wed, Jun 3, 2020 at 1:40 PM Prasanth Jayachandran <pjayachand...@cloudera.com.invalid> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >>> +1
> > > > > > > > > > >>>
> > > > > > > > > > >>>> On Jun 3, 2020, at 1:38 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> > > > > > > > > > >>>>
> > > > > > > > > > >>>> +1
> > > > > > > > > > >>>>
> > > > > > > > > > >>>> On Wed, Jun 3, 2020 at 1:23 PM David Mollitor <dam6...@gmail.com> wrote:
> > > > > > > > > > >>>>
> > > > > > > > > > >>>>> Hello Gang,
> > > > > > > > > > >>>>>
> > > > > > > > > > >>>>> I have spent some time working on upgrading Avro (far less than others):
> > > > > > > > > > >>>>>
> > > > > > > > > > >>>>> https://issues.apache.org/jira/browse/HIVE-21737
> > > > > > > > > > >>>>>
> > > > > > > > > > >>>>> This should be a relatively easy thing to do, but is blocked by Hive-on-Spark. HoS has a weird thing where it downloads some cloud-storage-hosted file of Spark-Hadoop as part of its maven run.
> > > > > > > > > > >>>>>
> > > > > > > > > > >>>>> Since HoS is not going to receive updates from the major vendors, is it time to simply remove it?
> > > > > > > > > > >>>>>
> > > > > > > > > > >>>>> Tests are currently disabled:
> > > > > > > > > > >>>>> https://issues.apache.org/jira/browse/HIVE-23137
> > > > > > > > > > >>>>>
> > > > > > > > > > >>>>> Thanks.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Xuefu Zhang
> > > > > > > >
> > > > > > > > "In Honey We Trust!"