Re: [DISCUSS] Nightly snaphot builds
I created https://issues.apache.org/jira/browse/HIVE-27371 to have nightly builds for branch-3. Once that is merged, I think we can have scheduled builds for branch-3 as well. However, I don't have permission to create a new job for branch-3. Does anyone know how to do it?

Thanks,
Vihang

On Wed, May 24, 2023 at 10:07 AM vihang karajgaonkar wrote:
> The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can we have this for branch-3 as well, since we have been backporting a lot of PRs to branch-3 lately?
>
> Thanks,
> Vihang
>
> On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich wrote:
>> Hey,
>>
>>> We already have nightly builds for Hive [1].
>>> [1] http://ci.hive.apache.org/job/hive-nightly/
>>
>> ...and hive-dev-box can launch such archives; either by using it like this:
>> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
>>
>> or with a somewhat longer command you could launch hdb in bazaar mode; and have an HS2 running with a nightly version:
>>
>>     docker run --rm -d -p 1:1 -v hive-dev-box_work:/work \
>>       -e HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz \
>>       --name hive kgyrtkirk/hive-dev-box:bazaar
>>
>> cheers,
>> Zoltan
>>
>> On 5/24/23 09:15, Stamatis Zampetakis wrote:
>>> Hey all,
>>>
>>> We already have nightly builds for Hive [1].
>>>
>>> Do we need something more than that?
>>>
>>> Best,
>>> Stamatis
>>>
>>> [1] http://ci.hive.apache.org/job/hive-nightly/
>>>
>>> On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <vihan...@apache.org> wrote:
>>>> I think there are many benefits, as others in this thread suggested, which can be built on top of nightly builds. Having Docker images is great but for now I think we can start simple and publish the jars. Many users still just deploy using jars and it would be useful to them. Once we have a Docker environment we can add a Docker image too to the nightly builds so that users can choose their preferred way.
>>>>
>>>> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park wrote:
>>>>> I think such nightly builds will be useful for testing and debugging in the future.
>>>>>
>>>>> I also wonder if we can somehow create builds even from previous commits (e.g., for the past few years). Such builds from previous commits don't have to be daily builds, and I think weekly builds (or even monthly builds) would also be very useful.
>>>>>
>>>>> The reason I wish such builds were available is to facilitate debugging and testing. When tested against the TPC-DS benchmark, the current master branch has several correctness problems that were introduced after the release of Hive 3.1.2. We have reported all problems known to us in [1] and also submitted several patches. If such nightly builds had been available, we would have saved quite a bit of time in implementing the patches by quickly finding offending commits that introduced new correctness bugs.
>>>>>
>>>>> In addition, you can find quite a few commits in the master branch that report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990, HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114, HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777, HIVE-25170, HIVE-25864, HIVE-26671. (There may be some errors in this list because we compared against Hive 3.1.2 with many patches backported.) Such nightly builds can be useful for finding root causes of such bugs.
>>>>>
>>>>> Ideally I wish there was an automated procedure to create nightly builds, run the TPC-DS benchmark, and report correctness/performance results, although this would be quite hard to implement. (I remember Spark implemented this procedure in the era of Spark 2, but my memory could be wrong.)
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/HIVE-26654
>>>>>
>>>>> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena wrote:
>>>>>> Hi Vihang,
>>>>>> +1, We were even exploring publishing the Docker images of the snapshot version as well per commit or maybe weekly, so just shoot 2 docker commands and you get a Hive cluster running with master code.
>>>>>>
>>>>>> Sai, I think to spin up an env via Docker with all these things should be doable for sure, but would require someone with real good expertise with Docker as well as setting up these services with Hive. Obviously, I am not that guy :-)
>>>>>>
>>>>>> @Simhadri has a PR which publishes Docker images once a release tag is pushed; you can explore doing something similar for the snapshot version, if that sounds cool
Re: [DISCUSS] Nightly snaphot builds
The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can we have this for branch-3 as well, since we have been backporting a lot of PRs to branch-3 lately?

Thanks,
Vihang

On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich wrote:
> Hey,
>
>> We already have nightly builds for Hive [1].
>> [1] http://ci.hive.apache.org/job/hive-nightly/
>
> ...and hive-dev-box can launch such archives; either by using it like this:
> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
>
> or with a somewhat longer command you could launch hdb in bazaar mode; and have an HS2 running with a nightly version:
>
>     docker run --rm -d -p 1:1 -v hive-dev-box_work:/work \
>       -e HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz \
>       --name hive kgyrtkirk/hive-dev-box:bazaar
>
> cheers,
> Zoltan
>
> On 5/24/23 09:15, Stamatis Zampetakis wrote:
>> Hey all,
>>
>> We already have nightly builds for Hive [1].
>>
>> Do we need something more than that?
>>
>> Best,
>> Stamatis
>>
>> [1] http://ci.hive.apache.org/job/hive-nightly/
>>
>> On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar wrote:
>>> I think there are many benefits, as others in this thread suggested, which can be built on top of nightly builds. Having Docker images is great but for now I think we can start simple and publish the jars. Many users still just deploy using jars and it would be useful to them. Once we have a Docker environment we can add a Docker image too to the nightly builds so that users can choose their preferred way.
>>>
>>> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park wrote:
>>>> I think such nightly builds will be useful for testing and debugging in the future.
>>>>
>>>> I also wonder if we can somehow create builds even from previous commits (e.g., for the past few years). Such builds from previous commits don't have to be daily builds, and I think weekly builds (or even monthly builds) would also be very useful.
>>>>
>>>> The reason I wish such builds were available is to facilitate debugging and testing. When tested against the TPC-DS benchmark, the current master branch has several correctness problems that were introduced after the release of Hive 3.1.2. We have reported all problems known to us in [1] and also submitted several patches. If such nightly builds had been available, we would have saved quite a bit of time in implementing the patches by quickly finding offending commits that introduced new correctness bugs.
>>>>
>>>> In addition, you can find quite a few commits in the master branch that report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990, HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114, HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777, HIVE-25170, HIVE-25864, HIVE-26671. (There may be some errors in this list because we compared against Hive 3.1.2 with many patches backported.) Such nightly builds can be useful for finding root causes of such bugs.
>>>>
>>>> Ideally I wish there was an automated procedure to create nightly builds, run the TPC-DS benchmark, and report correctness/performance results, although this would be quite hard to implement. (I remember Spark implemented this procedure in the era of Spark 2, but my memory could be wrong.)
>>>>
>>>> [1] https://issues.apache.org/jira/browse/HIVE-26654
>>>>
>>>> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena wrote:
>>>>> Hi Vihang,
>>>>> +1, We were even exploring publishing the Docker images of the snapshot version as well per commit or maybe weekly, so just shoot 2 docker commands and you get a Hive cluster running with master code.
>>>>>
>>>>> Sai, I think to spin up an env via Docker with all these things should be doable for sure, but would require someone with real good expertise with Docker as well as setting up these services with Hive. Obviously, I am not that guy :-)
>>>>>
>>>>> @Simhadri has a PR which publishes Docker images once a release tag is pushed; you can explore doing something similar for the snapshot version, if that sounds cool
>>>>>
>>>>> -Ayush
>>>>>
>>>>> On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala wrote:
>>>>>> Hi Vihang,
>>>>>>
>>>>>> +1 on the idea.
>>>>>>
>>>>>> This is a great idea to quickly test if a certain feature is working as expected on a certain branch. This way we test data loss, correctness, or any other unexpected scenarios that are Hive-specific only. However, I'm wondering if it is possible to deploy/test in a kerberized environment or issues involving
Re: [DISCUSS] Nightly snaphot builds
Hey,

> We already have nightly builds for Hive [1].
> [1] http://ci.hive.apache.org/job/hive-nightly/

...and hive-dev-box can launch such archives; either by using it like this:
https://www.mail-archive.com/dev@hive.apache.org/msg142420.html

or with a somewhat longer command you could launch hdb in bazaar mode; and have an HS2 running with a nightly version:

    docker run --rm -d -p 1:1 -v hive-dev-box_work:/work \
      -e HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz \
      --name hive kgyrtkirk/hive-dev-box:bazaar

cheers,
Zoltan

On 5/24/23 09:15, Stamatis Zampetakis wrote:
> Hey all,
>
> We already have nightly builds for Hive [1].
>
> Do we need something more than that?
>
> Best,
> Stamatis
>
> [1] http://ci.hive.apache.org/job/hive-nightly/
>
> On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar wrote:
>> I think there are many benefits, as others in this thread suggested, which can be built on top of nightly builds. Having Docker images is great but for now I think we can start simple and publish the jars. Many users still just deploy using jars and it would be useful to them. Once we have a Docker environment we can add a Docker image too to the nightly builds so that users can choose their preferred way.
>>
>> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park wrote:
>>> I think such nightly builds will be useful for testing and debugging in the future.
>>>
>>> I also wonder if we can somehow create builds even from previous commits (e.g., for the past few years). Such builds from previous commits don't have to be daily builds, and I think weekly builds (or even monthly builds) would also be very useful.
>>>
>>> The reason I wish such builds were available is to facilitate debugging and testing. When tested against the TPC-DS benchmark, the current master branch has several correctness problems that were introduced after the release of Hive 3.1.2. We have reported all problems known to us in [1] and also submitted several patches. If such nightly builds had been available, we would have saved quite a bit of time in implementing the patches by quickly finding offending commits that introduced new correctness bugs.
>>>
>>> In addition, you can find quite a few commits in the master branch that report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990, HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114, HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777, HIVE-25170, HIVE-25864, HIVE-26671. (There may be some errors in this list because we compared against Hive 3.1.2 with many patches backported.) Such nightly builds can be useful for finding root causes of such bugs.
>>>
>>> Ideally I wish there was an automated procedure to create nightly builds, run the TPC-DS benchmark, and report correctness/performance results, although this would be quite hard to implement. (I remember Spark implemented this procedure in the era of Spark 2, but my memory could be wrong.)
>>>
>>> [1] https://issues.apache.org/jira/browse/HIVE-26654
>>>
>>> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena wrote:
>>>> Hi Vihang,
>>>> +1, We were even exploring publishing the Docker images of the snapshot version as well per commit or maybe weekly, so just shoot 2 docker commands and you get a Hive cluster running with master code.
>>>>
>>>> Sai, I think to spin up an env via Docker with all these things should be doable for sure, but would require someone with real good expertise with Docker as well as setting up these services with Hive. Obviously, I am not that guy :-)
>>>>
>>>> @Simhadri has a PR which publishes Docker images once a release tag is pushed; you can explore doing something similar for the snapshot version, if that sounds cool
>>>>
>>>> -Ayush
>>>>
>>>> On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala wrote:
>>>>> Hi Vihang,
>>>>>
>>>>> +1 on the idea.
>>>>>
>>>>> This is a great idea to quickly test if a certain feature is working as expected on a certain branch. This way we test data loss, correctness, or any other unexpected scenarios that are Hive-specific only. However, I'm wondering if it is possible to deploy/test in a kerberized environment or issues involving authorization services like Sentry/Ranger.
>>>>>
>>>>> Thanks,
>>>>> Sai.
>>>>>
>>>>> On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar <vihan...@apache.org> wrote:
>>>>>> Hello Team,
>>>>>>
>>>>>> I have observed that it is a common use-case where users would like to test out unreleased features/bug fixes either to unblock themselves or test out if the bug fixes really work as intended in their environments. Today in the case of Apache Hive, this is not very user friendly because it requires the end user to build the binaries directly from the Hive source code.
>>>>>>
>>>>>> I found that Apache Spark has a very useful infrastructure [1] which deploys nightly snapshots [2] [3] from the branch using GitHub Actions. This is super useful for any user who wants to try out the latest and greatest using the nightly builds.
>>>>>>
>>>>>> I was wondering if we should also adopt this. We can use
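
Sungwoo's point in the quoted thread, that archived weekly or monthly builds would make it quick to find the commit range that introduced a correctness bug, can be sketched as a binary search over an ordered list of builds. This is only an illustration under stated assumptions: the build labels and the is_bad check below are hypothetical placeholders (in practice the check might run a failing TPC-DS query against that build), and the sketch assumes the oldest build is good, the newest is bad, and the bug stays present once introduced.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are ordered oldest to newest, builds[0] is good,
    builds[-1] is bad, and every build after the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return builds[hi]


# Hypothetical weekly build labels; not real Hive artifacts.
weekly_builds = [f"week-{n:02d}" for n in range(1, 9)]


def query_returns_wrong_result(build: str) -> bool:
    # Placeholder check: pretend the bug first appeared in week-06.
    return int(build.split("-")[1]) >= 6


culprit = first_bad_build(weekly_builds, query_returns_wrong_result)
print(culprit)
```

With N archived builds this needs roughly log2(N) test runs, so even sparse weekly builds narrow the search to a one-week window of commits, which can then be bisected at the commit level.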
Re: Apache Hive on Twitter
Thanks for driving this, Ayush! It's great to see Hive alive again on Twitter.

Best,
Stamatis

On Tue, May 23, 2023 at 3:58 AM Ayush Saxena wrote:
> Hi All,
> I am happy to announce: We have got the Apache Hive Twitter account active again or maybe in other words we have got creds to use it now.
>
> The Twitter account stays here:
>
> https://twitter.com/ApacheHive
>
> The account belongs to all of us at Hive. As we decided, if anyone wants to get anything related to Apache Hive posted on the Twitter account, they can send a mail to the Hive dev mailing list with the request, with the label [Twitter] in the subject.
>
> For the record, as of today the following people have access to post:
>
> Alan Gates, Ayush Saxena, Carl Steinbach, Joydeep Sen Sharma, Owen O'Malley, Sushanth Sowmyan, Szehon Ho, Thejas Nair & Vikram Dixit
>
> A note of thanks to Joydeep Sen Sharma, Carl Steinbach, Stamatis Zampetakis & Naveen Gangam for helping with the process, and to Attila Turoczy for the initial thoughts/idea.
>
> -Ayush
Re: [DISCUSS] Nightly snaphot builds
Hey all,

We already have nightly builds for Hive [1].

Do we need something more than that?

Best,
Stamatis

[1] http://ci.hive.apache.org/job/hive-nightly/

On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar wrote:
> I think there are many benefits, as others in this thread suggested, which can be built on top of nightly builds. Having Docker images is great but for now I think we can start simple and publish the jars. Many users still just deploy using jars and it would be useful to them. Once we have a Docker environment we can add a Docker image too to the nightly builds so that users can choose their preferred way.
>
> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park wrote:
>> I think such nightly builds will be useful for testing and debugging in the future.
>>
>> I also wonder if we can somehow create builds even from previous commits (e.g., for the past few years). Such builds from previous commits don't have to be daily builds, and I think weekly builds (or even monthly builds) would also be very useful.
>>
>> The reason I wish such builds were available is to facilitate debugging and testing. When tested against the TPC-DS benchmark, the current master branch has several correctness problems that were introduced after the release of Hive 3.1.2. We have reported all problems known to us in [1] and also submitted several patches. If such nightly builds had been available, we would have saved quite a bit of time in implementing the patches by quickly finding offending commits that introduced new correctness bugs.
>>
>> In addition, you can find quite a few commits in the master branch that report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990, HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114, HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777, HIVE-25170, HIVE-25864, HIVE-26671. (There may be some errors in this list because we compared against Hive 3.1.2 with many patches backported.) Such nightly builds can be useful for finding root causes of such bugs.
>>
>> Ideally I wish there was an automated procedure to create nightly builds, run the TPC-DS benchmark, and report correctness/performance results, although this would be quite hard to implement. (I remember Spark implemented this procedure in the era of Spark 2, but my memory could be wrong.)
>>
>> [1] https://issues.apache.org/jira/browse/HIVE-26654
>>
>> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena wrote:
>>> Hi Vihang,
>>> +1, We were even exploring publishing the Docker images of the snapshot version as well per commit or maybe weekly, so just shoot 2 docker commands and you get a Hive cluster running with master code.
>>>
>>> Sai, I think to spin up an env via Docker with all these things should be doable for sure, but would require someone with real good expertise with Docker as well as setting up these services with Hive. Obviously, I am not that guy :-)
>>>
>>> @Simhadri has a PR which publishes Docker images once a release tag is pushed; you can explore doing something similar for the snapshot version, if that sounds cool
>>>
>>> -Ayush
>>>
>>> On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala wrote:
>>>> Hi Vihang,
>>>>
>>>> +1 on the idea.
>>>>
>>>> This is a great idea to quickly test if a certain feature is working as expected on a certain branch. This way we test data loss, correctness, or any other unexpected scenarios that are Hive-specific only. However, I'm wondering if it is possible to deploy/test in a kerberized environment or issues involving authorization services like Sentry/Ranger.
>>>>
>>>> Thanks,
>>>> Sai.
>>>>
>>>> On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar <vihan...@apache.org> wrote:
>>>>> Hello Team,
>>>>>
>>>>> I have observed that it is a common use-case where users would like to test out unreleased features/bug fixes either to unblock themselves or test out if the bug fixes really work as intended in their environments. Today in the case of Apache Hive, this is not very user friendly because it requires the end user to build the binaries directly from the Hive source code.
>>>>>
>>>>> I found that Apache Spark has a very useful infrastructure [1] which deploys nightly snapshots [2] [3] from the branch using GitHub Actions. This is super useful for any user who wants to try out the latest and greatest using the nightly builds.
>>>>>
>>>>> I was wondering if we should also adopt this. We can use GitHub Actions to upload the snapshot jars to the public repository (e.g. GitHub Packages) and