Hey,

> We already have nightly builds for Hive [1].
> [1] http://ci.hive.apache.org/job/hive-nightly/

...and hive-dev-box can launch such archives; either by using it like this:
https://www.mail-archive.com/dev@hive.apache.org/msg142420.html

or, with a somewhat longer command, you could launch hdb in bazaar mode and
have an HS2 running with a nightly version:

docker run --rm -d -p 10000:10000 \
  -v hive-dev-box_work:/work \
  -e HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz \
  --name hive kgyrtkirk/hive-dev-box:bazaar
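
To make the pieces of that command easier to see, here is a minimal Python
sketch that assembles the same docker run arguments without executing
anything. The helper name bazaar_command and its parameters are made up for
illustration; the flags, volume name, image tag and HIVE_VERSION variable are
the ones from the command above.

```python
import shlex

IMAGE = "kgyrtkirk/hive-dev-box:bazaar"  # image and tag from the command above


def bazaar_command(hive_version, name="hive", port=10000):
    """Build the argv for the docker run command above (hypothetical helper)."""
    return [
        "docker", "run",
        "--rm",                                # remove the container when it exits
        "-d",                                  # run detached, in the background
        "-p", f"{port}:{port}",                # publish the port on the host
        "-v", "hive-dev-box_work:/work",       # named volume mounted at /work
        "-e", f"HIVE_VERSION={hive_version}",  # here: URL of a nightly tarball
        "--name", name,
        IMAGE,
    ]


if __name__ == "__main__":
    url = ("http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/"
           "artifact/archive/"
           "apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz")
    # Print a copy-pasteable command line; nothing is executed here.
    print(shlex.join(bazaar_command(url)))
```

Swapping the URL for a different nightly artifact is then a one-argument
change.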

cheers,
Zoltan

On 5/24/23 09:15, Stamatis Zampetakis wrote:
Hey all,

We already have nightly builds for Hive [1].

Do we need something more than that?

Best,
Stamatis

[1] http://ci.hive.apache.org/job/hive-nightly/


On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <vihan...@apache.org> wrote:

I think there are many benefits, as others in this thread suggested, that
can be built on top of nightly builds. Having docker images is great, but
for now I think we can start simple and publish the jars. Many users still
just deploy using jars, and it would be useful to them. Once we have a
docker environment, we can add a docker image to the nightly builds too, so
that users can choose their preferred way.

On Mon, May 22, 2023 at 11:07 PM Sungwoo Park <glap...@gmail.com> wrote:

I think such nightly builds will be useful for testing and debugging in the
future.

I also wonder if we can somehow create builds even from previous commits
(e.g., for the past few years). Such builds from previous commits don't
have to be daily builds, and I think weekly builds (or even monthly builds)
would also be very useful.

The reason I wish such builds were available is to facilitate debugging and
testing. When tested against the TPC-DS benchmark, the current master
branch has several correctness problems that were introduced after the
release of Hive 3.1.2. We have reported all problems known to us in [1] and
also submitted several patches. If such nightly builds had been available,
we would have saved quite a bit of time in implementing the patches by
quickly finding the offending commits that introduced new correctness bugs.
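
The kind of search meant here can be sketched as a binary search over archived
builds. This is only an illustration: the names first_bad_build and is_good
are hypothetical, is_good stands for whatever check is run against a build
(for example, one TPC-DS query compared with a known-correct result), and the
sketch assumes the builds are ordered oldest to newest, the oldest one is
good, the newest one is bad, and a bug stays present once introduced.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str],
                    is_good: Callable[[str], bool]) -> str:
    """Return the earliest build that fails the check.

    Assumptions (not verified here): builds[0] is good, builds[-1] is bad,
    and every build after the first bad one is also bad.
    """
    if len(builds) < 2:
        raise ValueError("need at least one good and one bad build")
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(builds[mid]):
            lo = mid   # regression was introduced after mid
        else:
            hi = mid   # regression is at mid or earlier
    return builds[hi]
```

With N archived builds this needs about log2(N) test runs, so even weekly
builds covering a few years narrow a regression down to one week in fewer
than ten runs.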

In addition, you can find quite a few commits in the master branch that
report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990,
HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
HIVE-22227, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
HIVE-25170, HIVE-25864, HIVE-26671.
(There may be some errors in this list because we compared against Hive
3.1.2 with many patches backported.) Such nightly builds can be useful for
finding root causes of such bugs.

Ideally, I wish there were an automated procedure to create nightly builds,
run the TPC-DS benchmark, and report correctness/performance results,
although this would be quite hard to implement. (I remember Spark implemented
this procedure in the era of Spark 2, but my memory could be wrong.)

[1] https://issues.apache.org/jira/browse/HIVE-26654


On Tue, May 23, 2023 at 10:44 AM Ayush Saxena <ayush...@gmail.com> wrote:

Hi Vihang,
+1. We were even exploring publishing the docker images of the snapshot
version as well, per commit or maybe weekly, so you just shoot 2 docker
commands and you get a Hive cluster running with master code.

Sai, I think spinning up an env via Docker with all these things should be
doable for sure, but it would require someone with really good expertise in
docker as well as in setting up these services with Hive. Obviously, I am
not that guy :-)

@Simhadri has a PR which publishes docker images once a release tag is
pushed; you could explore doing something similar for the snapshot version,
if that sounds cool.

-Ayush

On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala
<saihema...@cloudera.com.invalid> wrote:

Hi Vihang,

+1 on the idea.

This is a great idea to quickly test if a certain feature is working as
expected on a certain branch.
This way we can test for data loss, correctness, or any other unexpected
scenarios that are specific to Hive. However, I'm wondering if it is
possible to deploy/test in a kerberized environment, or to test issues
involving authorization services like sentry/ranger.

Thanks,
Sai.

On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar <vihan...@apache.org> wrote:

Hello Team,

I have observed that it is a common use case for users to want to test
unreleased features or bug fixes, either to unblock themselves or to check
whether the bug fixes really work as intended in their environments. Today,
in the case of Apache Hive, this is not very user friendly because it
requires the end user to build the binaries directly from the Hive source
code.

I found that Apache Spark has a very useful infrastructure [1] which
deploys nightly snapshots [2] [3] from the branch using GitHub Actions.
This is super useful for any user who wants to try out the latest and
greatest using the nightly builds.

I was wondering if we should also adopt this. We can use GitHub Actions to
upload the snapshot jars to a public repository (e.g., GitHub Packages)
and schedule it as a nightly job.

[1] https://issues.apache.org/jira/browse/INFRA-21167
[2] https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
[3] https://github.com/apache/spark/pull/30623

I can take a stab at this if the community thinks that this is a nice
thing to have.

Thanks,
Vihang



