Re: [EXTERNAL] Re: [VOTE] Mark Hive 1.x EOL

2024-01-17 Thread Vihang Karajgaonkar
+1

On Wed, Jan 17, 2024 at 3:48 PM Larry Z.  wrote:

> +1 (binding)
>
> On Wed, Jan 17, 2024 at 9:46 AM Chao Sun  wrote:
>
> > +1 (binding)
> >
> > On Wed, Jan 17, 2024 at 1:24 AM Alessandro Solimando
> >  wrote:
> > >
> > > +1 (non binding)
> > >
> > > On Wed, 17 Jan 2024 at 10:23, Denys Kuzmenko 
> > wrote:
> > >
> > > > +1 (binding)
> > > >
> >
>
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>


Re: [EXTERNAL] Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x

2024-01-16 Thread vihang karajgaonkar
I was confused about the subject line since it says 3.x as well along with
1.x and 2.x. Does this discussion include all 1.x, 2.x and 3.x or just 1.x
and 2.x?

I think it makes sense to EOL 1.x. Looks like 2.x is still being maintained
by Chao, and I think we were backporting PRs to the 3.x line pretty recently,
so I believe we should wait for a release on Hive 3.x.

Thanks,
Vihang

On Tue, Jan 16, 2024 at 3:40 PM Attila Turoczy
 wrote:

> Dear PMC's,
>
> Do we have a verdict / decision about this?
>
> -Attila
>
> On Wed, Jan 10, 2024 at 5:45 PM Chao Sun  wrote:
>
> > On Hive 2.x, I'm still preparing for another release 2.3.10 (Hive 2.3
> > branch is being actively maintained so far). Hopefully this will be
> > the last release in the branch-2 line.
> >
> > +1 on making Hive 1 EOL for the time being.
> >
> > Chao
> >
> > On Wed, Jan 10, 2024 at 8:10 AM Sankar Hariappan
> >  wrote:
> > >
> > > +1 for making both Hive 1&2 EOL
> > >
> > > -Sankar
> > > -Original Message-
> > > From: Attila Turoczy 
> > > Sent: Wednesday, January 10, 2024 7:37 PM
> > > To: dev@hive.apache.org
> > > Subject: [EXTERNAL] Re: [DISCUSS] End of life for Hive 1.x, 2.x, 3.x
> > >
> > > > +1 for making it EOL for Hive 1 and Hive 2. I do not think these 2 product
> > > > branches are relevant in 2023.
> > >
> > > -Attila
> > >
> > > On Wed, Jan 10, 2024 at 12:59 PM Denys Kuzmenko 
> > > wrote:
> > >
> > > > +1 for marking Hive 1.x EOL
> > > >
> > > > Assuming no volunteers willing to take ownership of branch-2
> > maintenance,
> > > > +1 to declare it EOL as well.
> > > >
> > > > Regards,
> > > > Denys
> > > >
> >
>


Hive 4.0 branching

2023-06-07 Thread vihang karajgaonkar
Hello Dev community,

I noticed that we have a branch-4.0.0-alpha1 and branch-4.0.0-alpha-2 but I
don't see a branch-4. Does anyone know which branch will be used to cut
the final 4.0.0 release, and when a branch-4 would be available?

Thanks,
Vihang


Re: Re: Reg: Discussion on removal of deprecated APIs in the HMS thrift interface

2023-06-07 Thread vihang karajgaonkar
+1 on removing unreleased and deprecated APIs. For the APIs which have been
released, we should mark them deprecated first and remove them only in the
next major release. As many of you know, Hive metastore APIs are used by
many other systems outside Hive too and we should be careful not to make
breaking changes without giving reasonable notice.
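
To make the trade-off discussed in this thread concrete, here is a minimal
sketch of a request-based call next to a deprecated positional one. It is only
an illustration: the class and function names, and the "hive" default catalog
value, are assumptions made for this example and are not the actual HMS thrift
definitions.

```python
import warnings
from dataclasses import dataclass
from typing import Optional

# Hypothetical names throughout; these are not the real HMS thrift definitions.


@dataclass
class GetTableRequest:
    db_name: str
    table_name: str
    # A field added in a later release. Existing callers keep working
    # because the field has a default value.
    catalog_name: Optional[str] = None


def get_table_req(request: GetTableRequest) -> str:
    """Request-based call: new options become new fields, not new methods."""
    catalog = request.catalog_name or "hive"  # assumed default for this sketch
    return f"{catalog}.{request.db_name}.{request.table_name}"


def get_table(db_name: str, table_name: str) -> str:
    """Older positional call, kept for a release with a deprecation notice."""
    warnings.warn(
        "get_table is deprecated; use get_table_req",
        DeprecationWarning,
        stacklevel=2,
    )
    return get_table_req(GetTableRequest(db_name, table_name))
```

The old entry point stays callable while warning its users, which is the
"deprecate first, remove in the next major release" sequence suggested above.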

On Thu, Jun 1, 2023 at 4:29 AM Stamatis Zampetakis 
wrote:

> Zhihua brought up a good point. Yes if it was introduced in
> 4.0.0-alpha and then was deprecated we can remove it.
>
> On Thu, Jun 1, 2023 at 1:00 PM Attila Turoczy
>  wrote:
> >
> > +1 from me as well. Let's clean it up. Still, because we have struggled
> > with the data correctness issue, we have time to introduce these changes.
> > If won't fit then won't be a problem as well, as the next release will
> > contain it. As I wrote earlier, as the 4.0 goes out I want to help to
> have
> > regular releases. Even majors. I have started a proposal document about a
> > public hive roadmap, and release roadmap that I want to share and discuss
> > with the community.
> >
> > -Attila
> >
> > On Thu, Jun 1, 2023 at 12:37 PM dengzhhu653  wrote:
> >
> > > Hi
> > >
> > >
> > > Thanks Sai for driving this, the request based API makes sense to me.
> > > For the removal of deprecated API:
> > >  a) +1 if it is marked as deprecated in 3.x;
> > >  b) If the API is introduced after 4.0.0-alpha, but tend to become
> > > obsolete in 4.x GA, I think we can remove it as well.
> > >
> > >
> > > Thanks,
> > > Zhihua.
> > > At 2023-06-01 17:56:03, "Ayush Saxena"  wrote:
> > > >+1 to what Stamatis said, if it is there in 3.X we can explore their
> > > removal, else let them go in 4.x GA release and we can remove then in
> the
> > > subsequent release
> > > >
> > > >-Ayush
> > > >
> > > >> On 01-Jun-2023, at 3:08 PM, Stamatis Zampetakis 
> > > wrote:
> > > >>
> > > >> Hello,
> > > >>
> > > >> Ideally we should deprecate APIs in one release and remove them in a
> > > >> subsequent major release. If the HMS deprecations were added in Hive
> > > >> 3.X then I am ok removing them now. Otherwise it is not really that
> we
> > > >> will remove deprecated APIs but we will remove regular APIs without
> > > >> any notice.
> > > >>
> > > >> Best,
> > > >> Stamatis
> > > >>
> > > >>> On Thu, Jun 1, 2023 at 2:57 AM Sai Hemanth Gantasala
> > > >>>  wrote:
> > > >>>
> > > >>> Hi everyone,
> > > >>>
> > > >>> This thread is to initiate a discussion on the removal of
> deprecated
> > > APIs
> > > >>> in the HMS thrift class. Any client including HiveMetastoreClient
> > > talks to
> > > >>> HiveMetaStore Server via the thrift layer. Over the past few
> years, the
> > > >>> thrift class is bloated with duplicated APIs with varying
> parameters
> > > >>> (function overloading) in the API definition. The reason why the
> APIs
> > > are
> > > >>> being deprecated is that the API might need an additional
> argument, so
> > > a
> > > >>> new API is added with an additional argument, and mark the old API
> as
> > > >>> deprecated.
> > > >>>
> > > >>> I'm working on HIVE-26537 <
> > > https://issues.apache.org/jira/browse/HIVE-26537>
> > > >>> to clean up the code around the interaction between
> > > HiveMetaStoreClient and
> > > >>> HMS to not use the deprecated APIs (the HMS client will now be
> using
> > > >>> request-based APIs instead of APIs using individual arguments).
> Going
> > > >>> forward, using these request-based APIs is ideal as we can just
> add an
> > > >>> additional field to request object definition in the thrift class
> and
> > > API
> > > >>> remains unchanged. This would hopefully require minimal changes
> between
> > > >>> client and server interaction in the future.
> > > >>>
> > > >>> I would like to hear the community member's opinions regarding the
> > > >>> deprecated APIs,
> > > >>> 1) Keep the deprecated APIs for the 4.x release, HMSClient will
> use the
> > > >>> request-based APIs, So that would keep the older clients compatible
> > > with
> > > >>> the new HMS server.
> > > >>> 2) Remove the deprecated APIs for the 4.x release. This would break
> > > >>> backward compatibility with the older clients but we have the
> > > opportunity
> > > >>> to clean up a lot of deprecated code. Since we are making a major
> > > release
> > > >>> after 5 years, I hope this incompatibility is acceptable.
> > > >>>
> > > >>> Please let me know your thoughts.
> > > >>>
> > > >>> Thanks,
> > > >>> Sai.
> > >
>


Re: [DISCUSS] Nightly snapshot builds

2023-05-26 Thread vihang karajgaonkar
Thanks Zoltan. Makes sense. Also, we should definitely strive to release
within 180 days especially when there are lots of commits to a branch.
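
As a rough illustration of the retention behaviour Zoltan describes in the
quoted reply, the sketch below models pruning as a step that only runs after a
build. The function and variable names are hypothetical and this is not how
the Jenkins plugin is implemented; it only mirrors the described effect.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=180)  # retention window mentioned in this thread


def run_build(artifacts: dict, name: str, now: datetime) -> None:
    """Record a new build, then prune old ones as a post-build step."""
    artifacts[name] = now
    cutoff = now - RETENTION
    for old in [n for n, built in artifacts.items() if built < cutoff]:
        del artifacts[old]


artifacts = {}
run_build(artifacts, "nightly-1", datetime(2023, 1, 1))
# No commits means no builds, so the pruning step never runs in the meantime
# and "nightly-1" is still present however much time passes.
assert "nightly-1" in artifacts

# The next build, a year later, triggers the pruning of the old artifact.
run_build(artifacts, "nightly-2", datetime(2024, 1, 1))
assert list(artifacts) == ["nightly-2"]
```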

-Vihang

On Fri, May 26, 2023 at 12:04 AM Zoltan Haindrich  wrote:

> On 5/25/23 19:58, vihang karajgaonkar wrote:
> > I just tried the job and it worked as expected. Thanks! If I understand
> > correctly, the job retains builds for 180 days. Does it mean if there
> were
> > no commits to a branch for more than 180 days, we will lose the build
> > artifacts eventually?
>
> not entirely - the removal of old builds is a post-build action; which
> means - if there are no builds; the removal logic will never run
> https://plugins.jenkins.io/discard-old-build/
>
> on the other hand I wonder how much value a nightly build can still
> provide after 180 days :)
> preferably - a real release should be done after some time :)
>
> cheers,
> Zoltan
>
> >
> > On Thu, May 25, 2023 at 1:50 AM Zoltan Haindrich  wrote:
> >
> >> Hey Vihang,
> >>
> >> I've added you as an admin; and I've copied the job as
> >> http://ci.hive.apache.org/job/hive-nightly-branch-3/
> >> other option could be to trigger the original job or use
> >> parameterized-scheduler  but that would configure a real unconditional
> >> nightly build - which will just build the
> >> same version over-and-over again if there are no changes...
> >> ...the current nightly is SCM triggered; but only once-a-day it makes a
> >> check which creates the desired results.
> >>
> >> the least painful was to copy the job; I guess no-one touched the
> >> pipeline script ever since it was introduced :D
> >>
> >> cheers,
> >> Zoltan
> >>
> >> On 5/25/23 01:26, vihang karajgaonkar wrote:
> >>> I created https://issues.apache.org/jira/browse/HIVE-27371 to have
> >> nightly
> >>> builds for branch-3. Once that is merged, I think we can have scheduled
> >>> builds for branch-3 as well. Although, I don't have permissions to
> >> create a
> >>> new job for branch-3. Does anyone know how to do it?
> >>>
> >>> Thanks,
> >>> Vihang
> >>>
> >>> On Wed, May 24, 2023 at 10:07 AM vihang karajgaonkar <
> >> vihan...@apache.org>
> >>> wrote:
> >>>
> >>>> The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great.
> >> Can
> >>>> we have this for branch-3 as well since we have been backporting a lot
> >> of
> >>>> PRs to branch-3 lately.
> >>>>
> >>>> Thanks,
> >>>> Vihang
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich  wrote:
> >>>>
> >>>>> Hey,
> >>>>>
> >>>>>> We already have nightly builds for Hive [1].
> >>>>>> [1] http://ci.hive.apache.org/job/hive-nightly/
> >>>>>
> >>>>> ...and hive-dev-box can launch such archives; either by using it like
> >>>>> this:
> >>>>> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
> >>>>>
> >>>>> or with a somewhat longer command you could launch hdb in bazaar
> mode;
> >>>>> and have an HS2 running with a nightly version:
> >>>>>
> >>>>> docker run --rm -d -p 1:1 -v hive-dev-box_work:/work -e
> >>>>> HIVE_VERSION=
> >>>>>
> >>
> http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz
> >>>>> --name hive
> >>>>> kgyrtkirk/hive-dev-box:bazaar
> >>>>>
> >>>>> cheers,
> >>>>> Zoltan
> >>>>>
> >>>>> On 5/24/23 09:15, Stamatis Zampetakis wrote:
> >>>>>> Hey all,
> >>>>>>
> >>>>>> We already have nightly builds for Hive [1].
> >>>>>>
> >>>>>> Do we need something more than that?
> >>>>>>
> >>>>>> Best,
> >>>>>> Stamatis
> >>>>>>
> >>>>>> [1] http://ci.hive.apache.org/job/hive-nightly/
> >>>>>>
> >>>>>>
> >>>>>> On Tue, May 23, 2023 at 9:03 

Re: [DISCUSS] Nightly snapshot builds

2023-05-25 Thread vihang karajgaonkar
I just tried the job and it worked as expected. Thanks! If I understand
correctly, the job retains builds for 180 days. Does it mean if there were
no commits to a branch for more than 180 days, we will lose the build
artifacts eventually?

On Thu, May 25, 2023 at 1:50 AM Zoltan Haindrich  wrote:

> Hey Vihang,
>
> I've added you as an admin; and I've copied the job as
> http://ci.hive.apache.org/job/hive-nightly-branch-3/
> other option could be to trigger the original job or use
> parameterized-scheduler  but that would configure a real unconditional
> nightly build - which will just build the
> same version over-and-over again if there are no changes...
> ...the current nightly is SCM triggered; but only once-a-day it makes a
> check which creates the desired results.
>
> the least painful was to copy the job; I guess no-one touched the
> pipeline script ever since it was introduced :D
>
> cheers,
> Zoltan
>
> On 5/25/23 01:26, vihang karajgaonkar wrote:
> > I created https://issues.apache.org/jira/browse/HIVE-27371 to have
> nightly
> > builds for branch-3. Once that is merged, I think we can have scheduled
> > builds for branch-3 as well. Although, I don't have permissions to
> create a
> > new job for branch-3. Does anyone know how to do it?
> >
> > Thanks,
> > Vihang
> >
> > On Wed, May 24, 2023 at 10:07 AM vihang karajgaonkar <
> vihan...@apache.org>
> > wrote:
> >
> >> The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great.
> Can
> >> we have this for branch-3 as well since we have been backporting a lot
> of
> >> PRs to branch-3 lately.
> >>
> >> Thanks,
> >> Vihang
> >>
> >>
> >>
> >>
> >>
> >> On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich  wrote:
> >>
> >>> Hey,
> >>>
> >>>   > We already have nightly builds for Hive [1].
> >>>   > [1] http://ci.hive.apache.org/job/hive-nightly/
> >>>
> >>> ...and hive-dev-box can launch such archives; either by using it like
> >>> this:
> >>> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
> >>>
> >>> or with a somewhat longer command you could launch hdb in bazaar mode;
> >>> and have an HS2 running with a nightly version:
> >>>
> >>> docker run --rm -d -p 1:1 -v hive-dev-box_work:/work -e
> >>> HIVE_VERSION=
> >>>
> http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz
> >>> --name hive
> >>> kgyrtkirk/hive-dev-box:bazaar
> >>>
> >>> cheers,
> >>> Zoltan
> >>>
> >>> On 5/24/23 09:15, Stamatis Zampetakis wrote:
> >>>> Hey all,
> >>>>
> >>>> We already have nightly builds for Hive [1].
> >>>>
> >>>> Do we need something more than that?
> >>>>
> >>>> Best,
> >>>> Stamatis
> >>>>
> >>>> [1] http://ci.hive.apache.org/job/hive-nightly/
> >>>>
> >>>>
> >>>> On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <
> >>> vihan...@apache.org> wrote:
> >>>>>
> >>>>> I think there are many benefits like others in this thread suggested
> >>> which
> >>>>> can be built on top of nightly builds. Having docker images is great
> >>> but
> >>>>> for now I think we can start simple and publish the jars. Many users
> >>> still
> >>>>> just deploy using jars and it would be useful to them. Once we have a
> >>>>> docker environment we can add a docker image too to the nightly
> builds
> >>> so
> >>>>> that users can choose their preferred way.
> >>>>>
> >>>>> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park 
> >>> wrote:
> >>>>>
> >>>>>> I think such nightly builds will be useful for testing and debugging
> >>> in the
> >>>>>> future.
> >>>>>>
> >>>>>> I also wonder if we can somehow create builds even from previous
> >>> commits
> >>>>>> (e.g., for the past few years). Such builds from previous commits
> >>> don't
> >>>>>> have to be daily builds, and I think weekly builds (or even monthly
> >>> builds)
> >>

Re: [DISCUSS] Nightly snapshot builds

2023-05-24 Thread vihang karajgaonkar
I created https://issues.apache.org/jira/browse/HIVE-27371 to have nightly
builds for branch-3. Once that is merged, I think we can have scheduled
builds for branch-3 as well. Although, I don't have permissions to create a
new job for branch-3. Does anyone know how to do it?

Thanks,
Vihang

On Wed, May 24, 2023 at 10:07 AM vihang karajgaonkar 
wrote:

> The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can
> we have this for branch-3 as well since we have been backporting a lot of
> PRs to branch-3 lately.
>
> Thanks,
> Vihang
>
>
>
>
>
> On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich  wrote:
>
>> Hey,
>>
>>  > We already have nightly builds for Hive [1].
>>  > [1] http://ci.hive.apache.org/job/hive-nightly/
>>
>> ...and hive-dev-box can launch such archives; either by using it like
>> this:
>> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
>>
>> or with a somewhat longer command you could launch hdb in bazaar mode;
>> and have an HS2 running with a nightly version:
>>
>> docker run --rm -d -p 1:1 -v hive-dev-box_work:/work -e
>> HIVE_VERSION=
>> http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz
>> --name hive
>> kgyrtkirk/hive-dev-box:bazaar
>>
>> cheers,
>> Zoltan
>>
>> On 5/24/23 09:15, Stamatis Zampetakis wrote:
>> > Hey all,
>> >
>> > We already have nightly builds for Hive [1].
>> >
>> > Do we need something more than that?
>> >
>> > Best,
>> > Stamatis
>> >
>> > [1] http://ci.hive.apache.org/job/hive-nightly/
>> >
>> >
>> > On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <
>> vihan...@apache.org> wrote:
>> >>
>> >> I think there are many benefits like others in this thread suggested
>> which
>> >> can be built on top of nightly builds. Having docker images is great
>> but
>> >> for now I think we can start simple and publish the jars. Many users
>> still
>> >> just deploy using jars and it would be useful to them. Once we have a
>> >> docker environment we can add a docker image too to the nightly builds
>> so
>> >> that users can choose their preferred way.
>> >>
>> >> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park 
>> wrote:
>> >>
>> >>> I think such nightly builds will be useful for testing and debugging
>> in the
>> >>> future.
>> >>>
>> >>> I also wonder if we can somehow create builds even from previous
>> commits
>> >>> (e.g., for the past few years). Such builds from previous commits
>> don't
>> >>> have to be daily builds, and I think weekly builds (or even monthly
>> builds)
>> >>> would also be very useful.
>> >>>
>> >>> The reason I wish such builds were available is to facilitate
>> debugging and
>> >>> testing. When tested against the TPC-DS benchmark, the current master
>> >>> branch has several correctness problems that were introduced after the
>> >>> release of Hive 3.1.2. We have reported all problems known to us in
>> [1] and
>> >>> also submitted several patches. If such nightly builds had been
>> available,
>> >>> we would have saved quite a bit of time for implementing the patches
>> by
>> >>> quickly finding offending commits that introduced new correctness
>> bugs.
>> >>>
>> >>> In addition, you can find quite a few commits in the master branch
>> that
>> >>> report bugs which are not reproduced in Hive 3.1.2. Examples:
>> HIVE-19990,
>> >>> HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
>> >>> HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
>> >>> HIVE-25170, HIVE-25864, HIVE-26671.
>> >>> (There may be some errors in this list because we compared against
>> Hive
>> >>> 3.1.2 with many patches backported.) Such nightly builds can be
>> useful for
>> >>> finding root causes of such bugs.
>> >>>
>> >>> Ideally I wish there was an automated procedure to create nightly
>> builds,
>> >>> run TPC-DS benchmark, and report correctness/performance results,
>> although
>> >>> this would be quite hard to implement. (I remember Spark implemented
>> this

Re: [DISCUSS] Nightly snapshot builds

2023-05-24 Thread vihang karajgaonkar
The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can
we have this for branch-3 as well, since we have been backporting a lot of
PRs to branch-3 lately?

Thanks,
Vihang





On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich  wrote:

> Hey,
>
>  > We already have nightly builds for Hive [1].
>  > [1] http://ci.hive.apache.org/job/hive-nightly/
>
> ...and hive-dev-box can launch such archives; either by using it like this:
> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
>
> or with a somewhat longer command you could launch hdb in bazaar mode; and
> have an HS2 running with a nightly version:
>
> docker run --rm -d -p 1:1 -v hive-dev-box_work:/work -e
> HIVE_VERSION=
> http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz
> --name hive
> kgyrtkirk/hive-dev-box:bazaar
>
> cheers,
> Zoltan
>
> On 5/24/23 09:15, Stamatis Zampetakis wrote:
> > Hey all,
> >
> > We already have nightly builds for Hive [1].
> >
> > Do we need something more than that?
> >
> > Best,
> > Stamatis
> >
> > [1] http://ci.hive.apache.org/job/hive-nightly/
> >
> >
> > On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar 
> wrote:
> >>
> >> I think there are many benefits like others in this thread suggested
> which
> >> can be built on top of nightly builds. Having docker images is great but
> >> for now I think we can start simple and publish the jars. Many users
> still
> >> just deploy using jars and it would be useful to them. Once we have a
> >> docker environment we can add a docker image too to the nightly builds
> so
> >> that users can choose their preferred way.
> >>
> >> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park 
> wrote:
> >>
> >>> I think such nightly builds will be useful for testing and debugging
> in the
> >>> future.
> >>>
> >>> I also wonder if we can somehow create builds even from previous
> commits
> >>> (e.g., for the past few years). Such builds from previous commits don't
> >>> have to be daily builds, and I think weekly builds (or even monthly
> builds)
> >>> would also be very useful.
> >>>
> >>> The reason I wish such builds were available is to facilitate
> debugging and
> >>> testing. When tested against the TPC-DS benchmark, the current master
> >>> branch has several correctness problems that were introduced after the
> >>> release of Hive 3.1.2. We have reported all problems known to us in
> [1] and
> >>> also submitted several patches. If such nightly builds had been
> available,
> >>> we would have saved quite a bit of time for implementing the patches by
> >>> quickly finding offending commits that introduced new correctness bugs.
> >>>
> >>> In addition, you can find quite a few commits in the master branch that
> >>> report bugs which are not reproduced in Hive 3.1.2. Examples:
> HIVE-19990,
> >>> HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
> >>> HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
> >>> HIVE-25170, HIVE-25864, HIVE-26671.
> >>> (There may be some errors in this list because we compared against Hive
> >>> 3.1.2 with many patches backported.) Such nightly builds can be useful
> for
> >>> finding root causes of such bugs.
> >>>
> >>> Ideally I wish there was an automated procedure to create nightly
> builds,
> >>> run TPC-DS benchmark, and report correctness/performance results,
> although
> >>> this would be quite hard to implement. (I remember Spark implemented
> this
> >>> procedure in the era of Spark 2, but my memory could be wrong.)
> >>>
> >>> [1] https://issues.apache.org/jira/browse/HIVE-26654
> >>>
> >>>
> >>> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena 
> wrote:
> >>>
> >>>> Hi Vihang,
> >>>> +1, We were even exploring publishing the docker images of the
> snapshot
> >>>> version as well per commit or maybe weekly, so just shoot 2 docker
> >>> commands
> >>>> and you get a Hive cluster running with master code.
> >>>>
> >>>> Sai, I think to spin up an env via Docker with all these things
> should be
> >>>> doable for sure, but would require someone with real good expertise
> with
> 

Re: [DISCUSS] Nightly snapshot builds

2023-05-23 Thread vihang karajgaonkar
I think there are many benefits like others in this thread suggested which
can be built on top of nightly builds. Having docker images is great but
for now I think we can start simple and publish the jars. Many users still
just deploy using jars and it would be useful to them. Once we have a
docker environment we can add a docker image too to the nightly builds so
that users can choose their preferred way.
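
One of the benefits raised in the quoted reply from Sungwoo is finding the
commit that introduced a regression more quickly. A minimal sketch of that
idea, assuming an ordered list of nightly builds and a check that says whether
a given build shows the bug; the function name and the build labels are made
up for this example.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are ordered oldest to newest, the first one is good,
    the last one is bad, and once a build is bad all later ones are bad.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] is good, builds[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Hypothetical example: eight nightly builds, regression present from n6 on.
nightlies = ["n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8"]
broken = {"n6", "n7", "n8"}
assert first_bad_build(nightlies, lambda b: b in broken) == "n6"
```

With N nightly builds this needs about log2(N) test runs, and the offending
commit is then somewhere between two consecutive nightlies.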

On Mon, May 22, 2023 at 11:07 PM Sungwoo Park  wrote:

> I think such nightly builds will be useful for testing and debugging in the
> future.
>
> I also wonder if we can somehow create builds even from previous commits
> (e.g., for the past few years). Such builds from previous commits don't
> have to be daily builds, and I think weekly builds (or even monthly builds)
> would also be very useful.
>
> The reason I wish such builds were available is to facilitate debugging and
> testing. When tested against the TPC-DS benchmark, the current master
> branch has several correctness problems that were introduced after the
> release of Hive 3.1.2. We have reported all problems known to us in [1] and
> also submitted several patches. If such nightly builds had been available,
> we would have saved quite a bit of time for implementing the patches by
> quickly finding offending commits that introduced new correctness bugs.
>
> In addition, you can find quite a few commits in the master branch that
> report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990,
> HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
> HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
> HIVE-25170, HIVE-25864, HIVE-26671.
> (There may be some errors in this list because we compared against Hive
> 3.1.2 with many patches backported.) Such nightly builds can be useful for
> finding root causes of such bugs.
>
> Ideally I wish there was an automated procedure to create nightly builds,
> run TPC-DS benchmark, and report correctness/performance results, although
> this would be quite hard to implement. (I remember Spark implemented this
> procedure in the era of Spark 2, but my memory could be wrong.)
>
> [1] https://issues.apache.org/jira/browse/HIVE-26654
>
>
> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena  wrote:
>
> > Hi Vihang,
> > +1, We were even exploring publishing the docker images of the snapshot
> > version as well per commit or maybe weekly, so just shoot 2 docker
> commands
> > and you get a Hive cluster running with master code.
> >
> > Sai, I think to spin up an env via Docker with all these things should be
> > doable for sure, but would require someone with real good expertise with
> > docker as well as setting up these services with Hive. Obviously, I am
> not
> > that guy :-)
> >
> > @Simhadri has a PR which publishes docker images once a release tag is
> > pushed, you can explore to have similar stuff for the Snapshot version,
> > maybe if that sounds cool
> >
> > -Ayush
> >
> > On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala
> >  wrote:
> >
> > > Hi Vihang,
> > >
> > > +1 on the idea.
> > >
> > > This is a great idea to quickly test if a certain feature is working as
> > > expected on a certain branch.
> > > This way we test data loss, correctness, or any other unexpected
> > scenarios
> > > that are Hive specific only. However, I'm wondering if it is possible
> to
> > > deploy/test in a kerberized environment or issues involving
> authorization
> > > services like sentry/ranger.
> > >
> > > Thanks,
> > > Sai.
> > >
> > > On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar <
> > vihan...@apache.org>
> > > wrote:
> > >
> > > > Hello Team,
> > > >
> > > > I have observed that it is a common use-case where users would like
> to
> > > test
> > > > out unreleased features/bug fixes either to unblock them or test out
> if
> > > the
> > > > bug fixes really work as intended in their environments. Today in the
> > > case
> > > > of Apache Hive, this is not very user friendly because it requires
> the
> > > end
> > > > user to build the binaries directly from the hive source code.
> > > >
> > > > I found that Apache Spark has a very useful infrastructure [1] which
> > > > deploys nightly snapshots [2] [3] from the branch using github
> actions.
> > > > This is super useful for any user who wants to try out the latest and
> > > > greatest using the nightly builds.
> > > >
> > > > I was wondering if we should also adopt this. We can use github
> actions
> > > to
> > > > upload the snapshot jars to the public repository (e.g github
> packages)
> > > and
> > > > schedule it as a nightly job.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/INFRA-21167
> > > > [2]
> > https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
> > > > [3] https://github.com/apache/spark/pull/30623
> > > >
> > > > I can take a stab at this if the community thinks that this is a nice
> > > thing
> > > > to have.
> > > >
> > > > Thanks,
> > > > Vihang
> > > >
> > >
> >
>


[DISCUSS] Nightly snapshot builds

2023-05-22 Thread vihang karajgaonkar
Hello Team,

I have observed that it is a common use-case where users would like to test
out unreleased features/bug fixes either to unblock them or test out if the
bug fixes really work as intended in their environments. Today in the case
of Apache Hive, this is not very user friendly because it requires the end
user to build the binaries directly from the hive source code.

I found that Apache Spark has a very useful infrastructure [1] which
deploys nightly snapshots [2] [3] from the branch using github actions.
This is super useful for any user who wants to try out the latest and
greatest using the nightly builds.

I was wondering if we should also adopt this. We can use github actions to
upload the snapshot jars to the public repository (e.g github packages) and
schedule it as a nightly job.

[1] https://issues.apache.org/jira/browse/INFRA-21167
[2] https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
[3] https://github.com/apache/spark/pull/30623

I can take a stab at this if the community thinks that this is a nice thing
to have.

Thanks,
Vihang


Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-05-09 Thread Vihang Karajgaonkar
Thanks Aman. I thought all the changes in release 3.2.0 were listed under
https://issues.apache.org/jira/browse/HIVE-26751 and I saw them all
resolved. Do you know which additional tickets need to go in branch-3 after
we backport the branch-3.1 fixes in branch-3?

On Tue, May 9, 2023 at 11:20 AM Aman Raj 
wrote:

> Hi Vihang,
>
> We only have 4 tickets remaining to be backported from branch-3.1 to
> branch-3. It will be completed next week.
>
> But there are a lot of new tickets that will go into release 3.2.0 on top
> of this. I was thinking of not cutting a release candidate now since it
> would mean that we only backport changes into that release candidate
> branch. This would again mean that if people commit only to branch-3 or the
> release branch, there will again be a lot of difference in these two
> branches when someone picks up the next release.
>
> Instead I am thinking that we should backport new changes to branch-3 and
> then only cut the release candidate. Please let me know your thoughts. If
> we agree that changes need to go into the new release candidate branch
> only, I am okay with that (I do not prefer it btw)
>
> Thanks,
> Aman.
>
> From: vihang karajgaonkar 
> Sent: Monday, May 8, 2023 4:57:24 AM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Hi Aman,
>
> I know you are backporting the branch-3.1 commits to branch-3. How close
> are you with finishing with them. Is there anything that we can help with
> to get it over the finish line?
>
> I am interested to know how close are we to cutting the branch for 3.2.0?
> Do you think we can have a release candidate this week?
>
> Thanks,
> Vihang
>
> On Thu, Mar 30, 2023 at 2:18 AM Stamatis Zampetakis 
> wrote:
>
> > Huge thanks to everyone involved it is great to see the branch-3 in
> stable
> > state. As other people mentioned let's keep it that way!
> >
> > As far as it concerns back ports please be particularly cautious with
> > anything that touches the metastore schema and Thrift APIs.
> >
> > Best,
> > Stamatis
> >
> > On Wed, Mar 29, 2023, 4:36 AM vihang karajgaonkar 
> > wrote:
> >
> > > Thanks a lot Aman for all your efforts on this. Really appreciate the
> > > initiative and all your hard work on this.
> > >
> > > I would like to request that all the committers should follow the merge
> > > process of master branch to merge PRs in branch-3. If there are any
> test
> > > failures which seem unrelated, please do not ignore them. One can run
> the
> > > flaky
> > > > test runner <http://ci.hive.apache.org/job/hive-flaky-check/> to make
> > sure
> > > that test is indeed flaky. If the test is found to be flaky a
> > > ticket should be created to disable it. A separate ticket should be
> > created
> > > to deflake it and you can mention the original author or previous
> commit
> > > author who changed the test on that ticket to get help since they
> likely
> > > have the most context around that test. Once the flaky test is disabled
> > and
> > > we have a green CI job run, we should merge the PR. If others have any
> > > suggestions to improve this process please chime in.
> > >
> > > Thanks,
> > > Vihang
> > >
> > > > On Tue, Mar 28, 2023 at 10:55 PM Aman Raj wrote:
> > >
> > > > Hi community,
> > > >
> > > > This is to notify that we have a green branch-3 now. The entire
> effort
> > of
> > > > fixing branch-3 test cases took around 4 months and as a team we
> > managed
> > > to
> > > > fix 2900+ test failures on branch-3. The entire effort can be tracked
> > > here
> > > > > HIVE-26836<https://issues.apache.org/jira/browse/HIVE-26836>

Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-05-07 Thread vihang karajgaonkar
Hi Aman,

I know you are backporting the branch-3.1 commits to branch-3. How close
are you to finishing them? Is there anything that we can help with
to get it over the finish line?

I would also like to know how close we are to cutting the branch for 3.2.0.
Do you think we can have a release candidate this week?

Thanks,
Vihang

On Thu, Mar 30, 2023 at 2:18 AM Stamatis Zampetakis 
wrote:

> Huge thanks to everyone involved; it is great to see branch-3 in a stable
> state. As other people mentioned, let's keep it that way!
>
> As far as backports are concerned, please be particularly cautious with
> anything that touches the metastore schema and Thrift APIs.
>
> Best,
> Stamatis
>
> On Wed, Mar 29, 2023, 4:36 AM vihang karajgaonkar 
> wrote:
>
> > Thanks a lot Aman for all your efforts on this. Really appreciate the
> > initiative and all your hard work on this.
> >
> > I would like to request that all the committers should follow the merge
> > process of master branch to merge PRs in branch-3. If there are any test
> > failures which seem unrelated, please do not ignore them. One can run the
> > flaky
> > test runner <http://ci.hive.apache.org/job/hive-flaky-check/> to make
> sure
> > that test is indeed flaky. If the test is found to be flaky a
> > ticket should be created to disable it. A separate ticket should be
> created
> > to deflake it and you can mention the original author or previous commit
> > author who changed the test on that ticket to get help since they likely
> > have the most context around that test. Once the flaky test is disabled
> and
> > we have a green CI job run, we should merge the PR. If others have any
> > suggestions to improve this process please chime in.
> >
> > Thanks,
> > Vihang
> >
> > On Tue, Mar 28, 2023 at 10:55 PM Aman Raj  >
> > wrote:
> >
> > > Hi community,
> > >
> > > This is to notify that we have a green branch-3 now. The entire effort
> of
> > > fixing branch-3 test cases took around 4 months and as a team we
> managed
> > to
> > > fix 2900+ test failures on branch-3. The entire effort can be tracked
> > here
> > > HIVE-26836<https://issues.apache.org/jira/browse/HIVE-26836>. We are
> > > ready to push new features and improvements on branch-3 now.
> > >
> > > I really want to thank Vihang Karajgaonkar, Chris Nauroth, Laszlo Bodor,
> > > Stamatis Zampetakis and Sankar Hariappan without whom this would not at
> > all
> > > have been possible. As a team we stuck together and participated in
> > reviews
> > > and actively suggested improvements which really helped in fixing some
> > > major test failures.
> > >
> > > I would sincerely request that going further it should be made a point
> to
> > > merge things into branch-3 only if we have a green Jenkins pipeline.
> > >
> > > The next step would be to backport changes from branch-3.1 (From where
> > > Hive-3.1.3 release was made) to branch-3. This would ensure that we do
> > not
> > > miss any specific ticket which went into Hive-3.1.3. I will take care
> of
> > > this. We can parallelly start pushing additional changes on branch-3.
> > There
> > > are approximately 25 tickets that need to be backported in this effort
> > (Of
> > > backporting changes from branch-3.1). I have made a note here<
> > >
> >
> https://docs.google.com/spreadsheets/d/1K0U-vxLRZEs13oBzYBlVyK8dMMNthgXL5VEgzLRbeKs/edit?usp=sharing
> > > >
> > >
> > > Again, thanks a lot to everyone who supported and participated in this
> > > effort. Lets make this 3.2.0 Hive release happen!!
> > >
> > > Thanks,
> > > Aman.
> > >
> > > 
> > > From: Aman Raj 
> > > Sent: Monday, March 20, 2023 9:21 AM
> > > To: dev@hive.apache.org 
> > > Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
> > >
> > > Hi Vihang/community,
> > >
> > > Found the ticket which broke mm_all.q. This issue comes because of
> > > HIVE-20182. Works in my local and on the Jenkins pipeline as well.
> Link :
> > >
> >
> https://github.com/apache/hive/pull/4127

[jira] [Created] (HIVE-27202) Disable flaky test TestJdbcWithMiniLlapRow#testComplexQuery

2023-03-31 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27202:
--

 Summary: Disable flaky test 
TestJdbcWithMiniLlapRow#testComplexQuery
 Key: HIVE-27202
 URL: https://issues.apache.org/jira/browse/HIVE-27202
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


TestJdbcWithMiniLlapRow#testComplexQuery is flaky and should be disabled.

 

http://ci.hive.apache.org/job/hive-flaky-check/634/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-03-28 Thread vihang karajgaonkar
Thanks a lot Aman for all your efforts on this. Really appreciate the
initiative and all your hard work on this.

I would like to request that all committers follow the merge process of the
master branch when merging PRs into branch-3. If there are any test
failures which seem unrelated, please do not ignore them. One can run the flaky
test runner <http://ci.hive.apache.org/job/hive-flaky-check/> to make sure
that the test is indeed flaky. If the test is found to be flaky, a
ticket should be created to disable it. A separate ticket should be created
to deflake it, and you can mention the original author or the previous commit
author who changed the test on that ticket to get help, since they likely
have the most context around that test. Once the flaky test is disabled and
we have a green CI job run, we should merge the PR. If others have any
suggestions to improve this process, please chime in.

Thanks,
Vihang

On Tue, Mar 28, 2023 at 10:55 PM Aman Raj 
wrote:

> Hi community,
>
> This is to notify that we have a green branch-3 now. The entire effort of
> fixing branch-3 test cases took around 4 months and as a team we managed to
> fix 2900+ test failures on branch-3. The entire effort can be tracked here
> HIVE-26836<https://issues.apache.org/jira/browse/HIVE-26836>. We are
> ready to push new features and improvements on branch-3 now.
>
> I really want to thank Vihang Karajgaonkar, Chris Nauroth, Laszlo Bodor,
> Stamatis Zampetakis and Sankar Hariappan without whom this would not at all
> have been possible. As a team we stuck together and participated in reviews
> and actively suggested improvements which really helped in fixing some
> major test failures.
>
> I would sincerely request that going further it should be made a point to
> merge things into branch-3 only if we have a green Jenkins pipeline.
>
> The next step would be to backport changes from branch-3.1 (From where
> Hive-3.1.3 release was made) to branch-3. This would ensure that we do not
> miss any specific ticket which went into Hive-3.1.3. I will take care of
> this. We can start pushing additional changes to branch-3 in parallel. There
> are approximately 25 tickets that need to be backported in this effort (Of
> backporting changes from branch-3.1). I have made a note here<
> https://docs.google.com/spreadsheets/d/1K0U-vxLRZEs13oBzYBlVyK8dMMNthgXL5VEgzLRbeKs/edit?usp=sharing
> >
>
> Again, thanks a lot to everyone who supported and participated in this
> effort. Let's make this 3.2.0 Hive release happen!!
>
> Thanks,
> Aman.
>
> 
> From: Aman Raj 
> Sent: Monday, March 20, 2023 9:21 AM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Hi Vihang/community,
>
> Found the ticket which broke mm_all.q. This issue comes from
> HIVE-20182. The linked PR works on my local setup and on the Jenkins
> pipeline as well. Link: <https://github.com/apache/hive/pull/4127>.
> Reverting this commit for now.
>
> Thanks,
> Aman.
> 
> From: Aman Raj 
> Sent: Monday, March 20, 2023 8:28 AM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Sure Vihang, will look at the other ones. You can pick this up.
>
> Thanks,
> Aman.
>
> 
> From: vihang karajgaonkar 
> Sent: Monday, March 20, 2023 7:58:48 AM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> I think we should revert offending commits first to unblock the branch. We
> can create followup tickets to determine if these fixes are blockers for
> 3.2 release and if yes, we should merge them the right way with a green
> test run. Fixing forward always comes with the risk that it introduces new
> test failures.
>
> Thanks for all your efforts on this Aman.
>
> I can take a look at testBootstrapReplLoadRetryAfterFailureF

[jira] [Created] (HIVE-27175) Fix TestJdbcDriver2#testSelectExecAsync2

2023-03-25 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27175:
--

 Summary: Fix TestJdbcDriver2#testSelectExecAsync2
 Key: HIVE-27175
 URL: https://issues.apache.org/jira/browse/HIVE-27175
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


TestJdbcDriver2#testSelectExecAsync2 is failing on branch-3. We need to 
backport HIVE-20897 to fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27171) Backport HIVE-20680 to branch-3

2023-03-24 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27171:
--

 Summary: Backport HIVE-20680 to branch-3
 Key: HIVE-27171
 URL: https://issues.apache.org/jira/browse/HIVE-27171
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


We need to backport HIVE-20680 to fix 
TestReplicationScenariosAcrossInstances on branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27154) Fix testBootstrapReplLoadRetryAfterFailureForPartitions

2023-03-19 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27154:
--

 Summary: Fix testBootstrapReplLoadRetryAfterFailureForPartitions
 Key: HIVE-27154
 URL: https://issues.apache.org/jira/browse/HIVE-27154
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


`testBootstrapReplLoadRetryAfterFailureForPartitions` has been failing on 
branch-3

 

http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4067/12/tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-03-19 Thread vihang karajgaonkar
I think we should revert offending commits first to unblock the branch. We
can create followup tickets to determine if these fixes are blockers for
3.2 release and if yes, we should merge them the right way with a green
test run. Fixing forward always comes with the risk that it introduces new
test failures.

Thanks for all your efforts on this Aman.

I can take a look at testBootstrapReplLoadRetryAfterFailureForPartitions if
you haven’t already started on it.

Thanks,
Vihang

On Sun, Mar 19, 2023 at 10:09 PM Aman Raj 
wrote:

> Hi Vihang/community,
>
> Thanks a lot Vihang for working on the major test failure. This blocked
> more than 35 test cases. Now we are down to the final 4 failures. I have
> analyzed some of them and here they are (link:
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4067/12/tests):
>
>   1. multi_in_clause - This was committed in HIVE-21685 without validating
>      the scenario. It fails because Hive is not able to parse
>        explain cbo
>        select * from very_simple_table_for_in_test where name IN('g','r') AND
>        name IN('a','b')
>      I am able to make this work in my local setup. We have 2 options:
>      a. Either revert HIVE-21685, since this scenario was not validated
>         before this test was added.
>      b. The fix is present in
>         https://issues.apache.org/jira/browse/HIVE-20718, but to cherry-pick
>         it we also need to cherry-pick
>         https://issues.apache.org/jira/browse/HIVE-17040, since HIVE-20718
>         has a lot of merge conflicts with HIVE-17040. After cherry-picking
>         these we have other failures to fix.
>   2. current_date_timestamp.q - This breaking change was committed in
>      HIVE-21388 without validation. The failure is again because Hive is
>      not able to parse
>        explain cbo select current_timestamp() from alltypesorc
>      The solution or revert option is the same as in point 1.
>   3. testBootstrapReplLoadRetryAfterFailureForPartitions() - I have not
>      investigated this yet.
>   4. mm_all.q - I have not investigated this yet.
> Thanks,
> Aman.
> 
> From: vihang karajgaonkar 
> Sent: Friday, March 17, 2023 8:42 PM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Just wanted to close the loop on the TestMiniSparkOnYarnCliDriver test
> failures. We will be able to re-enable most of them back on branch-3. The
> ones which were disabled are being tracked separately in a different ticket
> <https://issues.apache.org/jira/browse/HIVE-27146>
> but they don't look like
> a blocker.
>
> Hi Aman,
>
> Do you know how close we are to reopening branch-3?
>
> Thanks,
> Vihang
>
> On Sat, Mar 4, 2023 at 7:23 PM Aman Raj 
> wrote:
>
> > Or you can cd into itests and run the command you are using. Just another
> > way I run.
> >
> > Thanks,
> > Aman.
> > 
> > From: Aman Raj 
> > Sent: Saturday, March 4, 2023 7:20:36 PM
> > To: dev@hive.apache.org 
> > Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
> >
> > Hi Vihang,
> >
> > Thanks a lot for working on this. Can you try using -Pqsplits,itests.
> > Also, I usually give a -o option after doing a clean install.
> >
> > Thanks,
> > Aman.
> >
> >
> > 
> > From: vihang karajgaonkar 
> > Sent: Saturday, 4 Marc

Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-03-17 Thread vihang karajgaonkar
Just wanted to close the loop on the TestMiniSparkOnYarnCliDriver test
failures. We will be able to re-enable most of them back on branch-3. The
ones which were disabled are being tracked separately in a different ticket
<https://issues.apache.org/jira/browse/HIVE-27146> but they don't look like
a blocker.

Hi Aman,

Do you know how close we are to reopening branch-3?

Thanks,
Vihang

On Sat, Mar 4, 2023 at 7:23 PM Aman Raj 
wrote:

> Or you can cd into itests and run the command you are using. Just another
> way I run.
>
> Thanks,
> Aman.
> 
> From: Aman Raj 
> Sent: Saturday, March 4, 2023 7:20:36 PM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Hi Vihang,
>
> Thanks a lot for working on this. Can you try using -Pqsplits,itests.
> Also, I usually give a -o option after doing a clean install.
>
> Thanks,
> Aman.
>
>
> 
> From: vihang karajgaonkar 
> Sent: Saturday, 4 March, 2023, 11:35
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
>
> Just to update on the HoS test failures for TestMiniSparkOnYarnCliDriver, I
> think I was finally able to resolve them (at least locally). I had to
> revert HIVE-21044 because it was causing OOM for those tests. Also, in
> order for these tests to work we will have to downgrade netty from
> 4.1.69.Final to 4.1.51.Final. I understand that we had upgraded netty from
> 4.1.17.Final to 4.1.69.Final for CVEs, but the highest netty version that we
> can support without breaking HoS is 4.1.51.Final. Note that 4.1.51.Final
> includes fixes for many of the CVEs which affected 4.1.17.Final, so we are
> still in a better place than branch-3.1. Unfortunately, there is no good way
> to make HoS work with a higher netty version, so I think we should downgrade
> the netty version to 4.1.51.Final for now and look at more options to upgrade
> it to 4.1.69.Final in a separate ticket.
>
> I still need to understand why the tests which are working for me locally
> don't work on the PR job. I tried running the split test classes using the
> following command. Is that the right way to simulate builds from the PR
> job? Let me know if anyone has more ideas.
>
> mvn test
> -Dtest=org.apache.hadoop.hive.cli.split2.TestMiniSparkOnYarnCliDriver
> -Pqsplits
>
> Thanks,
> Vihang
>
>
> On Fri, Feb 17, 2023 at 4:01 AM Stamatis Zampetakis 
> wrote:
>
> > Hello,
> >
> > Thanks Aman for bringing this up and also for cleaning up after others (I
> > saw that you raised tickets and PRs for addressing the failures).
> >
> > Many thanks to Vihang as well for helping out. Regarding flaky tests, yes
> > we should disable them as soon as we see them.
> > There have been some other discussions on how to approach flaky tests; the
> > most recent one I could find is here [1].
> >
> > Best,
> > Stamatis
> >
> > [1] https://lists.apache.org/thread/lv3bhlfoq8fwd9dwyjf7g4nx32wtrygv
> >
> > On Fri, Feb 17, 2023 at 4:37 AM Aman Raj 
> > wrote:
> >
> > > Hi team,
> > >
> > > Thanks Vihang for looking into this. I have commented on the JIRA you
> > > created.
> > >
> > > > Just to bring this to everyone's notice, I have seen that there have
> > > > been a couple of pushes to branch-3, which have led to 5 new test
> > > > failures. The test failures are in orc_merge1, orc_merge2, orc_merge3,
> > > > orc_merge4 and orc_merge10. These tests did not fail before. I would
> > > > sincerely urge the community to raise a PR against branch-3 so that
> > > > the Jenkins pipeline can run, and only then merge things to branch-3.
> > > > We had 2900+ failures when we started 2 months back, and having
> > > > brought that down to fewer than 15, these new failures have pushed us
> > > > back in this effort.
> > >
> > > I would like to thank everyone who has participated in this effort and
> > > made it possible till this stage. Also, if th

[jira] [Created] (HIVE-27148) Disable TestJdbcGenericUDTFGetSplits

2023-03-16 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27148:
--

 Summary: Disable TestJdbcGenericUDTFGetSplits
 Key: HIVE-27148
 URL: https://issues.apache.org/jira/browse/HIVE-27148
 Project: Hive
  Issue Type: Sub-task
  Components: Tests
Reporter: Vihang Karajgaonkar


TestJdbcGenericUDTFGetSplits is flaky and intermittently fails.

http://ci.hive.apache.org/job/hive-flaky-check/614/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27146) Re-enable orc_merge*.q tests for TestMiniSparkOnYarnCliDriver

2023-03-16 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27146:
--

 Summary: Re-enable orc_merge*.q tests for 
TestMiniSparkOnYarnCliDriver
 Key: HIVE-27146
 URL: https://issues.apache.org/jira/browse/HIVE-27146
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar


It was found that these tests fail with a q.out diff in the replication factor 
of the files. The tests only fail on the CI job, so it is possible that this is 
due to some test environment issue. The tests also fail on the 3.1.3 release.

E.g. orc_merge4.q fails with the error below. The other tests fail with the 
same difference in replication factor.
{code:java}
40c40
< -rw-r--r--   1 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###
---
> -rw-r--r--   3 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###
66c66
< -rw-r--r--   1 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###
---
> -rw-r--r--   3 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###
68c68
< -rw-r--r--   1 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###
---
> -rw-r--r--   3 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###
{code}
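
To make the diff above easier to read: in an HDFS-style file listing, the second column is the replication factor. The following is only a minimal sketch, not part of the Hive test framework; the helper names `replication_factor` and `differs_only_in_replication` are hypothetical. It checks that two masked listing lines differ in nothing but that column.

```python
import re

# Masked listing lines copied from the diff above ("<" side and ">" side).
LEFT = "-rw-r--r--   1 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###"
RIGHT = "-rw-r--r--   3 ### USER ### ### GROUP ###   2530 ### HDFS DATE ### hdfs://### HDFS PATH ###"

# permissions, replication factor ("-" for directories), then everything else
LISTING = re.compile(r"^(?P<perms>[-d][rwx-]{9})\s+(?P<replication>\d+|-)\s+(?P<rest>.*)$")


def replication_factor(line):
    """Return the replication column as an int, or None for a directory entry."""
    match = LISTING.match(line.strip())
    if match is None:
        raise ValueError("not a listing line: %r" % line)
    value = match.group("replication")
    return None if value == "-" else int(value)


def differs_only_in_replication(left, right):
    """True when two listing lines are identical except for the replication column."""
    l, r = LISTING.match(left.strip()), LISTING.match(right.strip())
    if l is None or r is None:
        return False
    return (l.group("perms") == r.group("perms")
            and l.group("rest") == r.group("rest")
            and l.group("replication") != r.group("replication"))


print(replication_factor(LEFT), replication_factor(RIGHT))
print(differs_only_in_replication(LEFT, RIGHT))
```

The sketch only confirms what kind of difference the diff shows; it does not say which side is the expected q.out and which is the CI output.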



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-03-03 Thread vihang karajgaonkar
Just to update on the HoS test failures for TestMiniSparkOnYarnCliDriver, I
think I was finally able to resolve them (at least locally). I had to
revert HIVE-21044 because it was causing OOM for those tests. Also, in
order for these tests to work we will have to downgrade netty from
4.1.69.Final to 4.1.51.Final. I understand that we had upgraded netty from
4.1.17.Final to 4.1.69.Final for CVEs, but the highest netty version that we
can support without breaking HoS is 4.1.51.Final. Note that 4.1.51.Final
includes fixes for many of the CVEs which affected 4.1.17.Final, so we are
still in a better place than branch-3.1. Unfortunately, there is no good way
to make HoS work with a higher netty version, so I think we should downgrade
the netty version to 4.1.51.Final for now and look at more options to upgrade
it to 4.1.69.Final in a separate ticket.
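
As a quick illustration of the version ordering discussed above, here is a minimal sketch. The helper `netty_version` and the constant names are hypothetical. It only compares the numeric parts of the three version strings mentioned in this message and says nothing about which CVEs each release actually fixes.

```python
def netty_version(version):
    """Parse a version such as '4.1.51.Final' into a comparable (4, 1, 51) tuple."""
    major, minor, patch, _qualifier = version.split(".")
    return (int(major), int(minor), int(patch))


UPGRADED_FROM = "4.1.17.Final"   # version netty was upgraded from, per this message
CURRENT = "4.1.69.Final"         # version after the CVE upgrade
PROPOSED = "4.1.51.Final"        # highest version reported to work with HoS

# The proposed downgrade target sits between the old and the current version.
print(netty_version(UPGRADED_FROM) < netty_version(PROPOSED) < netty_version(CURRENT))
```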

I still need to understand why the tests which are working for me locally
don't work on the PR job. I tried running the split test classes using the
following command. Is that the right way to simulate builds from the PR
job? Let me know if anyone has more ideas.

mvn test
-Dtest=org.apache.hadoop.hive.cli.split2.TestMiniSparkOnYarnCliDriver
-Pqsplits
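
For reference, the command above and the variants suggested elsewhere in this thread (adding the itests profile, and passing -o after a clean install) can be assembled as follows. This is only a sketch: the helper `split_test_command` is hypothetical, and whether any of these invocations matches what the PR job runs is exactly the open question here.

```python
import shlex


def split_test_command(test_class, profiles=("qsplits",), offline=False):
    """Build the Maven argv for running a single split test class."""
    argv = ["mvn", "test", "-Dtest=" + test_class, "-P" + ",".join(profiles)]
    if offline:
        argv.append("-o")  # Maven's offline flag, as suggested after a clean install
    return argv


TEST_CLASS = "org.apache.hadoop.hive.cli.split2.TestMiniSparkOnYarnCliDriver"

# The command as written in this message.
print(shlex.join(split_test_command(TEST_CLASS)))
# The variant with both profiles and offline mode.
print(shlex.join(split_test_command(TEST_CLASS, profiles=("qsplits", "itests"), offline=True)))
```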

Thanks,
Vihang


On Fri, Feb 17, 2023 at 4:01 AM Stamatis Zampetakis 
wrote:

> Hello,
>
> Thanks Aman for bringing this up and also for cleaning up after others (I
> saw that you raised tickets and PRs for addressing the failures).
>
> Many thanks to Vihang as well for helping out. Regarding flaky tests, yes
> we should disable them as soon as we see them.
> There have been some other discussions on how to approach flaky tests; the
> most recent one I could find is here [1].
>
> Best,
> Stamatis
>
> [1] https://lists.apache.org/thread/lv3bhlfoq8fwd9dwyjf7g4nx32wtrygv
>
> On Fri, Feb 17, 2023 at 4:37 AM Aman Raj 
> wrote:
>
> > Hi team,
> >
> > Thanks Vihang for looking into this. I have commented on the JIRA you
> > created.
> >
> > Just to bring this to everyone's notice, I have seen that there have been
> > a couple of pushes to branch-3, which have led to 5 new test failures.
> > The test failures are in orc_merge1, orc_merge2, orc_merge3, orc_merge4
> > and orc_merge10. These tests did not fail before. I would sincerely urge
> > the community to raise a PR against branch-3 so that the Jenkins pipeline
> > can run, and only then merge things to branch-3. We had 2900+ failures
> > when we started 2 months back, and having brought that down to fewer than
> > 15, these new failures have pushed us back in this effort.
> >
> > I would like to thank everyone who has participated in this effort and
> > made it possible till this stage. Also, if the contributors can take
> > ownership of these new test case failures and fix them, it will be of
> great
> > help.
> >
> > Thanks,
> > Aman.
> > 
> > From: vihang karajgaonkar 
> > Sent: Friday, February 17, 2023 6:10 AM
> > To: dev@hive.apache.org 
> > Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
> >
> >
> > Hi Aman,
> >
> > I created https://issues.apache.org/jira/browse/HIVE-27087 to look into
> > TestMiniSparkOnYarnCliDriver failures. I have a working theory of what
> > might be going on there. I am still investigating the right way to fix
> > it, though.
> >
> > Thanks,
> > Vihang
> >
> > On Fri, Feb 10, 2023 at 10:26 AM Aman Raj  >
> > wrote:
> >
> > > Hi Vihang,
> > >
> > > Yes the tests are failing locally as well with the same issue.
> > >
> > > Thanks,
> > > Aman.
> > >
> > > 
> > > From: Vihang Karajgaonkar 
> > &g

Re: [EXTERNAL] Re: Proposal to deprecate Hive on Spark from branch-3

2023-02-27 Thread vihang karajgaonkar
I think 3.2.0 is seen as a minor release, not a maintenance release. Eg.
3.2.1 would be a maintenance release which typically includes bug fixes,
minor usability improvements and security fixes. If we block new features
from going into minor releases that would be a step back in my opinion.
I am pretty sure we have released small features in minor releases before
(e.g 2.2.0 release notes
<https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12335837&styleName=Text&projectId=12310843>).
I would request not to club new features and breaking features together in
the same decision.

For breaking changes, I don't think we have published guidelines before so
I am open to that. Especially for the HoS tests which are failing, some of
the failures are related to dependency upgrades (done for security reasons)
while some have no clear root-cause yet. For a feature which has been
removed in master branch, I think it is okay to deprecate it and disable
the tests given that we have spent considerable time on them already and
none of the original HoS contributors are available to help.

Thanks,
Vihang

On Mon, Feb 27, 2023 at 1:24 PM Stamatis Zampetakis 
wrote:

> Some people raised a valid point that branch-3 is a maintenance branch. If
> we really aim 3.2.0 to be a maintenance release then we should minimize
> breaking changes and prohibit new features. In this case Spark cannot go
> away and the only thing we can do is deprecate it. It also means that we
> should fix the tests, because failures typically indicate breaking changes,
> which again are not tolerable for a maintenance release.
>
> On the other hand,
> I got the impression that some people were interested in getting new
> features in 3.2.0 (some may be in already). Furthermore, some dependency
> upgrades may also lead to breaking changes/different behavior so we should
> definitely agree on what is acceptable and what is not for branch-3.
>
> Summing up the question boils down to the following. Do we allow breaking
> changes and new features in branch-3 or not?
>
> Best,
> Stamatis
>
> On Fri, Feb 24, 2023, 10:41 AM Aman Raj 
> wrote:
>
> > Hi Laszlo,
> >
> > I am perfectly fine with disabling the Hive on Spark tests. In fact, I
> > prefer that. I agree with Vihang and you on this. I had proposed this
> > idea (of disabling the test cases) long back, and we had then discussed
> > in the community that we either fix the Hive on Spark test cases or
> > remove Hive on Spark. Therefore, I initiated this thread on removing Hive
> > on Spark, since we have not been able to resolve the test cases over the
> > past couple of months.
> >
> > Thanks,
> > Aman.
> >
> > 
> > From: László Bodor 
> > Sent: Friday, February 24, 2023 2:57 PM
> > To: dev@hive.apache.org 
> > Subject: Re: [EXTERNAL] Re: Proposal to deprecate Hive on Spark from
> > branch-3
> >
> > +1 on Vihang's suggestion
> > I remember that Spark removal was a debated thing even on master, so
> > completely removing it backwards from a "maintenance" branch-3 line is not
> > really acceptable (actually, I'm surprised it's not -1ed yet by Hive on
> > Spark folks), but it depends on what *deprecation* really means: I mean
> > disabling some Spark tests to stabilize precommit is completely fine in the
> > absence of community aspiration to fix them properly
> >
> > regarding the motivation: "This would ensure that branch-3 is aligned with
> > the master as done in ..."  <-- I don't think we're targeting this, we are
> > about to make 3.x releases as simply as possible
> >
> > I'm hoping/assuming that most of the +1s so far are in line with Vihang's
> > suggestion
> >
> > On Thu, Feb 23, 2023 at 16:37, vihang karajgaonkar wrote:
> >
> > > +1 to deprecate Hive on Spark.
> > >
> > > > I feel directly removing it in a minor release is probably a bad idea.
> > > > Most users will upgrade to 3.2 first and go to 4.0 later. If we
> > > > deprecate it in 3.2, it transitions well into its removal as users
> > > > upgrade to 4.0 eventually.
> > > >
> > > > If the goal is to stabilize branch-3, we can disable the failing Hive on
> > > > Spark tests.
> > >
> > > Thanks,
> > > Vihang
> > >
> > > On Thu, Feb 23, 2023 at 12:32 AM Alessandro Solimando <
> > > alessandro.solima...@gmail.com> wrote:
> > >
> > > > +1 from me too
> > > >
> > > > On Thu, 23 Feb 2023 at 06:09, Ayush Saxena 
> wrote:
> > > >
> > > > >

Re: [EXTERNAL] Re: Proposal to deprecate Hive on Spark from branch-3

2023-02-23 Thread vihang karajgaonkar
+1 to deprecate Hive on Spark.

I feel directly removing it in a minor release is probably a bad idea. Most
users will upgrade to 3.2 first and go to 4.0 later. If we deprecate it in
3.2 it transitions well into its removal as users upgrade to 4.0 eventually.

If the goal is to stabilize branch-3, we can disable the failing Hive on
Spark tests.

Thanks,
Vihang

On Thu, Feb 23, 2023 at 12:32 AM Alessandro Solimando <
alessandro.solima...@gmail.com> wrote:

> +1 from me too
>
> On Thu, 23 Feb 2023 at 06:09, Ayush Saxena  wrote:
>
> > +1 on removing Hive on Spark from branch-3
> >
> > -Ayush
> >
> > > On 23-Feb-2023, at 6:40 AM, Wang, Yuming 
> > wrote:
> > >
> > > +1.
> > >
> > > From: Naresh P R 
> > > Date: Thursday, February 23, 2023 at 02:49
> > > To: dev@hive.apache.org 
> > > Subject: Re: [EXTERNAL] Re: Proposal to deprecate Hive on Spark from
> > branch-3
> > >
> > > +1 to remove Hive on Spark in branch-3
> > > ---
> > > Regards,
> > > Naresh P R
> > >
> > >> On Wed, Feb 22, 2023 at 5:37 AM Sankar Hariappan
> > >>  wrote:
> > >>
> > >> +1, to remove Hive on Spark in branch-3.
> > >>
> > >> Thanks,
> > >> Sankar
> > >>
> > >> -Original Message-
> > >> From: Rajesh Balamohan 
> > >> Sent: Wednesday, February 22, 2023 6:58 PM
> > >> To: dev@hive.apache.org
> > >> Subject: [EXTERNAL] Re: Proposal to deprecate Hive on Spark from
> > branch-3
> > >>
> > >> +1 on removing Hive on Spark in branch-3.
> > >>
> > >> It was not done earlier since it was removing a feature in the branch.
> > >> But if there is enough consensus, we should consider removing it.
> > >>
> > >> ~Rajesh.B
> > >>
> > >> On Wed, Feb 22, 2023 at 12:48 PM Aman Raj
>  > >
> > >> wrote:
> > >>
> > >>> Hi team,
> > >>>
> > >>> We have been trying to fix Hive on Spark test failures for a long
> > >>> time. As of now, branch-3 has fewer than 12 test failures (whose fixes
> > >>> have not been identified). 8 of them are related to Hive on Spark. I
> > >>> had mailed about the failures in my previous mail threads. Thanks to
> > >>> Vihang for working on them as well. But we have not been able to
> > >>> identify the root cause till now.
> > >>> These fixes can be tracked in the following tickets: [HIVE-27087]
> > >>> Fix TestMiniSparkOnYarnCliDriver test failures on branch-3
> > >>> (https://issues.apache.org/jira/browse/HIVE-27087) and [HIVE-26940]
> > >>> Backport of HIVE-19882 : Fix QTestUtil session lifecycle
> > >>> (https://issues.apache.org/jira/browse/HIVE-26940)
> > >>>
> > >>> Until we have a green branch-3, we cannot go ahead to push new
> > >>> features for the Hive-3.2.0 release. This is kind of a blocker for
> > >>> this release. Already bringing the test fixes to the current state
> > >>> took more than 2 months.
> > >>> I wanted to bring up a proposal to deprecate Hive on Spark from
> > >>> branch-3 altogether. This would ensure that branch-3 is aligned with
> > >>> the master as done in
> > >>>
> >
> https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissu%2F=05%7C01%7Cyumwang%40ebay.com%7C2bd54cc0c84a4e44a59e08db150574e5%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C638126885411646147%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=n51cF5fFuwSFFhX%2B0S828W3jYN3G3YwRwJWne1AMGtg%3D=0
> > >>> es.apache.org
> > >> 

[jira] [Created] (HIVE-27092) Reenable flaky test TestRpc

2023-02-17 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27092:
--

 Summary: Reenable flaky test TestRpc
 Key: HIVE-27092
 URL: https://issues.apache.org/jira/browse/HIVE-27092
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-02-16 Thread vihang karajgaonkar
Hi Aman,

I created https://issues.apache.org/jira/browse/HIVE-27087 to look into the
TestMiniSparkOnYarnCliDriver failures. I have a working theory of what
might be going on there. I am still investigating what the right way to
fix it is, though.

Thanks,
Vihang

On Fri, Feb 10, 2023 at 10:26 AM Aman Raj 
wrote:

> Hi Vihang,
>
> Yes the tests are failing locally as well with the same issue.
>
> Thanks,
> Aman.
>
> ____
> From: Vihang Karajgaonkar 
> Sent: Friday, February 10, 2023 11:22:15 PM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Branch-3 backports and build stability
>
> Thanks a lot Stamatis for starting this thread. I really appreciate all the
> efforts to stabilize branch-3 to get it to a releasable state and I agree
> that we should get it to a green state before opening it for PRs not
> related to test failures. I can help with the effort as well.
>
> If we want to get the branch back to a green state soon, have we considered
> disabling the tests which are clearly flaky (e.g. tests that pass on some
> builds and fail on others with no new code changes)? If we don't do that, we
> will keep playing whack-a-mole with those tests. I propose that we disable
> such tests and create tickets to unflake them separately. This will help us
> get back to a green state faster.
>
> Hi Aman,
> For the TestMiniSparkOnYarnCliDriver failures, you should probably also look
> into the Spark driver/application logs and see if there are infrastructure
> errors (e.g. OOMs). Are these tests failing when you run them locally?
>
> Thanks,
> Vihang
>
> On Tue, Feb 7, 2023 at 10:05 PM Aman Raj 
> wrote:
>
> > +1,
> > Thanks Stamatis and Laszlo for helping with the test case fixes so far.
> >
> > Team,
> > I need help fixing the following tests in Hive. I have tried different
> > approaches, but with no luck so far:
> > org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
> >
> > Issue :
> > PREHOOK: Input: default@src
> > PREHOOK: Output: default@src
> > Failed to monitor Job[-1] with exception
> > 'java.lang.IllegalStateException(Connection to remote Spark driver was
> > lost)' Last known state = SENT
> > Failed to execute spark task, with exception
> > 'java.lang.IllegalStateException(RPC channel is closed.)'
> > FAILED: Execution Error, return code 1 from
> > org.apache.hadoop.hive.ql.exec.spark.SparkTask. RPC channel is closed.
> >
> > History :
> > Initially the tests had failed with errors which I fixed in the following
> > task : https://issues.apache.org/jira/browse/HIVE-26940
> >
> > Does anyone know what the issue is here? There are 6-7 failures because
> > of this test case. Link to the failed test cases for the stacktrace:
> > http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3949/2/tests/
> > Thanks,
> > Aman.
> >
> > 
> > From: László Bodor 
> > Sent: Tuesday, February 7, 2023 4:46 PM
> > To: dev@hive.apache.org 
> > Subject: [EXTERNAL] Re: Branch-3 backports and build stability
> >
> > +1
> > also, if I merged something that I thought was for test stability (but
> > instead it was a feature), excuse me :)
> > for reference, the whole green test initiative is tracked under this
> > umbrella:
> > https://issues.apache.org/jira/browse/HIVE-26836

[jira] [Created] (HIVE-27087) Fix TestMiniSparkOnYarnCliDriver test failures on branch-3

2023-02-16 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27087:
--

 Summary: Fix TestMiniSparkOnYarnCliDriver test failures on branch-3
 Key: HIVE-27087
 URL: https://issues.apache.org/jira/browse/HIVE-27087
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The TestMiniSparkOnYarnCliDriver tests are failing with the error below:

[ERROR] 2023-02-16 14:13:08.991 [Driver] SparkContext - Error initializing SparkContext.
java.lang.RuntimeException: java.lang.NoSuchFieldException: DEFAULT_TINY_CACHE_SIZE
    at org.apache.spark.network.util.NettyUtils.getPrivateStaticField(NettyUtils.java:131) ~[spark-network-common_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.network.util.NettyUtils.createPooledByteBufAllocator(NettyUtils.java:118) ~[spark-network-common_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.network.server.TransportServer.init(TransportServer.java:94) ~[spark-network-common_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.network.server.TransportServer.<init>(TransportServer.java:73) ~[spark-network-common_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.network.TransportContext.createServer(TransportContext.java:114) ~[spark-network-common_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.rpc.netty.NettyRpcEnv.startServer(NettyRpcEnv.scala:119) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.rpc.netty.NettyRpcEnvFactory$$anonfun$4.apply(NettyRpcEnv.scala:465) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.rpc.netty.NettyRpcEnvFactory$$anonfun$4.apply(NettyRpcEnv.scala:464) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:2271) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160) ~[scala-library-2.11.8.jar:?]
    at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:2263) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.rpc.netty.NettyRpcEnvFactory.create(NettyRpcEnv.scala:469) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:57) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:249) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:175) ~[spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:256) [spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:423) [spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) [spark-core_2.11-2.3.0.jar:2.3.0]
    at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:161) [hive-exec-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
    at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:536) [hive-exec-3.2.0-SNAPSHOT.jar:3.2.0-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_322]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_322]


The root cause of the problem is that we upgraded the netty library from
4.1.17.Final to 4.1.69.Final. The upgraded library no longer has the
`DEFAULT_TINY_CACHE_SIZE` field (still present in 4.1.51.Final
[here|https://github.com/netty/netty/blob/netty-4.1.51.Final/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L46]),
which was removed in 4.1.52.Final.
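
To illustrate the failure mode, here is a minimal sketch (not Spark's or netty's
actual code) of reading a private static field by reflection, and of how the
lookup breaks once the field disappears from the target class. The classes
OldAllocator and NewAllocator, the helper names, and the field values are all
hypothetical stand-ins chosen for the example.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a library class that still declares the field.
class OldAllocator {
  private static final int DEFAULT_TINY_CACHE_SIZE = 512;   // illustrative value
  private static final int DEFAULT_SMALL_CACHE_SIZE = 256;  // illustrative value
}

// Hypothetical stand-in for the upgraded class where the field was removed.
class NewAllocator {
  private static final int DEFAULT_SMALL_CACHE_SIZE = 256;  // illustrative value
}

class ReflectiveFieldProbe {
  // Reads a private static int field by name. A missing field surfaces as a
  // RuntimeException caused by NoSuchFieldException, the same shape as the
  // first line of the stack trace above.
  static int readStaticInt(Class<?> owner, String name) {
    try {
      Field f = owner.getDeclaredField(name);
      f.setAccessible(true);
      return f.getInt(null); // null receiver because the field is static
    } catch (NoSuchFieldException | IllegalAccessException e) {
      throw new RuntimeException(e);
    }
  }

  // Defensive variant (an assumption, not what the failing code does):
  // return a fallback when the field no longer exists.
  static int readStaticIntOrDefault(Class<?> owner, String name, int fallback) {
    try {
      return readStaticInt(owner, name);
    } catch (RuntimeException e) {
      if (e.getCause() instanceof NoSuchFieldException) {
        return fallback;
      }
      throw e;
    }
  }
}
```

Code that depends on private fields of another library in this way is tied to
that library's exact version, which is why a dependency upgrade alone was
enough to break SparkContext initialization here.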



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [EXTERNAL] Re: Branch-3 backports and build stability

2023-02-10 Thread Vihang Karajgaonkar
Thanks a lot Stamatis for starting this thread. I really appreciate all the
efforts to stabilize branch-3 to get it to a releasable state and I agree
that we should get it to a green state before opening it for PRs not
related to test failures. I can help with the effort as well.

If we want to get the branch back to a green state soon, have we considered
disabling the tests which are clearly flaky (e.g. tests that pass on some
builds and fail on others with no new code changes)? If we don't do that, we
will keep playing whack-a-mole with those tests. I propose that we disable
such tests and create tickets to unflake them separately. This will help us
get back to a green state faster.

Hi Aman,
For the TestMiniSparkOnYarnCliDriver failures, you should probably also look
into the Spark driver/application logs and see if there are infrastructure
errors (e.g. OOMs). Are these tests failing when you run them locally?

Thanks,
Vihang

On Tue, Feb 7, 2023 at 10:05 PM Aman Raj 
wrote:

> +1,
> Thanks Stamatis and Laszlo for helping with the test case fixes so far.
>
> Team,
> I need help fixing the following tests in Hive. I have tried different
> approaches, but with no luck so far:
> org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
>
> Issue :
> PREHOOK: Input: default@src
> PREHOOK: Output: default@src
> Failed to monitor Job[-1] with exception
> 'java.lang.IllegalStateException(Connection to remote Spark driver was
> lost)' Last known state = SENT
> Failed to execute spark task, with exception
> 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. RPC channel is closed.
>
> History :
> Initially the tests had failed with errors which I fixed in the following
> task : https://issues.apache.org/jira/browse/HIVE-26940
>
> Does anyone know what the issue is here? There are 6-7 failures because
> of this test case. Link to the failed test cases for the stacktrace:
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3949/2/tests/
> Thanks,
> Aman.
>
> 
> From: László Bodor 
> Sent: Tuesday, February 7, 2023 4:46 PM
> To: dev@hive.apache.org 
> Subject: [EXTERNAL] Re: Branch-3 backports and build stability
>
> +1
> also, if I merged something that I thought was for test stability (but
> instead it was a feature), excuse me :)
> for reference, the whole green test initiative is tracked under this
> umbrella:
> https://issues.apache.org/jira/browse/HIVE-26836
>
> On Tue, Feb 7, 2023 at 12:09, Stamatis Zampetakis wrote:
>
> > Hi all,
> >
> > The build in branch-3 is not yet green; there are ~25 test failures. It is
> > a common practice that we shouldn't push changes on top of a broken build
> > unless they are addressing test failures.
> >
> > Some people (mainly Aman Raj, Chris Nauroth, and Laszlo Bodor) have been
> > working hard to stabilize the build for quite some time now. If you want to
> > help out then start by reviewing, merging, and fixing things around test
> > failures.
> >
> > It's not yet the time to bring new features, upgrades, bugs, etc., in
> > branch-3. I would encourage committers to not approve such changes till we
> > get back to a stable branch.
> >
> > Best,
> > Stamatis
> >
>


[jira] [Created] (HIVE-27062) Disable flaky test TestRpc#testClientTimeout

2023-02-09 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27062:
--

 Summary: Disable flaky test TestRpc#testClientTimeout
 Key: HIVE-27062
 URL: https://issues.apache.org/jira/browse/HIVE-27062
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


TestRpc#testClientTimeout is flaky in branch-3. I don't see this test in the 
main branch, so I think we should disable this test to make sure branch-3 is 
green.

Failing run: 
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3989/6/tests/

Fails with the stack trace:

java.lang.AssertionError: Client should have failed to connect to server.
    at org.junit.Assert.fail(Assert.java:88)
    at org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout(TestRpc.java:308)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)

Passing run (on same commit):
http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3989/5/tests/

In my opinion the test is not deterministic because it makes some timing 
assumptions IIUC.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27009) Support pluggable user token provider HiveMetastoreClient

2023-01-31 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27009:
--

 Summary: Support pluggable user token provider HiveMetastoreClient
 Key: HIVE-27009
 URL: https://issues.apache.org/jira/browse/HIVE-27009
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar


In HTTP mode, the HiveMetastoreClient can add a token based on the environment
variable HMS_JWT. However, this approach is not very flexible because
environment variables cannot be changed once set without restarting the JVM. It
would be good to have a pluggable interface called
HiveMetastoreUserTokenProvider which can provide a token specific to the actual
user session which is being used to connect to the HiveMetastore. If the user
token provider is not available, we can fall back to using the environment
variable HMS_JWT to keep backwards compatibility.
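
A rough sketch of what such a provider and the fallback could look like is
below. Only the interface name HiveMetastoreUserTokenProvider and the HMS_JWT
variable come from the description above; the getToken signature, the
TokenResolver helper, and the injected environment lookup are assumptions made
for illustration, not an existing Hive API.

```java
import java.util.Optional;
import java.util.function.Function;

// Interface name is from the ticket; the method signature is hypothetical.
interface HiveMetastoreUserTokenProvider {
  // Returns the token for the given user session, if one is available.
  Optional<String> getToken(String userName);
}

// Hypothetical helper showing the proposed lookup order.
class TokenResolver {
  static final String ENV_VAR = "HMS_JWT";

  // envLookup is passed in (e.g. System::getenv) so that the fallback path
  // can be exercised without changing the real process environment.
  static Optional<String> resolve(HiveMetastoreUserTokenProvider provider,
                                  String userName,
                                  Function<String, String> envLookup) {
    if (provider != null) {
      // A configured provider takes precedence over the environment variable.
      return provider.getToken(userName);
    }
    // No provider configured: keep the existing HMS_JWT behavior.
    return Optional.ofNullable(envLookup.apply(ENV_VAR));
  }
}
```

In a real client the call would presumably look like
TokenResolver.resolve(configuredProvider, user, System::getenv), with the
provider loaded from configuration.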



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27001) Backport HIVE-26633 to branch-3

2023-01-30 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-27001:
--

 Summary: Backport HIVE-26633 to branch-3
 Key: HIVE-27001
 URL: https://issues.apache.org/jira/browse/HIVE-27001
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Vihang Karajgaonkar


HIVE-26633 fixes the maximum response size in metastore http mode. We should 
backport this to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26949) Backport HIVE-26071 to branch-3

2023-01-16 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-26949:
--

 Summary: Backport HIVE-26071 to branch-3
 Key: HIVE-26949
 URL: https://issues.apache.org/jira/browse/HIVE-26949
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Creating this ticket to backport HIVE-26071 to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26948) Backport HIVE-21456 to branch-3

2023-01-16 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-26948:
--

 Summary: Backport HIVE-21456 to branch-3
 Key: HIVE-26948
 URL: https://issues.apache.org/jira/browse/HIVE-26948
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, Standalone Metastore
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-21456 adds support to connect to Hive metastore over http transport. This 
is a very useful feature especially in cloud based environments. Creating this 
ticket to backport it to branch-3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-25796) Allow metastore clients to fetch remaining events if some of the events are cleaned up

2021-12-10 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-25796:
--

 Summary: Allow metastore clients to fetch remaining events if some 
of the events are cleaned up
 Key: HIVE-25796
 URL: https://issues.apache.org/jira/browse/HIVE-25796
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


This is the code snippet from HiveMetastoreClient.java's getNextNotification 
method

{noformat}
  for (NotificationEvent e : rsp.getEvents()) {
    LOG.debug("Got event with id : {}", e.getEventId());
    if (e.getEventId() != nextEventId) {
      if (e.getEventId() == prevEventId) {
        LOG.error("NOTIFICATION_LOG table has multiple events with the same event Id {}. " +
            "Something went wrong when inserting notification events.  Bootstrap the system " +
            "again to get back teh consistent replicated state.", prevEventId);
        throw new IllegalStateException(REPL_EVENTS_WITH_DUPLICATE_ID_IN_METASTORE);
      } else {
        LOG.error("Requested events are found missing in NOTIFICATION_LOG table. Expected: {}, Actual: {}. "
            + "Probably, cleaner would've cleaned it up. "
            + "Try setting higher value for hive.metastore.event.db.listener.timetolive. "
            + "Also, bootstrap the system again to get back the consistent replicated state.",
            nextEventId, e.getEventId());
        throw new IllegalStateException(REPL_EVENTS_MISSING_IN_METASTORE);
      }
    }
{noformat}

Consider a client which caches an event id and, after a long time, tries to
fetch the events that follow it. In this case, it is possible that Metastore
has already cleaned up those events because they were more than 24 hrs old, and
this API then throws an exception. The client may not care whether the events
are in sequence, so this exception should be thrown optionally, depending on
what the client wants.
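
One way to picture the proposal is the simplified, standalone sketch below. It
works on plain event ids instead of NotificationEvent objects, and the
allowGaps flag and the EventSequenceChecker class are hypothetical names used
only for this example; they are not existing Hive options or classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the sequence check in getNextNotification.
class EventSequenceChecker {
  // Returns the event ids to hand to the client, starting after lastSeenEventId.
  // Duplicate ids always fail; missing ids fail only when allowGaps is false.
  static List<Long> filter(List<Long> eventIds, long lastSeenEventId, boolean allowGaps) {
    List<Long> result = new ArrayList<>();
    long prev = lastSeenEventId;
    for (long id : eventIds) {
      if (id == prev) {
        throw new IllegalStateException("Duplicate event id " + id);
      }
      long expected = prev + 1;
      if (id != expected && !allowGaps) {
        throw new IllegalStateException(
            "Missing events. Expected: " + expected + ", Actual: " + id);
      }
      result.add(id);
      prev = id;
    }
    return result;
  }
}
```

With allowGaps set to false this behaves like the snippet above; a client that
only wants whatever events remain would pass true and receive them without the
exception.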



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25479) Browser SSO auth may fail intermittently on chrome browser in virtual environments

2021-08-24 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-25479:
--

 Summary: Browser SSO auth may fail intermittently on chrome 
browser in virtual environments
 Key: HIVE-25479
 URL: https://issues.apache.org/jira/browse/HIVE-25479
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


When browser-based SSO is enabled, the Hive JDBC driver might miss the POST
requests coming from the browser which provide the one-time token issued by
HS2 after the SAML flow completes. The issue was observed mostly in virtual
environments on Windows.

The issue seems to be that when the driver binds to a port, even though the
port is in the LISTEN state, the result is non-deterministic if the browser
issues a POST request on the port before the driver starts accepting
connections. On native OSes we observed that the connection is buffered and is
received by the driver when it begins accepting connections. In VMs we observed
that, even though the connection is buffered and presented when the driver
starts accepting, the payload of the request or the connection itself is lost.
This race condition causes the driver to wait for the browser until it times
out, while the browser keeps waiting for a response from the driver.
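
The ordering involved can be sketched with plain java.net sockets. This is only
an illustration of the "connect and send before accept()" sequence on a
loopback port; it is not the JDBC driver's code, the class and method names are
made up for the example, and it does not reproduce the VM-specific loss
described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the client connects and writes while the server socket is
// only listening; the server calls accept() afterwards and reads the payload.
class ListenThenAcceptDemo {
  static String roundTrip(String payload) throws IOException {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    // Port 0 asks the OS for a free port; the socket is listening from here on.
    try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
      server.setSoTimeout(5000); // do not block forever in accept()
      try (Socket client = new Socket(loopback, server.getLocalPort())) {
        OutputStream out = client.getOutputStream();
        out.write(payload.getBytes(StandardCharsets.UTF_8));
        out.flush();
        client.shutdownOutput(); // marks the end of the payload
        // Only now does the "driver" side start accepting.
        try (Socket accepted = server.accept()) {
          accepted.setSoTimeout(5000);
          InputStream in = accepted.getInputStream();
          ByteArrayOutputStream buf = new ByteArrayOutputStream();
          byte[] chunk = new byte[1024];
          int n;
          while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
          }
          return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
      }
    }
  }
}
```

On a native OS the early connection normally waits in the listen backlog and
the payload is delivered once accept() is called, which matches the
"buffered" behavior described above. A driver that wants to avoid depending on
this would make sure it is already accepting before the browser is pointed at
the port.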



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25281) Add optional fields to enable returning filemetadata for tables and partitions

2021-06-23 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-25281:
--

 Summary: Add optional fields to enable returning filemetadata for 
tables and partitions
 Key: HIVE-25281
 URL: https://issues.apache.org/jira/browse/HIVE-25281
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The hive_metastore.thrift interface defines the fields for Table and Partition
objects. Certain SQL engines like Impala use Table and Partition from the HMS
and then augment them to include additional metadata useful for the engine
itself, e.g. file metadata. It would be good to add support for such fields in
the thrift definition itself. These fields will be optional for now, so that
HMS itself doesn't need to support them yet, but they can be supported in the
future depending on which SQL engine is talking to HMS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24987) hive.metastore.disallow.incompatible.col.type.changes is too restrictive for some storage formats

2021-04-07 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24987:
--

 Summary: hive.metastore.disallow.incompatible.col.type.changes is 
too restrictive for some storage formats
 Key: HIVE-24987
 URL: https://issues.apache.org/jira/browse/HIVE-24987
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar


Currently, when {{hive.metastore.disallow.incompatible.col.type.changes}} is set
to true, it disallows any schema changes which are deemed backwards
incompatible, e.g. dropping a column of a table. While this may be the correct
thing to do for Parquet or ORC tables, it is too restrictive for storage
formats like Kudu.

Currently, for Kudu tables, Impala supports dropping a column. But if we set
this config to true, metastore disallows changing the schema of the metastore
table. I assume this would be problematic for Iceberg tables too, since Iceberg
supports such schema changes.

The proposal is to have a new configuration which provides an exclusion list of
table file formats for which this check will be skipped. Initially, we will
only include Kudu tables in the list.
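
As a rough illustration of the proposed behavior, the sketch below parses a
comma-separated exclusion list and decides whether the incompatible-change
check applies to a table. The class, the method names, and the format strings
are hypothetical; the ticket does not name the new configuration key, so none
is assumed here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper; not an existing metastore class.
class IncompatibleColTypeCheck {
  // Parses a comma-separated list such as "kudu" into a normalized set.
  static Set<String> parseExclusions(String configValue) {
    if (configValue == null || configValue.trim().isEmpty()) {
      return Collections.emptySet();
    }
    return Arrays.stream(configValue.split(","))
        .map(s -> s.trim().toLowerCase(Locale.ROOT))
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toSet());
  }

  // The check is enforced only when the existing flag is on and the table's
  // format is not in the exclusion list.
  static boolean shouldEnforce(boolean disallowIncompatibleChanges,
                               String tableFormat,
                               Set<String> exclusions) {
    if (!disallowIncompatibleChanges) {
      return false;
    }
    return tableFormat == null
        || !exclusions.contains(tableFormat.toLowerCase(Locale.ROOT));
  }
}
```

With an exclusion list of "kudu", a Kudu table would skip the check while
Parquet and ORC tables would keep the current behavior.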



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24899) create database event does not include managedLocation URI

2021-03-17 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24899:
--

 Summary: create database event does not include managedLocation URI
 Key: HIVE-24899
 URL: https://issues.apache.org/jira/browse/HIVE-24899
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


I noticed that when a database is created, the notification event generated by
Metastore for the database doesn't have the managed location set. If I do a
getDatabase call later, metastore returns the managedLocationUri. This seems
like an inconsistency, and it would be good if the generated event included the
managedLocationUri as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Contributions from dataproc-metastore

2021-02-08 Thread Vihang Karajgaonkar
+1 to having this in Hive repo. Or easier, we can just add a section in the
README.md at the top level.

On Fri, Feb 5, 2021 at 1:54 AM Stamatis Zampetakis 
wrote:

> I think https://cwiki.apache.org/confluence/display/Hive/HowToContribute
> is also relevant for this kind of info.
>
> How about moving these guidelines to the Hive repo as contributing.md so
> that they are visible to everybody creating/merging pull requests?
>
> Best,
> Stamatis
>
> On Fri, Feb 5, 2021 at 10:40 AM Zoltan Haindrich  wrote:
>
> > Hey All!
> >
> > Thank you Stamatis for providing those pointers - I also had in mind the
> > icla stuff and that it will really become a challenge to identify who is
> > the real contributor :D
> >
> > Cameron: thank you for your understanding - I'm happy that you and Zhou
> > are contributing to the project! But it's important to be able to
> identify
> > the individual for the
> > contributions they make.
> >
> > Vihang: Its great to know about that they are contributing those
> > improvements, I haven't seen this doc before! I totally agree that we
> > should improve on our documentation -
> > I've just taken a look and not sure where it should be extended - I'll
> > keep looking :)
> >
> > cheers,
> > Zoltan
> >
> >
> > On 2/5/21 12:31 AM, Stamatis Zampetakis wrote:
> > > Apache requires signing an ICLA [1] for committers and clear intention
> of
> > > contributing from contributors [2].
> > >  From the above, I would say that it is important to know who
> > (individual)
> > > is the one contributing the code and Zoltan did well to raise awareness
> > > around this topic.
> > > Of course, not everyone is familiar with these processes so as Vihang
> > > pointed out it would be good to improve the documentation and point
> > people
> > > to that when necessary.
> > >
> > > Best,
> > > Stamatis
> > >
> > > [1] https://www.apache.org/licenses/icla.pdf
> > > [2] https://apetro.ghost.io/apache-contributors-no-cla/
> > >
> > > On Thu, Feb 4, 2021 at 9:12 PM Vihang Karajgaonkar <
> vihan...@apache.org>
> > > wrote:
> > >
> > >> Thanks Zoltan for your email.
> > >>
> > >> Just to give some context, dataproc-metastore is Google's metastore
> > >> compatible cloud service. The good news is that they are happy and
> > willing
> > >> to contribute any improvements/fixes to Apache Hive (metastore
> > >> specifically) instead of forking out the repository.
> > >> They also contributed their proposed changes here:
> > >>
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158869886
> > >>
> > >> I think it makes sense to have individual users contribute the PR so
> > that
> > >> we can attribute the patch accordingly. When I merged their PR I asked
> > them
> > >> offline who is the end user for this PR and they mentioned they are
> > still
> > >> figuring out who is going to be the point of contact for the
> open-source
> > >> contributions. While merging the PR, github suggested the author name
> > and I
> > >> used that.
> > >>
> > >>> I was a bit angry because of the above; so I've closed it.
> > >>>
> > >> I feel this is a bit against the spirit of open-source hive and it
> > would be
> > >> great to have a wiki page for commit guidelines and ask them to refer
> to
> > >> it. The only wiki that I find about commit guidelines is
> > >> https://cwiki.apache.org/confluence/display/Hive/HowToCommit which
> > >> definitely needs an update.
> > >>
> > >> On Thu, Feb 4, 2021 at 1:02 AM Zoltan Haindrich  wrote:
> > >>
> > >>> Hey All!
> > >>>
> > >>> It seems to me that someone have opened a "dataproc-metastore"
> account
> > on
> > >>> github and is contributing to Hive thru that user.
> > >>> I personally don't like that the account is not a real person - it
> > looks
> > >>> more like a team or group inside Google.
> > >>>
> > >>> This account already has a commit which is very confusing:
> > >>> * the github account is https://github.com/dataproc-metastore
> > >>> * the jira is assigned to Cameron Moberg
> > >>> https://issues.apache.org/jira/browse/HIVE-24470
> > >>> * the actual commits in the PR were made by Zhou Fang
> > >>> https://github.com/coufon
> > >>> * the commit is attributed to "Zhou Fang" -
> > >>>
> > >>
> >
> https://github.com/apache/hive/commit/b0309b7f023d9785c3a842d70d0fc471252101bf
> > >>> * the jira is still open...but that's not really relevant - that can
> be
> > >>> fixed in no time :D
> > >>>
> > >>> I think we should stop merging PRs from sources like this (or is it
> too
> > >>> much to ask that the user should have a matching github account)?
> > >>>
> > >>> This "dataproc-metastore" user had one more PR open - I was a bit
> angry
> > >>> because of the above; so I've closed it.
> > >>>
> > >>> Let me know what you think!
> > >>>
> > >>> cheers,
> > >>> Zoltan
> > >>>
> > >>
> > >
> >
>


Re: Contributions from dataproc-metastore

2021-02-04 Thread Vihang Karajgaonkar
Thanks Zoltan for your email.

Just to give some context, dataproc-metastore is Google's metastore
compatible cloud service. The good news is that they are happy and willing
to contribute any improvements/fixes to Apache Hive (metastore
specifically) instead of forking out the repository.
They also contributed their proposed changes here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158869886

I think it makes sense to have individual users contribute the PR so that
we can attribute the patch accordingly. When I merged their PR I asked them
offline who is the end user for this PR and they mentioned they are still
figuring out who is going to be the point of contact for the open-source
contributions. While merging the PR, github suggested the author name and I
used that.

> I was a bit angry because of the above; so I've closed it.
>
I feel this is a bit against the spirit of open-source hive and it would be
great to have a wiki page for commit guidelines and ask them to refer to
it. The only wiki that I find about commit guidelines is
https://cwiki.apache.org/confluence/display/Hive/HowToCommit which
definitely needs an update.

On Thu, Feb 4, 2021 at 1:02 AM Zoltan Haindrich  wrote:

> Hey All!
>
> It seems to me that someone have opened a "dataproc-metastore" account on
> github and is contributing to Hive thru that user.
> I personally don't like that the account is not a real person - it looks
> more like a team or group inside Google.
>
> This account already has a commit which is very confusing:
> * the github account is https://github.com/dataproc-metastore
> * the jira is assigned to Cameron Moberg
> https://issues.apache.org/jira/browse/HIVE-24470
> * the actual commits in the PR were made by Zhou Fang
> https://github.com/coufon
> * the commit is attributed to "Zhou Fang" -
> https://github.com/apache/hive/commit/b0309b7f023d9785c3a842d70d0fc471252101bf
> * the jira is still open...but that's not really relevant - that can be
> fixed in no time :D
>
> I think we should stop merging PRs from sources like this (or is it too
> much to ask that the user should have a matching github account)?
>
> This "dataproc-metastore" user had one more PR open - I was a bit angry
> because of the above; so I've closed it.
>
> Let me know what you think!
>
> cheers,
> Zoltan
>


[jira] [Created] (HIVE-24741) get_partitions_ps_with_auth performance can be improved when it is requesting all the partitions

2021-02-04 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24741:
--

 Summary: get_partitions_ps_with_auth performance can be improved 
when it is requesting all the partitions
 Key: HIVE-24741
 URL: https://issues.apache.org/jira/browse/HIVE-24741
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The {{get_partitions_ps_with_auth}} API does not support DirectSQL. I have seen 
some large production use cases where this API is used heavily (specifically by 
Spark applications) to request all the partitions of a table. The performance 
of this API when requesting all the partitions of a table can be significantly 
improved (~4 times on a large real-world workload) if we forward the call to a 
DirectSQL-enabled API. 





[jira] [Created] (HIVE-24732) CachedStore does not return the fields which are auto-generated by the database

2021-02-03 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24732:
--

 Summary: CachedStore does not return the fields which are 
auto-generated by the database
 Key: HIVE-24732
 URL: https://issues.apache.org/jira/browse/HIVE-24732
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


It looks like CachedStore directly caches the thrift objects as they are sent 
by the client. The general pattern seems to be similar to below:

{noformat}
  @Override
  public void createTable(Table tbl) throws InvalidObjectException, MetaException {
    rawStore.createTable(tbl);
    // in case of event based cache update, cache will be updated during commit.
    if (canUseEvents) {
      return;
    }
    String catName = normalizeIdentifier(tbl.getCatName());
    String dbName = normalizeIdentifier(tbl.getDbName());
    String tblName = normalizeIdentifier(tbl.getTableName());
    if (!shouldCacheTable(catName, dbName, tblName)) {
      return;
    }
    validateTableType(tbl);
    // TODO in case of CachedStore we cache directly the object sent by the client.
    // This is problematic since certain fields of the object are populated
    // after it is persisted. The cache will not be able to serve those fields correctly.
    sharedCache.addTableToCache(catName, dbName, tblName, tbl);
  }
{noformat}

The problem here is that the table id is generated only when the table is 
persisted in the database, so the CachedStore caches a Table object whose id is 
still 0.
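
The effect can be reproduced with a small self-contained model. Everything 
below is a hypothetical stand-in (none of these classes are the real Hive 
ones), and it assumes the backing store does not write the generated id back 
into the object passed by the caller. Reading the persisted object back before 
caching is shown as one possible remedy, not necessarily the fix that should be 
applied.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: simplified stand-ins, not the real CachedStore or RawStore. */
class CachedStoreIdSketch {

  /** Hypothetical stand-in for the thrift Table object. */
  static final class Table {
    final String name;
    long id; // stays 0 until the database assigns a value

    Table(String name) {
      this.name = name;
    }

    Table copy() {
      Table t = new Table(name);
      t.id = id;
      return t;
    }
  }

  /** Hypothetical backing store that assigns the id on persist. */
  static final class RawStore {
    private long nextId = 1;
    private final Map<String, Table> persisted = new HashMap<>();

    void createTable(Table tbl) {
      Table stored = tbl.copy(); // the caller's object is left untouched
      stored.id = nextId++;
      persisted.put(stored.name, stored);
    }

    Table getTable(String name) {
      Table stored = persisted.get(name);
      return stored == null ? null : stored.copy();
    }
  }

  static final class CachingStore {
    private final RawStore rawStore = new RawStore();
    private final Map<String, Table> cache = new HashMap<>();

    /** Pattern described above: caches the client's object, whose id is still 0. */
    void createTableCachingClientObject(Table tbl) {
      rawStore.createTable(tbl);
      cache.put(tbl.name, tbl);
    }

    /** Possible remedy: cache what the backing store returns after persisting. */
    void createTableCachingPersistedObject(Table tbl) {
      rawStore.createTable(tbl);
      cache.put(tbl.name, rawStore.getTable(tbl.name));
    }

    long cachedId(String name) {
      return cache.get(name).id;
    }
  }

  public static void main(String[] args) {
    CachingStore store = new CachingStore();
    store.createTableCachingClientObject(new Table("t1"));
    store.createTableCachingPersistedObject(new Table("t2"));
    System.out.println("t1 cached id: " + store.cachedId("t1"));
    System.out.println("t2 cached id: " + store.cachedId("t2"));
  }
}
```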





[jira] [Created] (HIVE-24619) Exclude unnecessary dependencies from pac4j

2021-01-11 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24619:
--

 Summary: Exclude unnecessary dependencies from pac4j
 Key: HIVE-24619
 URL: https://issues.apache.org/jira/browse/HIVE-24619
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-24543 introduces a pac4j dependency which pulls in multiple other 
dependencies. It would be great to exclude as many of them as possible. 
This JIRA tracks that effort.





[jira] [Created] (HIVE-24594) results_cache_invalidation2.q is flaky

2021-01-06 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24594:
--

 Summary: results_cache_invalidation2.q is flaky
 Key: HIVE-24594
 URL: https://issues.apache.org/jira/browse/HIVE-24594
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


results_cache_invalidation2.q failed for me a couple of times on an unrelated PR. 
Here is the error log.

{noformat}
---
Test set: org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver
---
Tests run: 90, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 450.54 s <<< 
FAILURE! - in org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver
org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2]
  Time elapsed: 15.087 s  <<< FAILURE!
java.lang.AssertionError:
Client Execution succeeded but contained differences (error code = 1) after 
executing results_cache_invalidation2.q
266a267
>  A masked pattern was here 
271a273
>  A masked pattern was here 
273c275,276
<   Stage-0 is a root stage
---
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
275a279,365
>   Stage: Stage-1
> Tez
>  A masked pattern was here 
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
>  A masked pattern was here 
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: tab1
>   filterExpr: key is not null (type: boolean)
>   Statistics: Num rows: 1500 Data size: 130500 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: key is not null (type: boolean)
> Statistics: Num rows: 1500 Data size: 130500 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Select Operator
>   expressions: key (type: string)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1500 Data size: 130500 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: string)
> null sort order: z
> sort order: +
> Map-reduce partition columns: _col0 (type: string)
> Statistics: Num rows: 1500 Data size: 130500 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map 4
> Map Operator Tree:
> TableScan
>   alias: tab2
>   filterExpr: key is not null (type: boolean)
>   Statistics: Num rows: 500 Data size: 43500 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Fil
{noformat}

The test works for me locally. In fact the same PR had a successful run of this 
test in a previous commit. I think we should disable this and re-enable it 
after fixing the flakiness.





[jira] [Created] (HIVE-24562) Deflake TestHivePrivilegeObjectOwnerNameAndType

2020-12-22 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24562:
--

 Summary: Deflake TestHivePrivilegeObjectOwnerNameAndType
 Key: HIVE-24562
 URL: https://issues.apache.org/jira/browse/HIVE-24562
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


One of my unrelated PRs fails this test 
{{TestHivePrivilegeObjectOwnerNameAndType}}. The exception which I see in the 
logs is below:

{noformat}
Caused by: ERROR 42X05: Table/View 'TXN_LOCK_TBL' does not exist.
at org.apache.derby.iapi.error.StandardException.newException(Unknown 
Source)
at org.apache.derby.iapi.error.StandardException.newException(Unknown 
Source)
at 
org.apache.derby.impl.sql.compile.LockTableNode.bindStatement(Unknown Source)
at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown Source)
at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source)
at 
org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown
 Source)
... 73 more
)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.openTxns(TxnHandler.java:651)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.open_txns(HiveMetaStore.java:8301)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy46.open_txns(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openTxnsIntr(HiveMetaStoreClient.java:3634)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openTxn(HiveMetaStoreClient.java:3595)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:218)
at com.sun.proxy.$Proxy47.openTxn(Unknown Source)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.openTxn(DbTxnManager.java:243)
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.openTxn(DbTxnManager.java:227)
at org.apache.hadoop.hive.ql.Compiler.openTransaction(Compiler.java:268)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:215)

at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:492)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:445)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:178)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:150)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:137)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHivePrivilegeObjectOwnerNameAndType.runCmd(TestHivePrivilegeObjectOwnerNameAndType.java:86)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHivePrivilegeObjectOwnerNameAndType.beforeTest(TestHivePrivilegeObjectOwnerNameAndType.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at 
org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43

[jira] [Created] (HIVE-24561) Deflake TestCachedStoreUpdateUsingEvents

2020-12-22 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24561:
--

 Summary: Deflake TestCachedStoreUpdateUsingEvents
 Key: HIVE-24561
 URL: https://issues.apache.org/jira/browse/HIVE-24561
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


TestCachedStoreUpdateUsingEvents seems to use "file:/tmp" as the table and 
database directory. The cleanUp method will clean all the sub-directories 
in /tmp, which can be error prone.

I also see a lot of NPEs from {{SharedCache#getMemorySizeEstimator}} 
because the {{sizeEstimators}} field is null. We should add a null check for 
that field.





[jira] [Created] (HIVE-24543) Support SAML 2.0 as an authentication mechanism

2020-12-15 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-24543:
--

 Summary: Support SAML 2.0 as an authentication mechanism
 Key: HIVE-24543
 URL: https://issues.apache.org/jira/browse/HIVE-24543
 Project: Hive
  Issue Type: New Feature
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


With cloud-based deployments, SAML 2.0 based authentication support in HS2 
would be very useful when working with federated or external identity providers 
like Okta, PingIdentity or Azure AD.

This authentication mechanism can initially be supported only in the HTTP 
transport mode of HiveServer2, since the SAML 2.0 protocol is primarily designed 
for web clients.





[jira] [Created] (HIVE-23971) Cleanup unreleased method signatures in IMetastoreClient

2020-07-31 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-23971:
--

 Summary: Cleanup unreleased method signatures in IMetastoreClient
 Key: HIVE-23971
 URL: https://issues.apache.org/jira/browse/HIVE-23971
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


There are many methods in IMetastoreClient which are simply wrappers around 
another method. The code has become very intertwined and needs some cleanup. 
For instance, I see the following variations of {{getPartitionsByNames}} in 
{{IMetastoreClient}} 

{noformat}

List<Partition> getPartitionsByNames(String db_name, String tbl_name, 
List<String> part_names, boolean getColStats, String engine)

List<Partition> getPartitionsByNames(String catName, String db_name, String 
tbl_name, List<String> part_names)

List<Partition> getPartitionsByNames(String catName, String db_name, String 
tbl_name, List<String> part_names, boolean getColStats, String engine)
{noformat}

The problem seems to be that every time a new field is added to the request 
object {{GetPartitionsByNamesRequest}}, a new variant is introduced in 
IMetastoreClient. Many of these methods are not released yet, and it would be 
good to clean them up by using the request object as the method argument instead 
of individual fields. Once we release, we will not be able to change the method 
signatures since we annotate IMetastoreClient as public API.
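
A rough sketch of the request-object style is shown below. The nested class is 
a hypothetical simplification written for this example; it is not the generated 
thrift {{GetPartitionsByNamesRequest}}, and the single method only formats its 
input instead of calling a metastore.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: a simplified request object, not the generated thrift class. */
class PartitionsRequestSketch {

  static final class PartitionsByNamesRequest {
    private final String dbName;
    private final String tblName;
    private final List<String> partNames;
    private String catName;      // optional
    private boolean getColStats; // optional, defaults to false
    private String engine;       // optional

    PartitionsByNamesRequest(String dbName, String tblName, List<String> partNames) {
      this.dbName = dbName;
      this.tblName = tblName;
      this.partNames = partNames;
    }

    PartitionsByNamesRequest catName(String value) {
      this.catName = value;
      return this;
    }

    PartitionsByNamesRequest colStats(boolean value) {
      this.getColStats = value;
      return this;
    }

    PartitionsByNamesRequest engine(String value) {
      this.engine = value;
      return this;
    }
  }

  /** One entry point: adding an optional field later does not change this signature. */
  static String getPartitionsByNames(PartitionsByNamesRequest req) {
    String cat = req.catName == null ? "<default catalog>" : req.catName;
    return cat + "." + req.dbName + "." + req.tblName
        + " partitions=" + req.partNames
        + " colStats=" + req.getColStats
        + " engine=" + req.engine;
  }

  public static void main(String[] args) {
    PartitionsByNamesRequest req =
        new PartitionsByNamesRequest("db1", "t1", Arrays.asList("p=1", "p=2"))
            .colStats(true)
            .engine("hive");
    System.out.println(getPartitionsByNames(req));
  }
}
```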





[DISCUSS]: Commit guidelines for PRs

2020-06-30 Thread Vihang Karajgaonkar
Thanks to all who worked on the new testing infrastructure. It definitely
looks like a step up from the older test infrastructure.

I wanted to know if there are any new guidelines for committers on
merging PRs. Earlier we used to create one patch file for each JIRA and
push it to the master branch. With PRs it is possible that a
contributor publishes multiple commits (e.g. to address review comments). I
would like to start a discussion on what the guidelines for merging PRs
should be.

Most of you are probably already following it but it would be good to
formalize the following:

1. Whether to standardize on squashing into one commit for the PR?
2. What are the commit message guidelines? Our project has unfortunately
not been great at documenting commit messages appropriately. Current
guidelines are to have a one-line commit message, and the JIRA is expected to
have more detailed information. However, most of the time the JIRAs don't
have enough information. I think it would be good to add a few lines of
description as part of the git commit message. Some projects recommend 50/72
formatting for the git commit message, which I feel is nice.
3. Do committers merge the PR directly from the github? I am not sure if
there is a way for our committer credentials to be integrated in github.
Otherwise, the other option could be that the committer checks out the PR
and merges it manually into the master branch.

Thanks,
Vihang


[jira] [Created] (HIVE-23785) Database should have a unique id

2020-06-30 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-23785:
--

 Summary: Database should have a unique id
 Key: HIVE-23785
 URL: https://issues.apache.org/jira/browse/HIVE-23785
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-20556 introduced an id field to the Table object. This is useful 
information since a table which is dropped and recreated with the same name 
will have a different id. If an HMS client is caching such a table object, the 
id can be used to determine whether the table present on the client side matches 
the one in the HMS.

We can expand this idea to other HMS objects like Databases, Catalogs and 
Partitions by adding a new id field to each.





[jira] [Created] (HIVE-23348) Add API timing metric for fire_listener_event API

2020-04-30 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-23348:
--

 Summary: Add API timing metric for fire_listener_event API
 Key: HIVE-23348
 URL: https://issues.apache.org/jira/browse/HIVE-23348
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Currently metastore does not have any metric to report the time taken to 
execute {{fire_listener_event}} API. It would be useful to add such a metric.





[jira] [Created] (HIVE-23116) get_partition_with_specs does not use filter spec when projection is empty

2020-03-31 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-23116:
--

 Summary: get_partition_with_specs does not use filter spec when 
projection is empty
 Key: HIVE-23116
 URL: https://issues.apache.org/jira/browse/HIVE-23116
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The API implementation ignores the filter spec if the projection spec is empty, 
as seen here: 
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L3903]

if (fieldList == null || fieldList.isEmpty()) {
  // no fields are requested. Fallback to regular getPartitions implementation to return all the fields
  return getPartitionsInternal(table.getCatName(), table.getDbName(), table.getTableName(), -1,
      true, true);
}





[jira] [Created] (HIVE-23018) Provide a bulk API to fire multiple listener events

2020-03-12 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-23018:
--

 Summary: Provide a bulk API to fire multiple listener events
 Key: HIVE-23018
 URL: https://issues.apache.org/jira/browse/HIVE-23018
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Metastore provides an API to fire a listener event (currently it supports only 
the INSERT event). The problem with that API is that it takes in only one 
partition at a time. A typical query may insert data into multiple partitions 
at a time. In such a case, query engines like HS2 or Impala have to issue 
multiple RPCs to the metastore sequentially to fire these events. This can show 
up as a slowdown to the user if the query engine does not return the prompt 
until all the events are fired (as is the case for HS2 and Impala). It would be 
great to have a bulk API which takes in multiple partitions for a table, so that 
the metastore can generate many such events in one RPC.
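
To illustrate the difference in round trips, here is a toy model. The 
{{FakeMetastore}} class and both method names are hypothetical and only count 
calls; they do not correspond to the actual metastore client API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: a fake metastore that counts round trips. */
class BulkFireEventSketch {

  static final class FakeMetastore {
    int rpcCount = 0;
    final List<String> events = new ArrayList<>();

    /** Existing shape: one partition per call, so one RPC per partition. */
    void fireInsertEvent(String table, String partition) {
      rpcCount++;
      events.add(table + "/" + partition);
    }

    /** Proposed shape: all partitions of one table in a single call. */
    void fireInsertEvents(String table, List<String> partitions) {
      rpcCount++;
      for (String partition : partitions) {
        events.add(table + "/" + partition);
      }
    }
  }

  public static void main(String[] args) {
    List<String> partitions = Arrays.asList("ds=1", "ds=2", "ds=3");

    FakeMetastore sequential = new FakeMetastore();
    for (String partition : partitions) {
      sequential.fireInsertEvent("t1", partition);
    }

    FakeMetastore bulk = new FakeMetastore();
    bulk.fireInsertEvents("t1", partitions);

    System.out.println("sequential RPCs: " + sequential.rpcCount);
    System.out.println("bulk RPCs: " + bulk.rpcCount);
  }
}
```

Both variants produce the same events; only the number of round trips differs.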





[jira] [Created] (HIVE-22126) hive-exec packaging should shade guava

2019-08-19 Thread Vihang Karajgaonkar (Jira)
Vihang Karajgaonkar created HIVE-22126:
--

 Summary: hive-exec packaging should shade guava
 Key: HIVE-22126
 URL: https://issues.apache.org/jira/browse/HIVE-22126
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The ql/pom.xml includes the complete guava library in hive-exec.jar: 
https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
problems for downstream clients of Hive which have hive-exec.jar in their 
classpath, since they are pinned to the same guava version as Hive. 

We should shade guava classes so that other components which depend on 
hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (HIVE-22100) Hive generates a add partition event with empty partition list

2019-08-12 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-22100:
--

 Summary: Hive generates a add partition event with empty partition 
list
 Key: HIVE-22100
 URL: https://issues.apache.org/jira/browse/HIVE-22100
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


If the user issues an {{alter table  add if not exists partition 
}} and the partition already exists, no partition is added. 
However, the metastore still generates an {{ADD_PARTITION}} event with an empty 
partition list. An {{alter table  drop if exists partition 
}} does not generate a {{DROP_PARTITION}} event when the 
partition does not exist.

This behavior is inconsistent and misleading. The metastore should not generate 
such add_partition events.
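
The expected behavior can be sketched as follows. The method and the 
string-based event log are hypothetical simplifications and do not reflect how 
the metastore actually models partitions or notification events.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch only: partitions and events are modelled as plain strings. */
class AddPartitionEventSketch {

  /**
   * Adds the requested partitions that do not exist yet and records an
   * ADD_PARTITION event only when at least one partition was really added.
   */
  static List<String> addPartitionsIfNotExists(Set<String> existing,
                                               List<String> requested,
                                               List<String> eventLog) {
    List<String> added = new ArrayList<>();
    for (String partition : requested) {
      if (existing.add(partition)) { // Set.add returns false if already present
        added.add(partition);
      }
    }
    if (!added.isEmpty()) {
      eventLog.add("ADD_PARTITION " + added);
    }
    return added;
  }

  public static void main(String[] args) {
    Set<String> existing = new HashSet<>(Arrays.asList("ds=2019-08-11"));
    List<String> events = new ArrayList<>();

    addPartitionsIfNotExists(existing, Arrays.asList("ds=2019-08-11"), events);
    System.out.println("events after no-op add: " + events);

    addPartitionsIfNotExists(existing, Arrays.asList("ds=2019-08-12"), events);
    System.out.println("events after real add: " + events);
  }
}
```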



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-21932) IndexOutOfRangeExeption in FileChksumIterator

2019-06-27 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21932:
--

 Summary: IndexOutOfRangeExeption in FileChksumIterator
 Key: HIVE-21932
 URL: https://issues.apache.org/jira/browse/HIVE-21932
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


According to the definition of {{InsertEventRequestData}} in 
{{hive_metastore.thrift}}, {{filesAddedChecksum}} is an optional field. But 
the FileChksumIterator does not handle it correctly when a client fires an 
insert event which does not have file checksums. The issue is that the 
{{InsertEvent}} class initializes the fileChecksums list to an empty ArrayList, 
so the null check below never comes into play:

{noformat}
result = ReplChangeManager.encodeFileUri(files.get(i), chksums != null ? 
chksums.get(i) : null,
subDirs != null ? subDirs.get(i) : null);
{noformat}

The chksums check in the line above should include a {{!chksums.isEmpty()}} 
check as well.
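
The difference between the two checks can be shown in isolation. The helper 
names below are made up for this example and only mirror the ternary expression 
from the snippet; this is not the actual FileChksumIterator code.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: isolates the ternary expression from the snippet above. */
class ChecksumLookupSketch {

  /** Mirrors the existing null-only check. */
  static String nullOnlyCheck(List<String> chksums, int i) {
    return chksums != null ? chksums.get(i) : null;
  }

  /** Adds the isEmpty() guard suggested in this issue. */
  static String nullAndEmptyCheck(List<String> chksums, int i) {
    return (chksums != null && !chksums.isEmpty()) ? chksums.get(i) : null;
  }

  public static void main(String[] args) {
    List<String> noChecksums = new ArrayList<>(); // what an event without checksums carries

    System.out.println(nullAndEmptyCheck(noChecksums, 0));
    try {
      nullOnlyCheck(noChecksums, 0);
    } catch (IndexOutOfBoundsException e) {
      System.out.println("null-only check failed on an empty list");
    }
  }
}
```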



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21851) FireEventListenerResponse should include event id when available

2019-06-07 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21851:
--

 Summary: FireEventListenerResponse should include event id when 
available
 Key: HIVE-21851
 URL: https://issues.apache.org/jira/browse/HIVE-21851
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The metastore API {{fire_listener_event}} gives clients the ability to fire an 
INSERT event on DML operations. However, the returned response is an empty 
struct. It would be useful to send back the event id in the response so 
that clients can take actions based on the event id.





[jira] [Created] (HIVE-21617) Flaky test : TestMiniSparkOnYarnCliDriver.testCliDriver[truncate_column_buckets]

2019-04-15 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21617:
--

 Summary: Flaky test : 
TestMiniSparkOnYarnCliDriver.testCliDriver[truncate_column_buckets]
 Key: HIVE-21617
 URL: https://issues.apache.org/jira/browse/HIVE-21617
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar


We should disable this test. Was seen in 
https://builds.apache.org/job/PreCommit-HIVE-Build/16961/testReport/





[jira] [Created] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21596:
--

 Summary: HiveMetastoreClient should be able to connect to older 
metastore servers
 Key: HIVE-21596
 URL: https://issues.apache.org/jira/browse/HIVE-21596
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (e.g. a 2.1.0 client) to 
connect to a newer server (e.g. a 3.1.0 server) without any issues. This 
is useful in setups where HMS is deployed in remote mode and clients connect 
to it remotely.

It would be a good improvement if a newer {{HiveMetastoreClient}} could also 
connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. Client invokes a RPC to the older server which doesn't exist.
In such a case, thrift will throw {{Invalid method name}} exception which 
should be automatically be handled by the clients since each API throws 
TException.

2. Client invokes an RPC using thrift objects which have new fields added.
When a new field is added to a thrift object, the server does not deserialize
the field in the first place since it does not know about that field id. So
wire-compatibility exists already. However, the client side application should
understand the implications of such behavior. In such cases, it would be
better for the client to throw an exception after checking the server version,
which was added in HIVE-21484.

3. If the newer client has re-implemented a certain API, for example, using a
newer thrift API, the client will start seeing the exception
{{Invalid method name}}.
This can be handled on the client side by making sure that the newer
implementation is conditional on the server version. This means the client
should check the server version and invoke the new implementation only if the
server version supports the newer API. (On a side note, it would be great if
metastore also gave information on which APIs are supported for a given
version.)
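
The version check described in points 2 and 3 could look roughly like the
sketch below. This is only an illustration under assumptions: the class
ServerVersionGate and its methods are hypothetical and not part of Hive, and
it assumes the server reports a purely numeric dotted version such as "3.1.0"
(a suffix like "-SNAPSHOT" would need extra handling).

```java
import java.util.function.Supplier;

/** Hypothetical helper; the names here are illustrative and not part of Hive. */
final class ServerVersionGate {
    private ServerVersionGate() {}

    /**
     * Compares two dotted numeric versions such as "3.1.0".
     * Missing trailing parts count as 0, so "3.1" equals "3.1.0".
     */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /**
     * Invokes the newer implementation only when the server is at least
     * minVersion; otherwise falls back to the legacy implementation.
     */
    static <T> T call(String serverVersion, String minVersion,
                      Supplier<T> newerApi, Supplier<T> legacyApi) {
        return compareVersions(serverVersion, minVersion) >= 0
                ? newerApi.get()
                : legacyApi.get();
    }
}
```

A client would obtain serverVersion once per connection and pass both
implementations as lambdas, so an older server never receives an RPC it does
not know about.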






[jira] [Created] (HIVE-21595) HIVE-20556 breaks backwards compatibility

2019-04-09 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21595:
--

 Summary: HIVE-20556 breaks backwards compatibility
 Key: HIVE-21595
 URL: https://issues.apache.org/jira/browse/HIVE-21595
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-20556 exposes a new field in the Table definition. However, it changes the
order of the field ids, which breaks backwards wire-compatibility. Any older
client which connects to HMS will not be able to deserialize table objects
correctly since the field ids are different on the client and server side.
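
To illustrate why reordering field ids is a problem while appending a new id
is not, here is a toy model. It is not Thrift's actual wire protocol (real
Thrift also carries type tags), and the class FieldIdDemo and the field names
used with it are hypothetical. Values are simply keyed by numeric field id on
the "wire".

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of id-tagged serialization; hypothetical, not Thrift itself. */
final class FieldIdDemo {
    private FieldIdDemo() {}

    /** Writer side: tags each value with the field id from the writer's schema. */
    static Map<Integer, String> write(Map<String, Integer> writerIds,
                                      Map<String, String> values) {
        Map<Integer, String> wire = new HashMap<>();
        for (Map.Entry<String, String> e : values.entrySet()) {
            Integer id = writerIds.get(e.getKey());
            if (id != null) {
                wire.put(id, e.getValue());
            }
        }
        return wire;
    }

    /** Reader side: looks values up by its own ids; unknown ids are ignored. */
    static Map<String, String> read(Map<String, Integer> readerIds,
                                    Map<Integer, String> wire) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, Integer> e : readerIds.entrySet()) {
            String value = wire.get(e.getValue());
            if (value != null) {
                out.put(e.getKey(), value);
            }
        }
        return out;
    }
}
```

If the writer appends the new field with a fresh id, an old reader just skips
it. If the writer instead gives the new field an id that the old reader
associates with a different field, the old reader silently picks up the wrong
value.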





[jira] [Created] (HIVE-21586) Thrift generated cpp files for metastore do not compile

2019-04-05 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21586:
--

 Summary: Thrift generated cpp files for metastore do not compile
 Key: HIVE-21586
 URL: https://issues.apache.org/jira/browse/HIVE-21586
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Some structs like CreationMetadata, CompactionInfo, ColumnStatistics are used
in hive_metastore.thrift before they are defined. While this works for the
generated java code, it does not work for the generated cpp code since Thrift
does not use pointers/references to the forward declared classes.

The easy fix for this would be to reorder the struct definitions in the 
hive_metastore.thrift so that they are always defined before they are used.





[jira] [Created] (HIVE-21535) Re-enable TestCliDriver#vector_groupby_reduce

2019-03-28 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21535:
--

 Summary: Re-enable TestCliDriver#vector_groupby_reduce
 Key: HIVE-21535
 URL: https://issues.apache.org/jira/browse/HIVE-21535
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Vihang Karajgaonkar


The test was disabled since it was flaky in HIVE-21396. Creating this JIRA to 
re-enable the test by fixing the rounding logic.





[jira] [Created] (HIVE-21534) Flaky test : TestActivePassiveHA

2019-03-28 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21534:
--

 Summary: Flaky test : TestActivePassiveHA
 Key: HIVE-21534
 URL: https://issues.apache.org/jira/browse/HIVE-21534
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar


Failed in 
https://issues.apache.org/jira/browse/HIVE-21484?focusedCommentId=16798031&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16798031

Works locally, as well as in the subsequent precommit run later in the patch.





[jira] [Created] (HIVE-21484) Metastore API getVersion() should return real version

2019-03-20 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21484:
--

 Summary: Metastore API getVersion() should return real version
 Key: HIVE-21484
 URL: https://issues.apache.org/jira/browse/HIVE-21484
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Currently the {{getVersion}} implementation in the metastore returns
a hard-coded "3.0". It would be good to return the real version of the
metastore server using {{HiveVersionInfo}} so that clients can take certain
actions based on metastore server versions.

Possible use-cases are:
1. A client can make use of new features introduced in a given Metastore
version, or else stick to the base functionality.
2. This version number can be used to do a version handshake between client
and server in the future to improve our cross-version compatibility story.





[jira] [Created] (HIVE-21203) Add builder classes for the API and add a metastore client side implementation

2019-02-01 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21203:
--

 Summary: Add builder classes for the API and add a metastore 
client side implementation
 Key: HIVE-21203
 URL: https://issues.apache.org/jira/browse/HIVE-21203
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Adding builder classes for clients to use this API would make it more
user-friendly. Also, we should add a client side API which uses this newly
added API.





[jira] [Created] (HIVE-21180) Fix branch-3 metastore test timeouts

2019-01-29 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21180:
--

 Summary: Fix branch-3 metastore test timeouts
 Key: HIVE-21180
 URL: https://issues.apache.org/jira/browse/HIVE-21180
 Project: Hive
  Issue Type: Test
Affects Versions: 3.2.0
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


The module name below is wrong since metastore-server doesn't exist on 
branch-3. This is most likely the reason why test batches are timing out on 
branch-3

{noformat}
2019-01-29 00:32:17,765  INFO [HostExecutor 3] 
HostExecutor.executeTestBatch:262 Drone [user=hiveptest, host=104.198.216.224, 
instance=0] executing UnitTestBatch 
[name=228_UTBatch_standalone-metastore__metastore-server_20_tests, id=228, 
moduleName=standalone-metastore/metastore-server, batchSize=20, 
isParallel=true, testList=[TestPartitionManagement, 
TestCatalogNonDefaultClient, TestCatalogOldClient, TestHiveAlterHandler, 
TestTxnHandlerNegative, TestTxnUtils, TestFilterHooks, TestRawStoreProxy, 
TestLockRequestBuilder, TestHiveMetastoreCli, TestCheckConstraint, 
TestAddPartitions, TestListPartitions, TestFunctions, TestGetTableMeta, 
TestTablesCreateDropAlterTruncate, TestRuntimeStats, TestDropPartitions, 
TestTablesList, TestUniqueConstraint]] with bash 
/home/hiveptest/104.198.216.224-hiveptest-0/scratch/hiveptest-228_UTBatch_standalone-metastore__metastore-server_20_tests.sh
{noformat}





[jira] [Created] (HIVE-21168) Fix TestSchemaToolCatalogOps

2019-01-25 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21168:
--

 Summary: Fix TestSchemaToolCatalogOps
 Key: HIVE-21168
 URL: https://issues.apache.org/jira/browse/HIVE-21168
 Project: Hive
  Issue Type: Test
Affects Versions: 3.2.0
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-21077 causes TestSchemaToolCatalogOps to fail on branch-3





Re: Review Request 69834: HIVE-21083: Removed the truststore location property requirement and removed the warnings on the truststore password property

2019-01-25 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69834/#review212345
---


Fix it, then Ship it!




Some minor comments related to logs. Rest looks good.


standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 472 (patched)
<https://reviews.apache.org/r/69834/#comment298112>

Nit: I think it would be useful to say "Defaults to jssecacerts, if it
exists, otherwise uses cacerts"
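
For reference, the lookup order mentioned above (jssecacerts first, then
cacerts, both under the JRE's lib/security directory) can be sketched as
follows. The class DefaultTrustStore is a hypothetical illustration and not
code from the patch; it also ignores the javax.net.ssl.trustStore system
property, which takes precedence when set.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch of the default Java truststore lookup order. */
final class DefaultTrustStore {
    private DefaultTrustStore() {}

    /** Returns jssecacerts under javaHome/lib/security if it exists, else cacerts. */
    static Path resolve(Path javaHome) {
        Path securityDir = javaHome.resolve("lib").resolve("security");
        Path jssecacerts = securityDir.resolve("jssecacerts");
        return Files.exists(jssecacerts) ? jssecacerts : securityDir.resolve("cacerts");
    }

    /** Convenience overload using the running JVM's java.home. */
    static Path resolve() {
        return resolve(Paths.get(System.getProperty("java.home")));
    }
}
```

A log message could then print the resolved path so that users can see which
truststore is actually in effect.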



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 362 (patched)
<https://reviews.apache.org/r/69834/#comment298114>

Maybe it's useful to specify what the default is. So a message like ".. has
not been set. Defaulting to jssecacerts, if it exists. Otherwise, cacerts."



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 369 (original), 371 (patched)
<https://reviews.apache.org/r/69834/#comment298115>

nit, instead of "defaulting to default .." maybe just say "Using default Java
truststore password."


- Vihang Karajgaonkar


On Jan. 25, 2019, 7:22 p.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69834/
> ---
> 
> (Updated Jan. 25, 2019, 7:22 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Na Li, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-21083
> https://issues.apache.org/jira/browse/HIVE-21083
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It was identified that a valid way of configuring TLS is by using the Java 
> default truststore and directly adding the trusted certificates to it. The 
> previous HMS implementation did not support this.
> 
> Modified the TLS properties in the following ways:
>  - Removed the requirement for metastore.dbaccess.ssl.truststore.path. If the 
> user does not specify a custom one, then it will default to the Java 
> truststore.
>  - Removed the logs / warnings on metastore.dbaccess.ssl.truststore.password. 
> This used to generate a lot of noise if the user did not provide one. Also, 
> the contents of the truststore is certificates, which is public information 
> and doesn't require strict security.
>  - Removed the unit test that checks for an empty truststore path.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  75f0c0a356f3b894408aa54b9cce5220d47d7f26 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9f721243c94d48eef35acdcbd0c2e143ab6d23ec 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  29738ba19b0d5ed9ec224d2288c0c1c922d0674c 
> 
> 
> Diff: https://reviews.apache.org/r/69834/diff/3/
> 
> 
> Testing
> ---
> 
> - Existing unit test coverage
> - Manual testing by verifying that these properties can configure TLS to a 
> MySQL DB
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



[jira] [Created] (HIVE-21155) create_time datatype for database and catalog should be int in sql server schema scripts

2019-01-23 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21155:
--

 Summary: create_time datatype for database and catalog should be 
int in sql server schema scripts
 Key: HIVE-21155
 URL: https://issues.apache.org/jira/browse/HIVE-21155
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


HIVE-21077 added create_time field to database and catalogs. However, the data 
type of this field was set to bigint instead of int like in case of other 
create_time fields for tbls and partitions. We should change it to int from 
bigint to be consistent with other create_time fields.





Re: Review Request 69585: HIVE-20776: Run HMS filterHooks on server-side in addition to client-side

2019-01-18 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69585/#review212147
---


Fix it, then Ship it!




Overall the patch looks good. I have some minor suggestions below. RLGTM.


standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FilterUtils.java
Lines 36 (patched)
<https://reviews.apache.org/r/69585/#comment297790>

please add documentation for each method



standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FilterUtils.java
Lines 236 (patched)
<https://reviews.apache.org/r/69585/#comment297793>

A more descriptive message would be "dbName is null". Same for line 251:
"tblName is null".



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 628 (patched)
<https://reviews.apache.org/r/69585/#comment297791>

nit, could be simplified as 
filterHook = isServerFilterEnabled ? loadFilterHooks() : null;



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 668-671 (patched)
<https://reviews.apache.org/r/69585/#comment297792>

Isn't this redundant since you already checked for a valid configuration in
the getIfServerFilterenabled() method? An easier way would be to add

Preconditions.checkState(!isBlank(MetastoreConf.getVar(conf, 
ConfVars.FILTER_HOOK)));



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 4731 (patched)
<https://reviews.apache.org/r/69585/#comment297795>

nit, remove "For improved performance". I am not very convinced that this
is helping the performance. It's okay to say "we'll check if the said db and
table are to be filtered out, if so, then we won't proceed with querying the
partitions."



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 4873 (patched)
<https://reviews.apache.org/r/69585/#comment297794>

nit, remove "For improved performance". I am not very convinced that this
is helping the performance. It's okay to say "we'll check if the said db and
table are to be filtered out, if so, then we won't proceed with querying the
partitions."



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 5682 (patched)
<https://reviews.apache.org/r/69585/#comment297796>

nit, remove "For improved performance". I am not very convinced that this
is helping the performance. It's okay to say "we'll check if the said db and
table are to be filtered out, if so, then we won't proceed with querying the
partitions."

Same comment for other places with this comment.



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetastoreFilterHook.java
Lines 52 (patched)
<https://reviews.apache.org/r/69585/#comment297798>

If this test class adds more test coverage to the TestFilterHooks test, I
would suggest moving the new tests into TestFilterHooks instead of removing
TestFilterHooks and adding a new test class which copies all the code from
TestFilterHooks. That way you don't lose the git history of TestFilterHooks
unnecessarily.


- Vihang Karajgaonkar


On Jan. 18, 2019, 3:45 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69585/
> ---
> 
> (Updated Jan. 18, 2019, 3:45 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthick Sankarachary, Morio 
> Ramdenbourg, Peter Vary, Sergio Pena, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-20776
> https://issues.apache.org/jira/browse/HIVE-20776
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> add filtering to read result at HMS server, so user cannot see metadata 
> he/she has no privileges. Filtering is enabled/disabled based on 
> configuration.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  19bd9bac84c20f94ac819a80e3cc89e0dc66396d 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  be1f8c78497fe3d0816ad3935ba07cd5ad379b08 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FilterUtils.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.j

[jira] [Created] (HIVE-21131) Document some of the static util methods in MetastoreUtils

2019-01-17 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21131:
--

 Summary: Document some of the static util methods in MetastoreUtils
 Key: HIVE-21131
 URL: https://issues.apache.org/jira/browse/HIVE-21131
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


{{MetastoreUtils}} has some methods like {{makePartNameMatcher}} which could 
use some javadoc 





[jira] [Created] (HIVE-21128) hive.version.shortname should be 3.2 on branch-3

2019-01-16 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21128:
--

 Summary: hive.version.shortname should be 3.2 on branch-3
 Key: HIVE-21128
 URL: https://issues.apache.org/jira/browse/HIVE-21128
 Project: Hive
  Issue Type: Bug
Reporter: Vihang Karajgaonkar


Since 3.1.0 is already released, the {{hive.version.shortname}} property in the
pom.xml of standalone-metastore should be 3.2.0. This version shortname is used
to generate the metastore schema version and is used by Schematool to
initialize the schema using the correct script. Currently it is using the 3.1.0
schema init script instead of the 3.2.0 init script.





Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-14 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

(Updated Jan. 15, 2019, midnight)


Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Changes
---

added suggested change by Bharath


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description
---

HIVE-21077 : Database and Catalogs should have creation time


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
 3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 13e287e352bdbfe5263b058e1b430af8613fe815 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 8f149d1d6e2a5b9571eeef3c05d68834e4035172 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 9e5f0860f2b0e8caa9abf213e2a2c91b8e16d985 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
9576f8775a4a8a314e09462cbaaaeaebd3b4921f 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
 e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 bb20d9f42a855100397140f9e018c04c5f61dde7 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
 28eb1fadca80dfd3c962e4163120b83f00410c4a 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/TestSchemaToolForMetastore.java
 c2eb6c9e22a22f09cc1d2cc6394aa4e0e339b63a 


Diff: https://reviews.apache.org/r/69664/diff/7/

Changes: https://reviews.apache.org/r/69664/diff/6-7/


Testing
---

Ran ITests to check that the db install and upgrade scripts are working for
mysql, postgres, oracle and derby databases. The ITests for mssql are timing
out, apparently due to container provisioning issues.


Thanks,

Vihang Karajgaonkar



[jira] [Created] (HIVE-21115) Add support for object versions in metastore

2019-01-10 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21115:
--

 Summary: Add support for object versions in metastore
 Key: HIVE-21115
 URL: https://issues.apache.org/jira/browse/HIVE-21115
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar


Currently, metastore objects are identified uniquely by their names (e.g.
catName, dbName and tblName for a table are unique). Once a table or partition
is created it could be altered in many ways. There is no good way currently to
identify the version of the object once it is altered. For example, suppose
there are two clients (Hive and Impala) using the same metastore. Once some
alter operations are performed by a client, another client which wants to do an
alter operation has no good way to know if the object which it has is the same
as the one stored in metastore. Metastore updates the {{transient_lastDdlTime}}
every time there is a DDL operation on the object. However, this value cannot
be relied on by all the clients since after HIVE-1768 metastore updates the
value only when it is not set in the parameters. It is possible that a client
which alters the object state does not remove the {{transient_lastDdlTime}},
and metastore will not update it. Secondly, if there is a clock skew between
multiple HMS instances when HMS-HA is configured, time values cannot be relied
on to find out the sequence of alter operations on a given object.

This JIRA proposes to use the JDO versioning support in Datanucleus
http://www.datanucleus.org/products/accessplatform_4_2/jdo/versioning.html to
generate an incrementing sequence number every time an object is altered. This
version value can be set as one of the values in the parameters. The
advantage of using Datanucleus is that the versioning can be done across HMS
instances as part of the database transaction and it should work for all the
supported databases.

In theory such a version can be used to detect if the client is presenting an
object which is "stale" when issuing an alter request. Metastore can choose to
reject such an alter request since the client may be caching an old version of
the object and any alter operation on such a stale object can potentially
overwrite previous operations. However, this can be done in a separate JIRA.
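
The stale-object check described above is essentially optimistic concurrency
control. A minimal in-memory sketch is shown below, assuming a hypothetical
VersionedStore class; it does not use Datanucleus or any metastore API, and in
the real proposal the counter would be maintained by the database transaction
rather than by a Java map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of version-checked alter operations. */
final class VersionedStore {
    /** Immutable snapshot of an object together with its version number. */
    static final class Entry {
        final long version;
        final String payload;

        Entry(long version, String payload) {
            this.version = version;
            this.payload = payload;
        }
    }

    private final Map<String, Entry> objects = new HashMap<>();

    /** Creates (or replaces) an object; a fresh object starts at version 1. */
    synchronized Entry create(String name, String payload) {
        Entry entry = new Entry(1, payload);
        objects.put(name, entry);
        return entry;
    }

    synchronized Entry get(String name) {
        return objects.get(name);
    }

    /**
     * Applies the alter only if the caller saw the latest version;
     * otherwise the request is rejected as stale.
     */
    synchronized Entry alter(String name, long expectedVersion, String newPayload) {
        Entry current = objects.get(name);
        if (current == null) {
            throw new IllegalArgumentException("No such object: " + name);
        }
        if (current.version != expectedVersion) {
            throw new IllegalStateException("Stale object " + name + ": expected version "
                    + expectedVersion + " but current version is " + current.version);
        }
        Entry next = new Entry(current.version + 1, newPayload);
        objects.put(name, next);
        return next;
    }
}
```

With two clients that both read version 1, the first alter succeeds and bumps
the version to 2, while the second client's alter is rejected until it
re-reads the object.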





Re: Review Request 69585: HIVE-20776: Run HMS filterHooks on server-side in addition to client-side

2019-01-10 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69585/#review211843
---


Fix it, then Ship it!





standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 652-655 (patched)
<https://reviews.apache.org/r/69585/#comment297430>

I think this is misleading since this assumes that 
DefaultMetaStoreFilterHook will not add anything in the future too. We should 
either depend only on METASTORE_SERVER_FILTER_ENABLED and run whichever 
filterHook is configured or throw an error when METASTORE_SERVER_FILTER_ENABLED 
is true but FILTER_HOOK is empty or DefaultMetaStoreFilterHook instead of 
silently disabling the filtering logic.
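
A fail-fast version of the second option might look like the sketch below.
The class FilterConfigCheck and its parameters are hypothetical; in the real
patch the values would come from MetastoreConf, and the default hook's class
name is passed in here rather than assumed.

```java
/** Hypothetical sketch: reject an inconsistent server-side filter configuration. */
final class FilterConfigCheck {
    private FilterConfigCheck() {}

    /**
     * Returns false when server-side filtering is disabled, true when it is
     * enabled with a usable hook, and throws when it is enabled but the hook
     * is empty or is the default (no-op) hook.
     */
    static boolean shouldFilter(boolean serverFilterEnabled, String filterHook,
                                String defaultHookClassName) {
        if (!serverFilterEnabled) {
            return false;
        }
        String hook = filterHook == null ? "" : filterHook.trim();
        if (hook.isEmpty() || hook.equals(defaultHookClassName)) {
            throw new IllegalStateException(
                    "Server-side filtering is enabled but no non-default filter hook is configured");
        }
        return true;
    }
}
```

Failing at startup makes the misconfiguration visible, instead of leaving
operators to believe that filtering is active when it is not.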



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 661 (patched)
<https://reviews.apache.org/r/69585/#comment297428>

I didn't realize that FILTER_HOOK takes only one value, not a comma
separated list of classnames. In that case, this code can be made simpler using
existing util methods

Class clazz = 
JavaUtils.getClass(MetastoreConf.getVar(conf, ConfVars.FILTER_HOOK), 
MetaStoreFilterHook.class);

return JavaUtils.getInstance(clazz, new Class[] {Configuration.class}, 
new Object[] {conf});
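
For readers unfamiliar with these helpers, the same idea expressed with plain
Java reflection is sketched below. HookLoader is a hypothetical class, and
java.util.Properties stands in for the Hadoop Configuration object so that the
example is self-contained; it is not the JavaUtils implementation.

```java
import java.util.Properties;

/** Hypothetical sketch of loading a single hook class by name via reflection. */
final class HookLoader {
    private HookLoader() {}

    /**
     * Loads className, checks that it is a subtype of expectedType, and
     * instantiates it through its public (Properties) constructor.
     */
    static <T> T load(String className, Class<T> expectedType, Properties conf) {
        try {
            Class<? extends T> clazz = Class.forName(className).asSubclass(expectedType);
            return clazz.getConstructor(Properties.class).newInstance(conf);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                    "Cannot load " + className + " as " + expectedType.getName(), e);
        }
    }
}
```

Because only one class name is expected, there is no need to split the
configured value or loop over a list of hooks.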



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestFilterHooks.java
Lines 254 (patched)
<https://reviews.apache.org/r/69585/#comment297431>

It's not clear why this should throw NoSuchObjectException. Can you please
add a comment? Are we changing the behavior of this API?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetastoreFilterHook.java
Lines 49 (patched)
<https://reviews.apache.org/r/69585/#comment297432>

This test looks very similar to TestFilterHooks. What is the difference?
If it is different, can you please add some javadoc at the top to explain what
the test is doing? If there is not much difference, can we refactor (or add
these tests to TestFilterHooks) to re-use the code instead of duplicating it?


- Vihang Karajgaonkar


On Jan. 10, 2019, 9:10 p.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69585/
> ---
> 
> (Updated Jan. 10, 2019, 9:10 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthick Sankarachary, Morio 
> Ramdenbourg, Peter Vary, Sergio Pena, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-20776
> https://issues.apache.org/jira/browse/HIVE-20776
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> add filtering to read result at HMS server, so user cannot see metadata 
> he/she has no privileges. Filtering is enabled/disabled based on 
> configuration.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  748b56b0a268c1ec7dea022722478ec50889c016 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  be1f8c78497fe3d0816ad3935ba07cd5ad379b08 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/FilterUtils.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestFilterHooks.java
>  7dc69bc4e92875c8962dcd313b16f0f90ea8b057 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetastoreFilterHook.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69585/diff/8/
> 
> 
> Testing
> ---
> 
> Existing unit tests passed. 
> add new unit tests for filtering at HMS server and HMS client
> add code to enabled/disable filtering at HMS client based on configuration
> add code to enabled/disable filtering at HMS server based on configuration
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-09 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

(Updated Jan. 9, 2019, 6:50 p.m.)


Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Changes
---

Fixed TestCachedStore test failure


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description
---

HIVE-21077 : Database and Catalogs should have creation time


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
 3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 2953a8f327eabdee42dbc66e0c65f89d17add59a 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
 e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 bb20d9f42a855100397140f9e018c04c5f61dde7 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
 28eb1fadca80dfd3c962e4163120b83f00410c4a 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/TestSchemaToolForMetastore.java
 c2eb6c9e22a22f09cc1d2cc6394aa4e0e339b63a 


Diff: https://reviews.apache.org/r/69664/diff/6/

Changes: https://reviews.apache.org/r/69664/diff/5-6/


Testing
---

Ran ITests to check that the db install and upgrade scripts work for mysql,
postgres, oracle and derby databases. The ITests for mssql are timing out,
apparently due to container provisioning issues.


Thanks,

Vihang Karajgaonkar



Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-08 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

(Updated Jan. 9, 2019, 1:01 a.m.)


Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Changes
---

Fixed test failures.


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description
---

HIVE-21077 : Database and Catalogs should have creation time


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
 3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 2953a8f327eabdee42dbc66e0c65f89d17add59a 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
 e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
 28eb1fadca80dfd3c962e4163120b83f00410c4a 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/TestSchemaToolForMetastore.java
 c2eb6c9e22a22f09cc1d2cc6394aa4e0e339b63a 


Diff: https://reviews.apache.org/r/69664/diff/4/

Changes: https://reviews.apache.org/r/69664/diff/3-4/


Testing
---

Ran ITests to check that the db install and upgrade scripts work for mysql,
postgres, oracle and derby databases. The ITests for mssql are timing out,
apparently due to container provisioning issues.


Thanks,

Vihang Karajgaonkar



Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-08 Thread Vihang Karajgaonkar via Review Board


> On Jan. 7, 2019, 8:28 p.m., Adam Holley wrote:
> > standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
> > Lines 137 (patched)
> > <https://reviews.apache.org/r/69664/diff/3/?file=2118327#file2118327line137>
> >
> > nit: extraneous space.

api/Catalog.java and api/Database.java are auto-generated files, which
cannot be edited.


> On Jan. 7, 2019, 8:28 p.m., Adam Holley wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 1063 (patched)
> > <https://reviews.apache.org/r/69664/diff/3/?file=2118333#file2118333line1063>
> >
> > Need parentheses around (System.currentTimeMillis() / 1000).

Thanks for catching this.
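
For readers following the thread, a minimal sketch of why the parentheses matter. It assumes the patched line narrowed the timestamp to an int with a cast, which the review comment implies but does not show; the class and method names here are illustrative only.

```java
public class CreateTimeSeconds {
    // Buggy shape: the cast binds tighter than the division, so the
    // millisecond value is truncated to 32 bits before it is divided.
    static int castThenDivide(long millis) {
        return (int) millis / 1000;
    }

    // Intended shape: divide first (still a long), then narrow to int.
    static int divideThenCast(long millis) {
        return (int) (millis / 1000);
    }

    public static void main(String[] args) {
        long millis = System.currentTimeMillis();
        System.out.println("cast then divide: " + castThenDivide(millis));
        System.out.println("divide then cast: " + divideThenCast(millis));
    }
}
```

With current epoch values the first form yields a number unrelated to the wall clock, while the second yields the epoch time in seconds.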


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/#review211734
---


On Jan. 7, 2019, 7:16 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69664/
> ---
> 
> (Updated Jan. 7, 2019, 7:16 p.m.)
> 
> 
> Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.
> 
> 
> Bugs: HIVE-21077
> https://issues.apache.org/jira/browse/HIVE-21077
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21077 : Database and Catalogs should have creation time
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
>  3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
>  994797698a379e0b08604d73d2d6728a2fcee4df 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  2953a8f327eabdee42dbc66e0c65f89d17add59a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d43c0c1e70cffbebd39b05f89ec396227c58ac77 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
>  f3d2182a04ab81417a4ba58d9340721513e8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
>  e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
>  815b39c483b2233660310983d58194fb1ab2d107 
>   standalone-metastore/metastore-server/src/main/resources/package.jdo 
> caaec457194332a99d5cd57bef746e969dd38161 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  a3c4196dbff7e53be5317631b314983d16a99020 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  bcaebd18accf86846ae44a6498046514575fc069 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  edde08db9ef7ee01800c7cc3a04c813014abdd18 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  701acb00984c61f7511dcc48053890b154575d1f 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  b1980c5b83f16614845063516495188ebdd8c2a3 
>   
> standalone-metastore/metastore-server/sr

Re: Review Request 69585: HIVE-20776: Run HMS filterHooks on server-side in addition to client-side

2019-01-08 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69585/#review211768
---



Thanks for the patch. Do we really need to introduce
authorizeTableForPartitionMetadata in these API calls? For the common case, it
can potentially degrade API performance. For instance, to fetch a single
partition, we now do a get_table and then a get_partition. I think if it is
not related to the functionality of this patch, we should do it in a separate
patch with more investigation of its impact.


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 84 (patched)
<https://reviews.apache.org/r/69585/#comment297369>

redundant import?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 217 (patched)
<https://reviews.apache.org/r/69585/#comment297377>

If you want to initialize this member using init(), it shouldn't be static,
since it relies on the conf object, which is not static. Technically, there is
a race condition on this variable, since it is overwritten every time the
init() method is called with the instance-specific conf object.
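
A minimal sketch of the hazard described above, outside of Hive. The class names and the "filter.hook" key are hypothetical, and a plain Map stands in for the conf object; it only illustrates how a static field assigned from instance-specific configuration ends up holding whatever the last init() wrote.

```java
import java.util.Map;

public class StaticInitHazard {
    // Problem shape: a static field assigned from instance-specific conf.
    static class SharedHook {
        static String hookName;                 // one slot shared by all instances
        void init(Map<String, String> conf) {
            hookName = conf.get("filter.hook"); // the last init() call wins
        }
    }

    // Safer shape: the field belongs to the instance that read the conf.
    static class PerInstanceHook {
        private String hookName;
        void init(Map<String, String> conf) {
            hookName = conf.get("filter.hook");
        }
        String hookName() { return hookName; }
    }

    // Initializes two handlers with different confs and returns the value
    // the first handler observes afterwards.
    static String sharedSeenByFirst() {
        SharedHook first = new SharedHook();
        SharedHook second = new SharedHook();
        first.init(Map.of("filter.hook", "A"));
        second.init(Map.of("filter.hook", "B"));
        return SharedHook.hookName;
    }

    static String perInstanceSeenByFirst() {
        PerInstanceHook first = new PerInstanceHook();
        PerInstanceHook second = new PerInstanceHook();
        first.init(Map.of("filter.hook", "A"));
        second.init(Map.of("filter.hook", "B"));
        return first.hookName();
    }
}
```

In the shared variant the first handler sees the second handler's value; with concurrent init() calls the outcome also depends on thread timing.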



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 649 (patched)
<https://reviews.apache.org/r/69585/#comment297367>

nit: formatting. The curly brace convention we follow is:

    if () {
      blah;
    }

The easiest way to fix these errors is to import the code-style formatter XML
file dev-support/eclipse-styles.xml (works for IntelliJ too) and let the IDE
reformat the newly added code.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 663-676 (patched)
<https://reviews.apache.org/r/69585/#comment297366>

You can reuse an existing method which does this, with some minor renaming of
the method and variables. The implementation of
MetaStoreServerUtils.getMetaStoreListeners is generic enough to be used for any
class. We can probably just rename it to something more generic,
getInstancesFromClass for example.
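
A rough sketch of what such a generic helper could look like. This is not the Hive implementation: it is simplified to use no-argument constructors and a plain comma-separated string, and the class name InstanceLoader is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class InstanceLoader {
    // Instantiates every class named in a comma-separated list and returns
    // the instances typed as the requested base class or interface.
    static <T> List<T> getInstancesFromClass(Class<T> type, String implList) {
        List<T> instances = new ArrayList<>();
        if (implList == null || implList.trim().isEmpty()) {
            return instances;
        }
        for (String name : implList.split(",")) {
            try {
                Class<?> raw = Class.forName(name.trim());
                // type.cast fails fast if the class is not a T.
                instances.add(type.cast(raw.getDeclaredConstructor().newInstance()));
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot instantiate " + name.trim(), e);
            }
        }
        return instances;
    }
}
```

Because the element type is a parameter, the same helper can serve listeners, filter hooks, or any other pluggable class list.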



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 667-674 (patched)
<https://reviews.apache.org/r/69585/#comment297370>





standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 682 (patched)
<https://reviews.apache.org/r/69585/#comment297372>

Is there a better way to do this? This method introduces an additional
db call for all the methods, even in the common case of users having the
required permissions.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 3143-3145 (patched)
<https://reviews.apache.org/r/69585/#comment297371>

Shouldn't the FilterUtils.filterTables be used here for consistency?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 4610 (patched)
<https://reviews.apache.org/r/69585/#comment297373>

The original API fetches only one partition, so this method is not
improving performance but rather degrading it, since it would do a fetch table
and a fetch partition for the most common case. I think we should do this check
only when fetching lots of partitions, where the cost of doing one
get_table call is relatively low compared to fetching lots of partitions.
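
To make the cost argument concrete, a toy model that only counts backend round trips. Everything here is hypothetical, including the threshold value and the class names; it does not model authorization itself, only the extra call.

```java
public class PartitionFetchCost {
    // Hypothetical stand-in for the backing store; it only counts round trips.
    static class CountingStore {
        int calls = 0;
        void getTable() { calls++; }
        void getPartitions(int count) { calls++; }
    }

    // Assumed cut-off above which one extra get_table call is considered cheap.
    static final int PRECHECK_THRESHOLD = 100;

    // Returns the number of store calls needed to serve one request.
    static int callsFor(int partitionCount, boolean alwaysPrecheck) {
        CountingStore store = new CountingStore();
        if (alwaysPrecheck || partitionCount >= PRECHECK_THRESHOLD) {
            store.getTable();            // table-level check before fetching
        }
        store.getPartitions(partitionCount);
        return store.calls;
    }
}
```

Under this model a single-partition request doubles from one call to two when the table check is unconditional, while a large request pays the same one extra call either way.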



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 4655 (patched)
<https://reviews.apache.org/r/69585/#comment297374>

Same comment as above. Not sure if this method is helping much with
performance here.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 4683 (patched)
<https://reviews.apache.org/r/69585/#comment297375>

Same comment as above. Not sure if this method is helping much with
performance here.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 4711 (patched)
<https://reviews.apache.org/r/69585/#comment297376>

Move this method call below checkLimitNumberOfPartitionsByFilter.



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetastoreFilterHook.java
Lines 318 (patched)
<https://reviews.apache.org/r/69585/#comment297378>

Don't we need a boolean argument here too, to confirm that only the
server-side filter logic is tested?


- Vihang Karajgaonkar


On Jan. 8, 2019, 8:03 p.m., Na Li wrote:
> 
> ---
> This is an auto

Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-07 Thread Vihang Karajgaonkar via Review Board


> On Jan. 4, 2019, 5:33 p.m., Karthik Manamcheri wrote:
> >
> 
> Karthik Manamcheri wrote:
> 1. While we are at it, can we make sure that the "create time" exists for 
> everything else (if required)? Such as catalog, partitions..
> 2. Out of curiosity.. are there any tests to ensure that the upgrade path 
> works? What if we miss adding this to some sql file? Would we catch it?
> 
> Vihang Karajgaonkar wrote:
> createTime exists for tables and partitions. It doesn't exist for
> catalogs. I will add createTime to catalogs in this patch if it doesn't add
> too much to it; otherwise, I would prefer to do it in a separate patch. As
> for upgrade testing, the precommit job does not run such tests. Schema
> patches are expected to be tested either manually or using the ITests
> (steps are provided in DEV-README). I have tested this patch on mysql
> manually, and I am currently setting up my machine to try the newly
> introduced ITests. Will update the RB once I complete the testing.
> 
> Karthik Manamcheri wrote:
> Don't you think it's better to make the schema changes all together
> instead of splitting them up? If we think we might need creation time for
> catalogs, we should add it to this changeset because we can knock off the
> schema change as well.

It was not too hard to add createTime to catalog. Added that in the latest 
patch.


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/#review211677
-------


On Jan. 7, 2019, 7:16 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69664/
> ---
> 
> (Updated Jan. 7, 2019, 7:16 p.m.)
> 
> 
> Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.
> 
> 
> Bugs: HIVE-21077
> https://issues.apache.org/jira/browse/HIVE-21077
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21077 : Database and Catalogs should have creation time
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
>  3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
>  994797698a379e0b08604d73d2d6728a2fcee4df 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  2953a8f327eabdee42dbc66e0c65f89d17add59a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d43c0c1e70cffbebd39b05f89ec396227c58ac77 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
>  f3d2182a04ab81417a4ba58d9340721513e8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
>  e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
>  815b39c483b2233660310983d58194fb1ab2d107 
>   standalone-metastore/metastore-server/src/main/resources/package.jdo 
> caaec457194332a99d5cd57bef746e969dd38161 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  a3c4196dbff7e53be5317631b314983d16a99020 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  bcaebd18accf86846ae44a6498046514575fc069 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
>   
> standalone-metastore/metastore-server/src/mai

Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-07 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

(Updated Jan. 7, 2019, 7:16 p.m.)


Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Changes
---

Added create time to catalogs as well, as suggested.


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description
---

HIVE-21077 : Database and Catalogs should have creation time


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
 3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 2953a8f327eabdee42dbc66e0c65f89d17add59a 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
 e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
 28eb1fadca80dfd3c962e4163120b83f00410c4a 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 


Diff: https://reviews.apache.org/r/69664/diff/3/

Changes: https://reviews.apache.org/r/69664/diff/2-3/


Testing
---

Ran ITests to check that the db install and upgrade scripts work for mysql,
postgres, oracle and derby databases. The ITests for mssql are timing out,
apparently due to container provisioning issues.


Thanks,

Vihang Karajgaonkar



Re: Review Request 69664: HIVE-21077 : Database and Catalogs should have creation time

2019-01-04 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

(Updated Jan. 5, 2019, 2:54 a.m.)


Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Changes
---

Added createTime in catalog as suggested.


Summary (updated)
-

HIVE-21077 : Database and Catalogs should have creation time


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description (updated)
---

HIVE-21077 : Database and Catalogs should have creation time


Diffs (updated)
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Catalog.java
 3eb4dbd51110dd6e5d04c3bdacde2e5bdba09a7c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 2953a8f327eabdee42dbc66e0c65f89d17add59a 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MCatalog.java
 e82cb4322f6e2ac7afeb5efcec7517a68c8b2dee 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestCatalogs.java
 28eb1fadca80dfd3c962e4163120b83f00410c4a 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 


Diff: https://reviews.apache.org/r/69664/diff/2/

Changes: https://reviews.apache.org/r/69664/diff/1-2/


Testing (updated)
---

Ran ITests to check that the db install and upgrade scripts work for mysql,
postgres, oracle and derby databases. The ITests for mssql are timing out,
apparently due to container provisioning issues.


Thanks,

Vihang Karajgaonkar



[jira] [Created] (HIVE-21089) Automate dbinstall tests

2019-01-04 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21089:
--

 Summary: Automate dbinstall tests
 Key: HIVE-21089
 URL: https://issues.apache.org/jira/browse/HIVE-21089
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar


When a patch makes a schema change, precommit should run the dbinstall tests to 
make sure that the db scripts are working on all the supported databases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21088) Improve usability of dbinstall tests

2019-01-04 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21088:
--

 Summary: Improve usability of dbinstall tests
 Key: HIVE-21088
 URL: https://issues.apache.org/jira/browse/HIVE-21088
 Project: Hive
  Issue Type: Sub-task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


There are nice integration tests which can be run manually for testing database
schema changes. These tests spin up docker containers and install and upgrade
the schema. Currently, these tests expect the host to provide native support
for the docker daemon, which is true in most cases. However, if you are using
an older version of macOS (I tried it using 10.11), the docker application
cannot be installed and we need to install docker-toolbox instead. The issue
with using docker-toolbox is that the docker daemon runs in a VM on the host
which has a different IP address, and hence the hardcoded {{localhost}} in the
jdbc urls doesn't work. We can add a simple flag to provide the docker-machine
ip as a command-line argument to override using localhost in the url.
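
One possible shape for such an override, as a sketch only. The system property name dbinstall.docker.host, the class name, and the port and database values are all made up for illustration and are not taken from the existing tests.

```java
public class DockerJdbcUrl {
    // Builds a MySQL JDBC url, falling back to localhost when no
    // docker-machine address has been supplied.
    static String mysqlUrl(String host, int port, String database) {
        String effectiveHost = (host == null || host.isEmpty()) ? "localhost" : host;
        return "jdbc:mysql://" + effectiveHost + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Example: java -Ddbinstall.docker.host=192.168.99.100 DockerJdbcUrl
        String host = System.getProperty("dbinstall.docker.host");
        System.out.println(mysqlUrl(host, 3306, "hivedb"));
    }
}
```

Leaving the property unset keeps the current behaviour, so hosts with a native docker daemon would not need any change.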



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 69664: HIVE-21077 : Database should have creation time

2019-01-04 Thread Vihang Karajgaonkar via Review Board


> On Jan. 4, 2019, 5:33 p.m., Karthik Manamcheri wrote:
> >
> 
> Karthik Manamcheri wrote:
> 1. While we are at it, can we make sure that the "create time" exists for 
> everything else (if required)? Such as catalog, partitions..
> 2. Out of curiosity.. are there any tests to ensure that the upgrade path 
> works? What if we miss adding this to some sql file? Would we catch it?

createTime exists for tables and partitions. It doesn't exist for catalogs. I
will add createTime to catalogs in this patch if it doesn't add too much to
it; otherwise, I would prefer to do it in a separate patch. As for upgrade
testing, the precommit job does not run such tests. Schema patches are
expected to be tested either manually or using the ITests (steps are provided
in DEV-README). I have tested this patch on mysql manually, and I am currently
setting up my machine to try the newly introduced ITests. Will update the
RB once I complete the testing.


> On Jan. 4, 2019, 5:33 p.m., Karthik Manamcheri wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 785 (patched)
> > <https://reviews.apache.org/r/69664/diff/1/?file=2117445#file2117445line785>
> >
> > Why not just store it in milliseconds?

This is to be consistent with the createTime stored in tables and partitions.
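
For illustration, a sketch of the conversion this implies, assuming createTime 
is held as an int of seconds since the epoch, as it is for tables and 
partitions. The helper name is hypothetical:

```java
final class CreateTimeSketch {
  // Converts a millisecond timestamp to whole seconds since the epoch.
  static int toCreateTimeSeconds(long epochMillis) {
    return (int) (epochMillis / 1000L);
  }

  public static void main(String[] args) {
    System.out.println(toCreateTimeSeconds(System.currentTimeMillis()));
  }
}
```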


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/#review211677
-------


On Jan. 3, 2019, 11:44 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69664/
> ---
> 
> (Updated Jan. 3, 2019, 11:44 p.m.)
> 
> 
> Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.
> 
> 
> Bugs: HIVE-21077
> https://issues.apache.org/jira/browse/HIVE-21077
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21077 : Database should have creation time
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
>  994797698a379e0b08604d73d2d6728a2fcee4df 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  2953a8f327eabdee42dbc66e0c65f89d17add59a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e79404a15894aa42f451df5d18ed3e4c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d43c0c1e70cffbebd39b05f89ec396227c58ac77 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
>  f3d2182a04ab81417a4ba58d9340721513e8 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
>  815b39c483b2233660310983d58194fb1ab2d107 
>   standalone-metastore/metastore-server/src/main/resources/package.jdo 
> caaec457194332a99d5cd57bef746e969dd38161 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  a3c4196dbff7e53be5317631b314983d16a99020 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  bcaebd18accf86846ae44a6498046514575fc069 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  edde08db9ef7ee01800c7cc3a04c813014abdd18 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  701acb00984c61f7511dcc48053890b1545

Review Request 69664: HIVE-21077 : Database should have creation time

2019-01-03 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69664/
---

Review request for hive, Karthik Manamcheri, Naveen Gangam, and Peter Vary.


Bugs: HIVE-21077
https://issues.apache.org/jira/browse/HIVE-21077


Repository: hive-git


Description
---

HIVE-21077 : Database should have creation time


Diffs
-

  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java
 994797698a379e0b08604d73d2d6728a2fcee4df 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 2953a8f327eabdee42dbc66e0c65f89d17add59a 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 f8b862862de4dde8dce3d0dc5f70eafb67b02d2c 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 dfc5d7b294c1df8d5c6b0e7d676a77ba1181e076 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
7d09a5c296a8ae924d61b200b4cb9135440fd9a0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 a9398ae1e79404a15894aa42f451df5d18ed3e4c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 58dc6eefcb840d4dd70af7a47811fab1b5e696d9 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d43c0c1e70cffbebd39b05f89ec396227c58ac77 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/client/builder/DatabaseBuilder.java
 f3d2182a04ab81417a4ba58d9340721513e8 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MDatabase.java
 815b39c483b2233660310983d58194fb1ab2d107 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
caaec457194332a99d5cd57bef746e969dd38161 
  
standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
 a3c4196dbff7e53be5317631b314983d16a99020 
  
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 bcaebd18accf86846ae44a6498046514575fc069 
  
standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
 5ea1b4450d8258e841bb4af7381ca6fb0ba1a827 
  
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
 edde08db9ef7ee01800c7cc3a04c813014abdd18 
  
standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
 a59c7d7e933d25d8d5af611e5b6aa0c0c19b 
  
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
 701acb00984c61f7511dcc48053890b154575d1f 
  
standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
 b1980c5b83f16614845063516495188ebdd8c2a3 
  
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
 b9f63313251ab1fa6278b862ed9e07e62b234c04 
  
standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
 9040005aa82b7a8cc5c01f257ecd47a7cc97e9b2 
  
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
 0c36069d071d4b60cc338ba729da5d22e08ca8ca 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 d323ac6c90ed20f092b4e179fdb1bed8602ecf63 


Diff: https://reviews.apache.org/r/69664/diff/1/


Testing
---


Thanks,

Vihang Karajgaonkar



[jira] [Created] (HIVE-21077) Database should have creation time

2018-12-28 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21077:
--

 Summary: Database should have creation time
 Key: HIVE-21077
 URL: https://issues.apache.org/jira/browse/HIVE-21077
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Currently, databases do not have a creation time like we have for tables and 
partitions.

{noformat}
// namespace for tables
struct Database {
  1: string name,
  2: string description,
  3: string locationUri,
  4: map<string, string> parameters, // properties associated with the database
  5: optional PrincipalPrivilegeSet privileges,
  6: optional string ownerName,
  7: optional PrincipalType ownerType,
  8: optional string catalogName
}
{noformat}

Currently, without creationTime there is no way to identify whether the copy of 
Database which a client has is the same as the one on the server if the dbName 
is the same. Without object ids, creationTimeStamp is currently the only way to 
uniquely identify an instance of a metastore object. It would be good to have 
Database creation time as well.
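
A sketch of the client-side check this would enable, assuming a hypothetical 
DbSnapshot holder for a database name and its creation time in seconds (these 
are illustrative names, not metastore classes):

```java
import java.util.Objects;

// Hypothetical client-side view of a database: name plus creation time.
final class DbSnapshot {
  final String name;
  final int createTime;

  DbSnapshot(String name, int createTime) {
    this.name = Objects.requireNonNull(name);
    this.createTime = createTime;
  }

  // Two snapshots refer to the same database instance only if both the name
  // and the creation time match; a database that was dropped and re-created
  // keeps its name but gets a new creation time.
  boolean isSameInstance(DbSnapshot other) {
    return name.equals(other.name) && createTime == other.createTime;
  }
}
```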





Re: Review Request 69534: HIVE-20992: Split the property "hive.metastore.dbaccess.ssl.properties" into more coherent and user-friendly properties.

2018-12-17 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69534/#review211367
---


Fix it, then Ship it!




Thanks for the changes. Couple of minor comments. Rest looks good to me.


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 133-135 (patched)
<https://reviews.apache.org/r/69534/#comment296374>

Nit, constants are recommended to be in upper-case.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 379 (patched)
<https://reviews.apache.org/r/69534/#comment296372>

You are catching IOException and changing it to IllegalArgumentException, 
which may not be the right thing to do every time. I would suggest changing 
this to RuntimeException since setConf() doesn't throw checked exceptions. 
Also, it would be good to have e passed as the cause, so I would suggest using 
the throw new RuntimeException("Failed to set ..", e); constructor instead.
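
A minimal sketch of that suggestion, with a hypothetical loadTrustStore helper 
standing in for the code that can throw IOException and a placeholder message:

```java
import java.io.IOException;

final class SetConfSketch {
  // Hypothetical stand-in for the step that reads the SSL settings.
  static void loadTrustStore(String path) throws IOException {
    if (path == null || path.isEmpty()) {
      throw new IOException("trust store path is empty");
    }
  }

  // Mirrors a setConf()-style method that declares no checked exceptions.
  static void setConf(String trustStorePath) {
    try {
      loadTrustStore(trustStorePath);
    } catch (IOException e) {
      // Pass e as the cause so the original stack trace is preserved.
      throw new RuntimeException("Failed to configure SSL", e);
    }
  }
}
```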



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Lines 1082 (patched)
<https://reviews.apache.org/r/69534/#comment296373>

If you decide to change the thrown exception to RuntimeException, this 
might need a change too.


- Vihang Karajgaonkar


On Dec. 16, 2018, 3:11 a.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69534/
> ---
> 
> (Updated Dec. 16, 2018, 3:11 a.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Peter Vary, and 
> Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-20992
> https://issues.apache.org/jira/browse/HIVE-20992
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following new properties were added:
> 
> 1. metastore.dbaccess.use.SSL (hive.metastore.dbaccess.use.SSL)
> 2. javax.net.ssl.trustStore
> 3. javax.net.ssl.trustStorePassword
> 4. javax.net.ssl.trustStoreType
> 
> This was in an effort to guide the user towards an easier SSL
> configuration experience. This is the minimum requirement to set up SSL
> encryption to the HMS backend store.
> 
> This also solves the issue of the truststore password being stored in
> plain text. It can now be encrypted by default and loaded through the
> MetastoreConf.getPassword() method which handles secure password access
> 
> The property "hive.metastore.dbaccess.ssl.properties" is now
> deprecated, but it will still be kept for backwards-compatibility purposes.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  e25a8cf9a19d78c0cc00bb2e5e0abee4d851ad98 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  3fa21b768cd120cd89343c9ccd142d5e2ccdef2e 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  0cf113c927f2274d085e07cd72921fb35227e1f3 
> 
> 
> Diff: https://reviews.apache.org/r/69534/diff/5/
> 
> 
> Testing
> ---
> 
> Tests:
> 1. Unit tests were added to cover the functionality of configuring the Java 
> system properties.
> 2. Performed some manual and sanity tests to ensure that SSL was still 
> configurable to a remote DB. I performed these on MySQL, PostgreSQL, Oracle, 
> and Derby DB by creating generic DB hosts and setting them up with SSL. Once 
> SSL was set up, I triggered the metastore to perform database calls, and 
> captured packets using tcpdump. I then uploaded my packet captures to 
> Wireshark, and ensured that none of the data was human-readable.
> 
> I plan to upload a document to our Wiki explaining the process of enabling 
> TLS to these databases.
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



Re: Review Request 69534: HIVE-20992: Split the property "hive.metastore.dbaccess.ssl.properties" into more coherent and user-friendly properties.

2018-12-14 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69534/#review211343
---




standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 462 (patched)
<https://reviews.apache.org/r/69534/#comment296327>

s/System/Java system



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 328 (original), 342 (patched)
<https://reviews.apache.org/r/69534/#comment296334>

This patch may introduce a performance regression in the setConf method, which 
is called for every new connection. MetastoreConf.getPassword is expensive 
since it needs to decrypt the truststore password. See HIVE-20740 which has 
more details. In most common cases, these configuration properties almost never 
change once they are set, but they are being read again and again at every new 
connection initialization. I think we can improve this by caching the db 
and truststore passwords and reading them once when HMS starts. But I guess this 
could be separate from this JIRA.
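
One way the caching idea could look, as a sketch only: a hypothetical 
CachedSecret wrapper that runs an expensive loader once and reuses the result. 
In HMS the loader would be the password lookup; here it is a plain Supplier.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical wrapper: computes a value at most once, then serves it from memory.
final class CachedSecret {
  private final Supplier<char[]> loader;
  private volatile char[] value;

  CachedSecret(Supplier<char[]> loader) {
    this.loader = loader;
  }

  char[] get() {
    char[] v = value;
    if (v == null) {
      synchronized (this) {
        v = value;
        if (v == null) {
          v = loader.get();   // expensive step, e.g. decrypting a password
          value = v;
        }
      }
    }
    return v;
  }

  public static void main(String[] args) {
    AtomicInteger loads = new AtomicInteger();
    CachedSecret secret = new CachedSecret(() -> {
      loads.incrementAndGet();
      return "changeit".toCharArray();
    });
    secret.get();
    secret.get();
    System.out.println("loader calls: " + loads.get());
  }
}
```

The trade-off is that a changed password would only be picked up after a 
restart, which seems acceptable given how rarely these values change.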



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 343 (patched)
<https://reviews.apache.org/r/69534/#comment296336>

Maybe rename this to configureSSLDeprecated



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 346 (patched)
<https://reviews.apache.org/r/69534/#comment296328>

This is a redundant log. If ssl is configured we have other logs below to 
tell us that. Note that logs printed in this method are going to be very 
frequent for each new HMS connection.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 354 (patched)
<https://reviews.apache.org/r/69534/#comment296329>

The exception message suggests that it disables SSL and continues, but 
actually the exception is uncaught and it will terminate the connection 
request (and in fact prevent HMS from coming up). I think you can remove 
"Disabling SSL and continuing." if that's not needed.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 356 (patched)
<https://reviews.apache.org/r/69534/#comment296330>

This comment is unclear. The getPassword method will get the encrypted 
password if configured right?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 367-369 (patched)
<https://reviews.apache.org/r/69534/#comment296331>

Can you define constants for javax.net.ssl* property keys at the class 
level?
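
Something along these lines, where the class and constant names are 
placeholders and the key strings are the javax.net.ssl property names listed 
in the description:

```java
// Placeholder holder class; in the patch these would live at the class level.
final class SslPropertyKeys {
  static final String TRUSTSTORE_PATH_KEY = "javax.net.ssl.trustStore";
  static final String TRUSTSTORE_PASSWORD_KEY = "javax.net.ssl.trustStorePassword";
  static final String TRUSTSTORE_TYPE_KEY = "javax.net.ssl.trustStoreType";

  private SslPropertyKeys() {
    // constants only, no instances
  }
}
```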



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 371 (patched)
<https://reviews.apache.org/r/69534/#comment296332>

Redundant log, please remove.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 373 (patched)
<https://reviews.apache.org/r/69534/#comment296333>

Don't see the code where we disable and continue. Am I missing something?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 332 (original), 398 (patched)
<https://reviews.apache.org/r/69534/#comment296335>

I would suggest changing this log to say "Configuring SSL using a deprecated 
key " + ConfVars.DBACCESS_SSL_PROPS.toString() + ". This may be removed in the 
future. See HIVE-20992 for more details."



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
Lines 1069 (patched)
<https://reviews.apache.org/r/69534/#comment296337>

suggest rename to testDeprecatedConfigIsOverriden()


- Vihang Karajgaonkar


On Dec. 14, 2018, 1:26 a.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69534/
> ---
> 
> (Updated Dec. 14, 2018, 1:26 a.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Peter Vary, and 
> Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-20992
> https://issues.apache.org/jira/browse/HIVE-20992
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following new properties were added:
> 
> 1. metastore.dbaccess.use.SSL (hive.metastore.dbaccess.use.SSL)
> 2. javax.net.ssl.trustStore
> 3. javax.net.ssl.trustStorePassword
> 4. javax.net.ssl.trustStoreType
> 
> This was in an effort to guide the user towards an easier SSL
> configuration expe

[jira] [Created] (HIVE-21040) msck does unnecessary file listing at last level of partitions

2018-12-13 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21040:
--

 Summary: msck does unnecessary file listing at last level of 
partitions
 Key: HIVE-21040
 URL: https://issues.apache.org/jira/browse/HIVE-21040
 Project: Hive
  Issue Type: Improvement
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Here is the code snippet which is run by {{msck}} to list directories

{noformat}
final Path currentPath = pd.p;
final int currentDepth = pd.depth;
FileStatus[] fileStatuses = fs.listStatus(currentPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
// found no files under a sub-directory under table base path; it is possible that the table
// is empty and hence there are no partition sub-directories created under base path
if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < partColNames.size()) {
  // since maxDepth is not yet reached, we are missing partition
  // columns in currentPath
  logOrThrowExceptionWithMsg(
      "MSCK is missing partition columns under " + currentPath.toString());
} else {
  // found files under currentPath add them to the queue if it is a directory
  for (FileStatus fileStatus : fileStatuses) {
    if (!fileStatus.isDirectory() && currentDepth < partColNames.size()) {
      // found a file at depth which is less than number of partition keys
      logOrThrowExceptionWithMsg(
          "MSCK finds a file rather than a directory when it searches for "
              + fileStatus.getPath().toString());
    } else if (fileStatus.isDirectory() && currentDepth < partColNames.size()) {
      // found a sub-directory at a depth less than number of partition keys
      // validate if the partition directory name matches with the corresponding
      // partition colName at currentDepth
      Path nextPath = fileStatus.getPath();
      String[] parts = nextPath.getName().split("=");
      if (parts.length != 2) {
        logOrThrowExceptionWithMsg("Invalid partition name " + nextPath);
      } else if (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) {
        logOrThrowExceptionWithMsg(
            "Unexpected partition key " + parts[0] + " found at " + nextPath);
      } else {
        // add sub-directory to the work queue if maxDepth is not yet reached
        pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1));
      }
    }
  }
  if (currentDepth == partColNames.size()) {
    return currentPath;
  }
}
{noformat}

You can see that when {{currentDepth}} is at {{maxDepth}} it still does an 
unnecessary listing of the files. We can improve this call by checking the 
currentDepth and bailing out early.

This can improve the performance of the msck command significantly, especially 
when there are a lot of files in each partition on remote filesystems like S3 
or ADLS.
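
A simplified sketch of the early bail-out, not the actual msck code: the 
directory tree is an in-memory map, DepthWalker is a hypothetical name, and the 
file-versus-directory validation is left out. The point is only that a path 
already at the partition depth is returned before any listing call is made.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

final class DepthWalker {
  int listCalls = 0;                       // counts simulated listStatus calls
  private final Map<String, List<String>> tree;

  DepthWalker(Map<String, List<String>> tree) {
    this.tree = tree;
  }

  // Stand-in for fs.listStatus(); each call is what we want to avoid on S3/ADLS.
  private List<String> list(String path) {
    listCalls++;
    return tree.getOrDefault(path, Collections.emptyList());
  }

  // Collects every path found at exactly maxDepth below root.
  List<String> findPartitions(String root, int maxDepth) {
    List<String> result = new ArrayList<>();
    Deque<String> paths = new ArrayDeque<>();
    Deque<Integer> depths = new ArrayDeque<>();
    paths.add(root);
    depths.add(0);
    while (!paths.isEmpty()) {
      String current = paths.poll();
      int depth = depths.poll();
      if (depth == maxDepth) {
        result.add(current);               // bail out early: no listing needed
        continue;
      }
      for (String child : list(current)) {
        paths.add(child);
        depths.add(depth + 1);
      }
    }
    return result;
  }
}
```

For a table with one partition column, this lists only the table directory and 
never the contents of the partition directories themselves.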





Re: Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-11 Thread Vihang Karajgaonkar via Review Board


> On Dec. 11, 2018, 4:30 p.m., Vihang Karajgaonkar wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
> > Lines 217 (patched)
> > <https://reviews.apache.org/r/69550/diff/1/?file=2113035#file2113035line218>
> >
> > Do we need to redact the Spark config as well?
> 
> Denys Kuzmenko wrote:
> no, it's already handled in Spark:
> 
> https://github.com/apache/spark/commit/66636ef0b046e5d1f340c3b8153d7213fa9d19c7

Okay. Thanks for checking.


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/#review211199
---


On Dec. 11, 2018, 12:43 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69550/
> ---
> 
> (Updated Dec. 11, 2018, 12:43 p.m.)
> 
> 
> Review request for hive, Peter Vary and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-21030
> https://issues.apache.org/jira/browse/HIVE-21030
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Credstore decryption password should be redacted in JobConf
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 
> 2ad5f9ee39f376d3466994a24cc9f7850be902ae 
> 
> 
> Diff: https://reviews.apache.org/r/69550/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-11 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/#review211205
---


Ship it!




Ship It!

- Vihang Karajgaonkar





Re: Review Request 69550: Add credential store env properties redaction in JobConf

2018-12-11 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69550/#review211199
---




common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
Lines 217 (patched)
<https://reviews.apache.org/r/69550/#comment296129>

Do we need to redact the Spark config as well?


- Vihang Karajgaonkar





Re: [ANNOUNCE] New committer: Bharathkrishna Guruvayoor Murali

2018-12-02 Thread Vihang Karajgaonkar
Congratulations Bharath!

On Sun, Dec 2, 2018 at 9:33 AM Sahil Takiar  wrote:

> Congrats Bharath!
>
> On Sun, Dec 2, 2018 at 11:14 AM Andrew Sherman
>  wrote:
>
> > Congratulations Bharath!
> >
> > On Sat, Dec 1, 2018 at 10:26 AM Ashutosh Chauhan 
> > wrote:
> >
> > > > Apache Hive's Project Management Committee (PMC) has invited
> > > > Bharathkrishna Guruvayoor Murali to become a committer, and we are
> > > > pleased to announce that he has accepted.
> > > >
> > > > Bharath, welcome, thank you for your contributions, and we look forward
> > > > to your further interactions with the community!
> > >
> > > Ashutosh Chauhan (on behalf of the Apache Hive PMC)
> > >
> >
>
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309
>


Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-11-27 Thread Vihang Karajgaonkar via Review Board


> On Nov. 27, 2018, 10:55 p.m., Naveen Gangam wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java
> > Lines 162 (patched)
> > <https://reviews.apache.org/r/69054/diff/6/?file=2110525#file2110525line162>
> >
> > Should we acquire the writeLock before releasing the readLock to 
> > prevent another thread from entering this code? This way we don't have to 
> > re-check these as well:
> >   if (prop == null || pmf == null || !propsFromConf.equals(prop))

AFAIK we cannot upgrade from a readLock to a writeLock when using 
ReentrantReadWriteLock; only downgrading is possible (a thread which holds the 
write lock can acquire the read lock): 
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html
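
For reference, the usual pattern when a read-to-write upgrade is not available: 
release the read lock, take the write lock, re-check the condition, then 
downgrade. This is a generic sketch with hypothetical names, not the 
PersistenceManagerProvider code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LazyHolder {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private String value;          // guarded by lock
  int initCount = 0;             // guarded by lock; only here for demonstration

  String get() {
    lock.readLock().lock();
    try {
      if (value == null) {
        // Cannot upgrade: release the read lock before taking the write lock.
        lock.readLock().unlock();
        lock.writeLock().lock();
        try {
          // Re-check, since another thread may have initialized in between.
          if (value == null) {
            value = "initialized";
            initCount++;
          }
          // Downgrade: acquire the read lock while still holding the write lock.
          lock.readLock().lock();
        } finally {
          lock.writeLock().unlock();
        }
      }
      return value;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

The re-check under the write lock is what makes releasing the read lock first 
safe, which is why the same condition appears twice.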


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/#review210911
-------


On Nov. 27, 2018, 7:18 a.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69054/
> ---
> 
> (Updated Nov. 27, 2018, 7:18 a.m.)
> 
> 
> Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.
> 
> 
> Bugs: HIVE-20740
> https://issues.apache.org/jira/browse/HIVE-20740
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20740 : Remove global lock in ObjectStore.setConf method
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
>  5a88550f0625a7ec1890df7f54e7fa579f58fff4 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 5cb0a887e672f49739f5b648e608fba66de06326 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
> 455ffc3887e62fa503cc3fa28255702ea9da3cc0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  570281b54fa236d5bb568b4ded9b166ef367f613 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
>  af9efd98ea210335c6ac1d3da8624e02aadc2706 
> 
> 
> Diff: https://reviews.apache.org/r/69054/diff/6/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>



Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-11-26 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/
---

(Updated Nov. 27, 2018, 7:18 a.m.)


Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.


Changes
---

Rebased to the latest code on master.


Bugs: HIVE-20740
https://issues.apache.org/jira/browse/HIVE-20740


Repository: hive-git


Description
---

HIVE-20740 : Remove global lock in ObjectStore.setConf method


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
 5a88550f0625a7ec1890df7f54e7fa579f58fff4 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
5cb0a887e672f49739f5b648e608fba66de06326 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 
455ffc3887e62fa503cc3fa28255702ea9da3cc0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 570281b54fa236d5bb568b4ded9b166ef367f613 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java
 PRE-CREATION 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 af9efd98ea210335c6ac1d3da8624e02aadc2706 


Diff: https://reviews.apache.org/r/69054/diff/6/

Changes: https://reviews.apache.org/r/69054/diff/5-6/


Testing
---


Thanks,

Vihang Karajgaonkar



[jira] [Created] (HIVE-20972) Enable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-26 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-20972:
--

 Summary: Enable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
 Key: HIVE-20972
 URL: https://issues.apache.org/jira/browse/HIVE-20972
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar








[jira] [Created] (HIVE-20916) Fix typo in JSONCreateDatabaseMessage

2018-11-14 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-20916:
--

 Summary: Fix typo in JSONCreateDatabaseMessage
 Key: HIVE-20916
 URL: https://issues.apache.org/jira/browse/HIVE-20916
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Vihang Karajgaonkar


{code}
  public JSONCreateDatabaseMessage(String server, String servicePrincipal, Database db,
      Long timestamp) {
    this.server = server;
    this.servicePrincipal = servicePrincipal;
    this.db = db.getName();
    this.timestamp = timestamp;
    try {
      this.dbJson = MessageBuilder.createDatabaseObjJson(db);
    } catch (TException ex) {
      throw new IllegalArgumentException("Could not serialize Function object", ex);
    }
    checkValid();
  }
{code}

The exception message should say Database instead of Function. Also, 
{{TestDbNotificationListener#createDatabase}} should be modified to make sure 
that the database object deserialized from the dbJson field matches the 
original database object.





[jira] [Created] (HIVE-20860) Fix or disable TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]

2018-11-02 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-20860:
--

 Summary: Fix or disable 
TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit]
 Key: HIVE-20860
 URL: https://issues.apache.org/jira/browse/HIVE-20860
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar


The test failed in one of the precommit jobs. Looks like there is some case 
where there is an additional space in the diff.

{noformat}
Error Message
Client Execution succeeded but contained differences (error code = 1) after 
executing cbo_limit.q 
11c11
<  1  4 2
---
>  1 4 2
{noformat}





Re: Review Request 69054: HIVE-20740 : Remove global lock in ObjectStore.setConf method

2018-11-01 Thread Vihang Karajgaonkar via Review Board


> On Oct. 18, 2018, 2:33 p.m., Karthik Manamcheri wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 244 (original), 341 (patched)
> > <https://reviews.apache.org/r/69054/diff/2/?file=2098740#file2098740line347>
> >
> > You are ignoring the return value? Should you have 
> > pmf=getUpdatedPmfIfNeeded(..)?
> 
> Vihang Karajgaonkar wrote:
> The pmf is updated by the method if needed, so we don't need to use the 
> return value. Will rename the method to updatePmfIfNeeded to make it more 
> readable.
> 
> Karthik Manamcheri wrote:
> And you can have it return void.

This comment does not apply to the latest version of the patch. Dropping it.


> On Oct. 18, 2018, 2:33 p.m., Karthik Manamcheri wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Line 287 (original), 374 (patched)
> > <https://reviews.apache.org/r/69054/diff/2/?file=2098740#file2098740line390>
> >
> > What is dss.log?

Actually, I don't know what it is. Looking at the git history, this line has 
been there forever. Looks like it's about time to update it :)


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69054/#review209742
---


On Nov. 1, 2018, 8:48 p.m., Vihang Karajgaonkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69054/
> ---
> 
> (Updated Nov. 1, 2018, 8:48 p.m.)
> 
> 
> Review request for hive, Andrew Sherman, Alan Gates, and Peter Vary.
> 
> 
> Bugs: HIVE-20740
> https://issues.apache.org/jira/browse/HIVE-20740
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20740 : Remove global lock in ObjectStore.setConf method
> 
> 
> Diffs
> -
> 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java 75cd68a9d6be6c5804e458d19a0023f0d7f5beae 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 5cb0a887e672f49739f5b648e608fba66de06326 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java ee7c940d2b7fbd66af2d006da0585c6b42b9b0bb 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 03e3a2d2573b54651833867b906821650f4fb9c1 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java PRE-CREATION 
>   standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java b74c3048fa2e18adc7f0d7cc813a180d4466fa36 
> 
> 
> Diff: https://reviews.apache.org/r/69054/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


