[ANNOUNCE] New PMC Member: Stamatis Zampetakis

2023-01-13 Thread Naveen Gangam
Hello Hive Community,
The Apache Hive PMC is pleased to announce that Stamatis Zampetakis has
accepted the PMC's invitation and is now our newest PMC member. Please join
me in congratulating Stamatis!

He has been an active member of the Hive community across many aspects of
the project. Many thanks to Stamatis for all the contributions he has made;
we look forward to many more contributions in his expanded role.

Cheers,
Naveen (on behalf of Hive PMC)


[jira] [Created] (HIVE-26939) Hive LLAP Application Master fails to come up with Hadoop 3.3.4

2023-01-12 Thread Aman Raj (Jira)
Aman Raj created HIVE-26939:
---

 Summary: Hive LLAP Application Master fails to come up with Hadoop 
3.3.4
 Key: HIVE-26939
 URL: https://issues.apache.org/jira/browse/HIVE-26939
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Aman Raj
Assignee: Aman Raj


When the current OSS master branch of Hive tries to bring up the LLAP
Application Master, it fails with the following error:
{code:java}
Executing the launch command
INFO client.ServiceClient: Loading service definition from local FS: /var/lib/ambari-agent/tmp/llap-yarn-service_2023-01-10_07-56-46/Yarnfile
ERROR utils.JsonSerDeser: Exception while parsing json input stream
com.fasterxml.jackson.databind.exc.InvalidFormatException: Cannot deserialize value of type `org.apache.hadoop.yarn.service.api.records.PlacementScope` from String "NODE": not one of the values accepted for Enum class: [node, rack]
 at [Source: (org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream); line: 31, column: 22] (through reference chain: org.apache.hadoop.yarn.service.api.records.Service["components"]->java.util.ArrayList[0]->org.apache.hadoop.yarn.service.api.records.Component["placement_policy"]->org.apache.hadoop.yarn.service.api.records.PlacementPolicy["constraints"]->java.util.ArrayList[0]->org.apache.hadoop.yarn.service.api.records.PlacementConstraint["scope"])
	at com.fasterxml.jackson.databind.exc.InvalidFormatException.from(InvalidFormatException.java:67) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.DeserializationContext.weirdStringException(DeserializationContext.java:1851) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.DeserializationContext.handleWeirdStringValue(DeserializationContext.java:1079) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.EnumDeserializer._deserializeAltString(EnumDeserializer.java:339) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.EnumDeserializer._fromString(EnumDeserializer.java:214) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.EnumDeserializer.deserialize(EnumDeserializer.java:188) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:355) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:355) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:324) ~[jackson-databind-2.12.7.jar:2.12.7]
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
{code}
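The failure is a case mismatch: the Yarnfile carries the upper-case string "NODE", while the accepted enum values are lower case ([node, rack]). The same behavior can be reproduced with a plain Java enum, which is case-sensitive in valueOf. A minimal sketch (the Scope enum and parseScope helper are illustrative stand-ins, not Hadoop code; parseScope mirrors what a case-insensitive deserializer would effectively do):

```java
import java.util.Locale;

public class EnumCaseDemo {
    // Stand-in for org.apache.hadoop.yarn.service.api.records.PlacementScope,
    // whose accepted values are lower case: [node, rack].
    enum Scope { node, rack }

    // Hypothetical helper: normalize case before resolving the constant,
    // which is what a case-insensitive enum deserializer would do.
    static Scope parseScope(String raw) {
        return Scope.valueOf(raw.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        try {
            // Mirrors the failure above: Enum.valueOf is case-sensitive,
            // so the raw "NODE" from the Yarnfile is rejected.
            Scope.valueOf("NODE");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: NODE");
        }
        // With case normalization, the same input resolves cleanly.
        System.out.println(parseScope("NODE")); // prints: node
    }
}
```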

Re: Proposal: Revamp Apache Hive website.

2023-01-12 Thread Chris Nauroth
That looks great! Thank you so much for your efforts, Simhadri!

Chris Nauroth


On Thu, Jan 12, 2023 at 2:52 AM Simhadri G  wrote:

> Hello Everyone,
>
> Happy new year!
>
> I am happy to announce that the new Apache Hive website[1] is finally up
> and running.
> It can be accessed here: https://hive.apache.org/
>
> I would like to especially thank Stamatis, Ayush, and Sai Heamanth for
> reviewing the PR. Without their help, the new website would not have
> reached completion.
> I would also like to thank Owen O'Malley, Daniel Gruno, Alessandro
> Solimando and Pau Tallada for the help and feedback received during the
> process.
>
> Thank you,
> Simhadri G
>
> [1] https://hive.apache.org/
> [2] HIVE-26565: https://issues.apache.org/jira/browse/HIVE-26565
> [3] INFRA-24077: https://issues.apache.org/jira/browse/INFRA-24077
>
> On Mon, Jan 9, 2023 at 4:56 PM Stamatis Zampetakis 
> wrote:
>
>> Hi everyone,
>>
>> Simhadri has been working hard to modernize the Hive website (HIVE-26565)
>> for the past few months and I am quite happy with the results.
>>
>> I reviewed the respective PR [1] and will commit the changes in 24h
>> unless there are objections.
>>
>> Best,
>> Stamatis
>>
>> [1] https://github.com/apache/hive-site/pull/2
>>
>> On Wed, Oct 5, 2022 at 8:46 PM Simhadri G  wrote:
>>
>>> Thanks for the feedback, Stamatis!
>>>
>>>    - I have updated the PR to include a README.md file with
>>>    instructions to build and view the site locally after making any new
>>>    changes. This will help us preview the changes locally before pushing
>>>    the commit. (Docker is not required here.)
>>>
>>>    - GitHub Pages was used to share the new website with the community,
>>>    and it will most likely not be necessary later on.
>>>
>>>    - Regarding the role of GitHub Actions (gh-pages.yml):
>>>
>>>       - Whenever a PR is merged to the main branch, a GitHub Action is
>>>       triggered.
>>>       - The GitHub Action installs Hugo and builds the site with the
>>>       new changes. Once the build succeeds, Hugo generates a set of
>>>       static files, which are automatically merged into the
>>>       hive-site/asf-site branch by the GitHub Actions bot.
>>>       - From there, to publish hive-site/asf-site to the project
>>>       website sub-domain (hive.apache.org), we need to set up a
>>>       configuration block called publish in the .asf.yaml file (
>>>       https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
>>>       - We will need help from Apache Infra (gmcdonald or Humbedooh) to
>>>       make sure that we have set this up correctly.
>>>
>>>    - I agree with your suggestion to keep the changes around the
>>>    revamp as minimal as possible and not mix the content update with the
>>>    framework change. In this case, we can make the other changes
>>>    incrementally at a later stage.
>>>
>>>
>>> Thanks!
>>> Simhadri G
>>>
>>> On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis 
>>> wrote:
>>>
 Thanks for staying on top of this, Simhadri.

 I will try to help review the PR once I get some time.

 What is not yet clear to me from this discussion or by looking at the
 PR is the workflow for making a change appear on the web (
 https://hive.apache.org/). Having a README which clearly states what
 needs to be done is a must.

 I also think it is quite important to have instructions and possibly
 docker images for someone to be able to test how the changes look locally
 before committing a change to the repo.

 Another point that needs clarification is the role of github pages. I
 am not sure why it is necessary at the moment and what exactly is the plan
 going forward. If I understand well, currently it is used to preview the
 changes but from my perspective we shouldn't need to commit something to
 the repo to understand if something breaks or not; preview should happen
 locally.

 I would suggest to keep the changes around the revamp as minimal as
 possible and not mix the content update with the framework change. As
 usual, smaller changes are easier to review and merge. It is definitely
 worth updating and improving the content but let's do it incrementally so
 that changes can get merged faster.

 The list of committers and PMC members for Hive can be found in the
 apache phonebook [1]. The list can easily get outdated so maybe we can
 consider adding links to [1] and/or github and other places instead of
 duplicating the content. Anyways, let's first deal with the revamp and
 discuss content changes later in separate JIRAs/PRs.

[jira] [Created] (HIVE-26938) Investigate SMB Map Join for FULL OUTER

2023-01-12 Thread John Sherman (Jira)
John Sherman created HIVE-26938:
---

 Summary: Investigate SMB Map Join for FULL OUTER
 Key: HIVE-26938
 URL: https://issues.apache.org/jira/browse/HIVE-26938
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: John Sherman
Assignee: John Sherman


HIVE-18908 added FULL OUTER Map Join support but this work did not add support 
for SMB Map Joins for FULL OUTER.

We should investigate if we can safely support SMB Map Join for this scenario 
and implement it if so.

This is the area in which it gives up the conversion. If we modify this line
to pass a second argument of true to getBigTableCandidates to enable
isFullOuterJoinSupported, it does successfully convert (but we need to verify
that execution does the correct thing).

[https://github.com/apache/hive/blob/03ad025ada776c0d359124c6342615f1983c1a94/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java#L482]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Proposal: Revamp Apache Hive website.

2023-01-12 Thread Simhadri G
Hello Everyone,

Happy new year!

I am happy to announce that the new Apache Hive website[1] is finally up
and running.
It can be accessed here: https://hive.apache.org/

I would like to especially thank Stamatis, Ayush, and Sai Heamanth for
reviewing the PR. Without their help, the new website would not have reached
completion.
I would also like to thank Owen O'Malley, Daniel Gruno, Alessandro
Solimando and Pau Tallada for the help and feedback received during the
process.
process.

Thank you,
Simhadri G

[1] https://hive.apache.org/
[2] HIVE-26565: https://issues.apache.org/jira/browse/HIVE-26565
[3] INFRA-24077: https://issues.apache.org/jira/browse/INFRA-24077

On Mon, Jan 9, 2023 at 4:56 PM Stamatis Zampetakis 
wrote:

> Hi everyone,
>
> Simhadri has been working hard to modernize the Hive website (HIVE-26565)
> for the past few months and I am quite happy with the results.
>
> I reviewed the respective PR [1] and will commit the changes in 24h unless
> there are objections.
>
> Best,
> Stamatis
>
> [1] https://github.com/apache/hive-site/pull/2
>
> On Wed, Oct 5, 2022 at 8:46 PM Simhadri G  wrote:
>
>> Thanks for the feedback, Stamatis!
>>
>>    - I have updated the PR to include a README.md file with instructions
>>    to build and view the site locally after making any new changes. This
>>    will help us preview the changes locally before pushing the commit.
>>    (Docker is not required here.)
>>
>>    - GitHub Pages was used to share the new website with the community,
>>    and it will most likely not be necessary later on.
>>
>>    - Regarding the role of GitHub Actions (gh-pages.yml):
>>
>>       - Whenever a PR is merged to the main branch, a GitHub Action is
>>       triggered.
>>       - The GitHub Action installs Hugo and builds the site with the new
>>       changes. Once the build succeeds, Hugo generates a set of static
>>       files, which are automatically merged into the hive-site/asf-site
>>       branch by the GitHub Actions bot.
>>       - From there, to publish hive-site/asf-site to the project website
>>       sub-domain (hive.apache.org), we need to set up a configuration
>>       block called publish in the .asf.yaml file (
>>       https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
>>       - We will need help from Apache Infra (gmcdonald or Humbedooh) to
>>       make sure that we have set this up correctly.
>>
>>    - I agree with your suggestion to keep the changes around the revamp
>>    as minimal as possible and not mix the content update with the
>>    framework change. In this case, we can make the other changes
>>    incrementally at a later stage.
>>
>>
>> Thanks!
>> Simhadri G
>>
>> On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis 
>> wrote:
>>
>>> Thanks for staying on top of this, Simhadri.
>>>
>>> I will try to help review the PR once I get some time.
>>>
>>> What is not yet clear to me from this discussion or by looking at the PR
>>> is the workflow for making a change appear on the web (
>>> https://hive.apache.org/). Having a README which clearly states what
>>> needs to be done is a must.
>>>
>>> I also think it is quite important to have instructions and possibly
>>> docker images for someone to be able to test how the changes look locally
>>> before committing a change to the repo.
>>>
>>> Another point that needs clarification is the role of github pages. I am
>>> not sure why it is necessary at the moment and what exactly is the plan
>>> going forward. If I understand well, currently it is used to preview the
>>> changes but from my perspective we shouldn't need to commit something to
>>> the repo to understand if something breaks or not; preview should happen
>>> locally.
>>>
>>> I would suggest to keep the changes around the revamp as minimal as
>>> possible and not mix the content update with the framework change. As
>>> usual, smaller changes are easier to review and merge. It is definitely
>>> worth updating and improving the content but let's do it incrementally so
>>> that changes can get merged faster.
>>>
>>> The list of committers and PMC members for Hive can be found in the
>>> apache phonebook [1]. The list can easily get outdated so maybe we can
>>> consider adding links to [1] and/or github and other places instead of
>>> duplicating the content. Anyways, let's first deal with the revamp and
>>> discuss content changes later in separate JIRAs/PRs.
>>>
>>> Best,
>>> Stamatis
>>>
>>> [1] https://home.apache.org/phonebook.html?project=hive
>>>
>>> On Sun, Oct 2, 2022 at 2:41 AM Simhadri G  wrote:
>>>
 Hello Everyone,

 I have raised the PR for the revamped Hive Website here:

[jira] [Created] (HIVE-26937) Batch events during incremental replication to avoid O.O.M

2023-01-12 Thread Rakshith C (Jira)
Rakshith C created HIVE-26937:
-

 Summary: Batch events during incremental replication to avoid O.O.M
 Key: HIVE-26937
 URL: https://issues.apache.org/jira/browse/HIVE-26937
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Rakshith C
Assignee: Rakshith C


* Currently, the incremental replication flow of Hive dumps all events read
from the notification logs sequentially into the staging directory.
 * Repl Load loads all the event directories present in the staging directory
into a list and processes them.
 * This has caused OOM issues when the number of events is large.

Hence, we are introducing batching of events, where Repl Dump writes events in
batches and Repl Load processes events batch by batch to avoid OOM.
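The batching described above can be sketched as a small helper that splits the ordered event list into fixed-size chunks, so each chunk can be dumped and loaded independently. The class and method names below are illustrative, not Hive's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class EventBatcher {
    // Split an ordered list of event ids into batches of at most batchSize,
    // preserving order so batches can be processed one after another.
    static <T> List<List<T>> toBatches(List<T> events, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                    events.subList(i, Math.min(i + batchSize, events.size()))));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five event ids, batch size 2 -> three batches, bounding the number
        // of event directories held in memory at once.
        List<List<Long>> batches = toBatches(List.of(101L, 102L, 103L, 104L, 105L), 2);
        System.out.println(batches); // [[101, 102], [103, 104], [105]]
    }
}
```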





[jira] [Created] (HIVE-26936) A predicate that compares 0 with -0 yields an incorrect result

2023-01-11 Thread Dayakar M (Jira)
Dayakar M created HIVE-26936:


 Summary: A predicate that compares 0 with -0 yields an incorrect 
result
 Key: HIVE-26936
 URL: https://issues.apache.org/jira/browse/HIVE-26936
 Project: Hive
  Issue Type: Bug
Reporter: Dayakar M
Assignee: Dayakar M


Steps to reproduce:

CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 DOUBLE);
INSERT INTO t0 VALUES(0);
INSERT INTO t1 VALUES('-0');

SELECT * FROM t0, t1 WHERE t0.c0 = t1.c0; -- expected: {0.0, -0.0}, actual: {}

+--------+--------+
| t0.c0  | t1.c0  |
+--------+--------+
+--------+--------+

That the predicate should evaluate to TRUE can be verified with the following
statement:

SELECT t0.c0 = t1.c0 FROM t0, t1; -- 1

+-------+
|  _c0  |
+-------+
| true  |
+-------+

A similar issue was fixed earlier as part of
[HIVE-11174|https://issues.apache.org/jira/browse/HIVE-11174] for the WHERE
clause condition; now the join condition has the same issue.
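The pitfall here matches how Java itself distinguishes positive and negative zero: the primitive == operator follows IEEE 754 and treats them as equal, while Double.compare and boxed equals do not. A minimal sketch of the distinction that comparison code has to handle (plain Java semantics, not Hive's operator implementation):

```java
public class NegativeZeroDemo {
    public static void main(String[] args) {
        double pos = 0.0;
        double neg = -0.0;

        // Primitive comparison follows IEEE 754: +0.0 and -0.0 are equal.
        System.out.println(pos == neg);               // true

        // Double.compare orders -0.0 below +0.0, so the result is positive.
        System.out.println(Double.compare(pos, neg)); // 1

        // Boxed equality distinguishes the two bit patterns.
        System.out.println(Double.valueOf(pos).equals(Double.valueOf(neg))); // false
    }
}
```

A comparison path that uses Double.compare or equals on the join key would therefore report the two zeros as different even though SQL equality expects them to match.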

 





Fix the hiveserver2 error(No valid credentials provided) connecting metastore when HADOOP_USER_NAME environment variable exists and kerberos is enabled

2023-01-11 Thread Weiliang Hao
Hi all,
I found a problem. When Kerberos is enabled and the environment variable
HADOOP_USER_NAME is set, HiveServer2 reports an error (No valid credentials
provided) when connecting to the Hive metastore.
I want to fix it and have submitted a PR. Could someone help review it? Thanks!
jira: https://issues.apache.org/jira/browse/HIVE-26739?filter=-2
PR: https://github.com/apache/hive/pull/3764


Weiliang Hao

[jira] [Created] (HIVE-26935) Expose root cause of MetaException to client sides

2023-01-11 Thread Wechar (Jira)
Wechar created HIVE-26935:
-

 Summary: Expose root cause of MetaException to client sides
 Key: HIVE-26935
 URL: https://issues.apache.org/jira/browse/HIVE-26935
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 4.0.0-alpha-2
Reporter: Wechar
Assignee: Wechar


MetaException is generated by Thrift, and only the {{message}} field is
transported to the client. We should expose the root cause in the message sent
to clients, which has the following advantages:
 * It is more friendly for user troubleshooting.
 * Some root causes are unrecoverable; exposing them can avoid unnecessary
retries.

*How to Reproduce:*
 - Step 1: Disable direct SQL in HMS for our test case.
 - Step 2: Add an illegal {{PART_COL_STATS}} entry for a partition.
 - Step 3: Try to {{drop table}} with Spark.

The exception in Hive metastore is:
{code:sh}
2023-01-11T17:13:51,259 ERROR [Metastore-Handler-Pool: Thread-39]: 
metastore.ObjectStore (ObjectStore.java:run(4369)) - 
javax.jdo.JDOUserException: One or more instances could not be deleted
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:625)
 ~[datanucleus-api-jdo-5.2.8.jar:?]
at 
org.datanucleus.api.jdo.JDOQuery.deletePersistentInternal(JDOQuery.java:530) 
~[datanucleus-api-jdo-5.2.8.jar:?]
at 
org.datanucleus.api.jdo.JDOQuery.deletePersistentAll(JDOQuery.java:499) 
~[datanucleus-api-jdo-5.2.8.jar:?]
at 
org.apache.hadoop.hive.metastore.QueryWrapper.deletePersistentAll(QueryWrapper.java:108)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsNoTxn(ObjectStore.java:4207)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.access$1000(ObjectStore.java:285) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$7.run(ObjectStore.java:3086) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:74) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsViaJdo(ObjectStore.java:3074)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.access$400(ObjectStore.java:285) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$6.getJdoResult(ObjectStore.java:3058)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$6.getJdoResult(ObjectStore.java:3050)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:4362)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsInternal(ObjectStore.java:3061)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.dropPartitions(ObjectStore.java:3040)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_332]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_332]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_332]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy24.dropPartitions(Unknown Source) ~[?:?]
at 
org.apache.hadoop.hive.metastore.HMSHandler.dropPartitionsAndGetLocations(HMSHandler.java:3186)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HMSHandler.drop_table_core(HMSHandler.java:2963)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HMSHandler.drop_table_with_environment_context(HMSHandler.java:3211)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HMSHandler.drop_table_with_environment_context(HMSHandler.java:3199)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_332]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_332]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_332]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:146)
{code}
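One way to surface the root cause to the client despite Thrift carrying only the message field is to walk the cause chain and fold the innermost exception into the message text. A minimal sketch (the withRootCause helper is hypothetical, not the actual Hive patch):

```java
public class RootCauseDemo {
    // Hypothetical helper: append the innermost cause to a message so it
    // survives Thrift serialization, which only transports the message field.
    static String withRootCause(String message, Throwable t) {
        Throwable root = t;
        while (root.getCause() != null) {
            root = root.getCause();
        }
        return message + " (root cause: " + root + ")";
    }

    public static void main(String[] args) {
        // Simulate a JDO failure wrapped by a higher-level exception, as in
        // the drop-table stack trace above.
        Exception root = new IllegalStateException("One or more instances could not be deleted");
        Exception wrapped = new RuntimeException("drop table failed", root);
        System.out.println(withRootCause("MetaException", wrapped));
    }
}
```

Whether the root cause is recoverable could then also be inspected at the call site to skip the retry loop.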

[jira] [Created] (HIVE-26934) Update TXN_WRITE_NOTIFICATION_LOG table schema to make WNL_TABLE width consistent with other tables

2023-01-11 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-26934:


 Summary: Update TXN_WRITE_NOTIFICATION_LOG table schema to make 
WNL_TABLE width consistent with other tables
 Key: HIVE-26934
 URL: https://issues.apache.org/jira/browse/HIVE-26934
 Project: Hive
  Issue Type: Improvement
Reporter: Harshal Patel


* The TXN_WRITE_NOTIFICATION_LOG table stores the table name as varchar(128),
while other metastore tables use varchar(256) for the table name.
 * So, if a user creates a Hive table with a name longer than 128 characters,
then during the incremental Repl Load operation the EVENT_COMMIT_TXN event
reads the table name from the TXN_WRITE_NOTIFICATION_LOG table, gets only the
first 128 characters, and eventually creates a wrong file path.





[jira] [Created] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle

2023-01-11 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-26933:


 Summary: Cleanup dump directory for eventId which was failed in 
previous dump cycle
 Key: HIVE-26933
 URL: https://issues.apache.org/jira/browse/HIVE-26933
 Project: Hive
  Issue Type: Improvement
Reporter: Harshal Patel
Assignee: Harshal Patel


# If the incremental dump operation fails while dumping an event id in the
staging directory, the dump directory for this event id, along with the
_dumpmetadata file, still exists in the dump location; the failed event id is
also recorded in the _events_dump file.
 # When the user triggers the dump operation for this policy again, it resumes
dumping from the failed event id and tries to dump it again, but since that
event id's directory was already created in the previous cycle, it fails with
the following exception:

{noformat}
[Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: 
FAILED: Execution Error, return code 4 from 
org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
org.apache.hadoop.fs.FileAlreadyExistsException: 
/warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata
 for client 172.27.182.5 already exists
    at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388)
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576)
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773)
    at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490)
    at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894){noformat}
 





[jira] [Created] (HIVE-26932) Correct stage name value in replication_metrics.progress column in replication_metrics table

2023-01-11 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-26932:


 Summary: Correct stage name value in replication_metrics.progress 
column in replication_metrics table
 Key: HIVE-26932
 URL: https://issues.apache.org/jira/browse/HIVE-26932
 Project: Hive
  Issue Type: Improvement
Reporter: Harshal Patel
Assignee: Harshal Patel


To improve diagnostic capability for source-to-backup replication, update the
replication_metrics table by adding a pre_optimized_bootstrap stage to the
progress column during the first cycle of optimized bootstrap.





[jira] [Created] (HIVE-26931) REPL LOAD command does not throw any error for incorrect syntax

2023-01-11 Thread Subhasis Gorai (Jira)
Subhasis Gorai created HIVE-26931:
-

 Summary: REPL LOAD command does not throw any error for incorrect 
syntax
 Key: HIVE-26931
 URL: https://issues.apache.org/jira/browse/HIVE-26931
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Subhasis Gorai


In some cases, users use the REPL LOAD command incorrectly. It does not throw
any meaningful error or warning message and, as expected, it does not
replicate the database either.

For example,
{code:java}
repl load target_db with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/hive/repl',
 'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/target_db.db'){code}
The above command does not follow the REPL LOAD syntax. It does not produce
any error message, nor does it replicate the database, so it causes confusion.
{code:java}
0: jdbc:hive2://nightly7x-us-bj-3.nightly7x-u> repl load test_1_replica with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 
'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db');
INFO  : Compiling 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd): repl 
load test_1_replica with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 
'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db')
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd); Time 
taken: 0.051 seconds
INFO  : Executing 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd): repl 
load test_1_replica with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 
'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db')
INFO  : Completed executing 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd); Time 
taken: 0.001 seconds
INFO  : OK
No rows affected (0.065 seconds)
0: jdbc:hive2://nightly7x-us-bj-3.nightly7x-u>{code}
Ideally, since this is an invalid command, it should throw an error.





[jira] [Created] (HIVE-26930) Support for increased retention of Notification Logs and Change Manager entries

2023-01-11 Thread Subhasis Gorai (Jira)
Subhasis Gorai created HIVE-26930:
-

 Summary: Support for increased retention of Notification Logs and 
Change Manager entries
 Key: HIVE-26930
 URL: https://issues.apache.org/jira/browse/HIVE-26930
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Subhasis Gorai
Assignee: Subhasis Gorai


In order to support the Planned/Unplanned Failover use cases, we need the 
capability to increase the retention period for both the Notification Logs and 
Change Manager entries until the successful reverse replication is done (i.e. 
the Optimized Bootstrap).

If the relevant Notification logs and Change Manager entries are not retained, 
we can't perform a successful Optimized Bootstrap.





[jira] [Created] (HIVE-26929) Allow creating iceberg tables without column definition when 'metadata_location' tblproperties is set.

2023-01-11 Thread Dharmik Thakkar (Jira)
Dharmik Thakkar created HIVE-26929:
--

 Summary: Allow creating iceberg tables without column definition 
when 'metadata_location' tblproperties is set.
 Key: HIVE-26929
 URL: https://issues.apache.org/jira/browse/HIVE-26929
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: Dharmik Thakkar


Allow creating iceberg tables without column definition when 
'metadata_location' tblproperties is set.

Iceberg supports pointing to an external metadata.json file to infer the table
schema. Irrespective of the schema defined in the CREATE TABLE statement, the
metadata.json is used to create the table. We should allow creating a table
without a column definition when metadata_location is defined in tblproperties.
{code:java}
create table test_meta (id int, name string, cgpa decimal) stored by iceberg 
stored as orc;
describe formatted test_meta;
create table test_meta_copy(id int) stored by iceberg 
tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json');{code}
As a result of the above SQL, test_meta_copy gets the same schema as test_meta,
irrespective of the columns specified in the create table statement.
|*col_name*|*data_type*|
|*id*|int|
|*name*|string|
|*cgpa*|decimal(10,0)|
| |NULL|
|*# Detailed Table Information*|NULL|
|*Database:*|iceberg_test_db_hive|
|*OwnerType:*|USER|
|*Owner:*|hive|
|*CreateTime:*|Tue Jan 10 21:49:08 UTC 2023|
|*LastAccessTime:*|Fri Dec 12 21:41:41 UTC 1969|
|*Retention:*|2147483647|
|*Location:*|s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta|
|*Table Type:*|EXTERNAL_TABLE|
|*Table Parameters:*|NULL|
| |EXTERNAL|
| |bucketing_version|
| |engine.hive.enabled|
| |metadata_location|
| |numFiles|
| |numRows|
| |rawDataSize|
| |serialization.format|
| |storage_handler|
| |table_type|
| |totalSize|
| |transient_lastDdlTime|
| |uuid|
| |write.format.default|
| |NULL|
|*# Storage Information*|NULL|
|*SerDe Library:*|org.apache.iceberg.mr.hive.HiveIcebergSerDe|
|*InputFormat:*|org.apache.iceberg.mr.hive.HiveIcebergInputFormat|
|*OutputFormat:*|org.apache.iceberg.mr.hive.HiveIcebergOutputFormat|
|*Compressed:*|No|
|*Sort Columns:*|[]|

However, if we skip the column definition, the query fails:
{code:java}
create table test_meta_copy2 stored by iceberg 
tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json');{code}
Error:
{code:java}
INFO  : Compiling 
command(queryId=hive_20230110220019_94ffafef-f531-4532-a07c-0e46e3879f19): 
create table test_meta_copy2 stored by iceberg 
tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json')
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: 

[jira] [Created] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled

2023-01-10 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-26928:
---

 Summary: LlapIoImpl::getParquetFooterBuffersFromCache throws 
exception when metadata cache is disabled
 Key: HIVE-26928
 URL: https://issues.apache.org/jira/browse/HIVE-26928
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: Rajesh Balamohan


When the metadata / LLAP cache is disabled, "iceberg + parquet" throws the
following error.

It should check whether the metadata cache is enabled, or this should be fixed
in LlapIoImpl.

 
{noformat}

Caused by: java.lang.NullPointerException: Metadata cache must not be null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
    at 
org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
    at 
org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65)
    at 
org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77)
    at 
org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266)
    at 
org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.(AbstractMapredIcebergRecordReader.java:40)
    at 
org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.(HiveIcebergVectorizedRecordReader.java:41)
 {noformat}
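A defensive guard could avoid the NPE when the cache is legitimately absent. The sketch below is purely illustrative (the class, interface, and method names are assumptions, not the real LlapIoImpl API): it falls back to a direct footer read instead of failing a non-null precondition.

```java
// Hypothetical sketch, NOT the real Hive API: when the metadata cache is
// disabled (null), read the Parquet footer directly instead of throwing.
public class FooterReader {
    interface MetadataCache {               // stand-in for the LLAP metadata cache
        byte[] getFooter(String path);
    }

    private final MetadataCache cache;      // null when the metadata cache is disabled

    public FooterReader(MetadataCache cache) {
        this.cache = cache;
    }

    public byte[] footerFor(String path) {
        if (cache == null) {
            // Cache disabled: fall back to an uncached read rather than
            // failing Preconditions.checkNotNull.
            return readFooterFromFile(path);
        }
        byte[] cached = cache.getFooter(path);
        return cached != null ? cached : readFooterFromFile(path);
    }

    byte[] readFooterFromFile(String path) {
        // Placeholder for the direct (uncached) footer read.
        return new byte[] {42};
    }

    public static void main(String[] args) {
        FooterReader noCache = new FooterReader(null); // metadata cache disabled
        byte[] footer = noCache.footerFor("s3a://bucket/table/data.parquet");
        if (footer == null) throw new AssertionError();
        System.out.println("footer bytes: " + footer.length);
    }
}
```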



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26927) Iceberg: Add support for set_current_snapshotid

2023-01-10 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-26927:
---

 Summary: Iceberg: Add support for set_current_snapshotid
 Key: HIVE-26927
 URL: https://issues.apache.org/jira/browse/HIVE-26927
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: Rajesh Balamohan


Currently, Hive supports the "rollback" feature. Once rolled back, it is not
possible to move from an older snapshot back to a newer snapshot.

It ends up throwing an
"org.apache.iceberg.exceptions.ValidationException: Cannot roll back to
snapshot, not an ancestor of the current state:" error.

It would be good to support a "set_current_snapshot" function to move between
snapshot ids.
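The limitation can be shown with a toy model (this is not Hive or Iceberg code; all names below are illustrative): an ancestor-only rollback cannot move forward again after rolling back, whereas a set_current_snapshot operation that accepts any known snapshot id can.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of snapshot history: rollback only accepts ancestors of the
// current snapshot, so after rolling back you cannot return to a newer
// snapshot; set_current_snapshot removes that restriction.
public class SnapshotModel {
    private final Map<Long, Long> parents = new HashMap<>(); // snapshot id -> parent id
    private long current;

    void commit(long id) {             // new snapshot whose parent is the current one
        parents.put(id, current);
        current = id;
    }

    private boolean isAncestor(long candidate, long of) {
        Long p = parents.get(of);
        while (p != null) {
            if (p == candidate) return true;
            p = parents.get(p);
        }
        return false;
    }

    boolean rollbackTo(long id) {      // mimics the ancestor-only rollback
        if (!isAncestor(id, current)) return false; // "not an ancestor of the current state"
        current = id;
        return true;
    }

    void setCurrentSnapshot(long id) { // proposed: jump to any known snapshot
        current = id;
    }

    long current() { return current; }

    public static void main(String[] args) {
        SnapshotModel t = new SnapshotModel();
        t.commit(1); t.commit(2); t.commit(3);    // history: 1 <- 2 <- 3
        if (!t.rollbackTo(1)) throw new AssertionError(); // ancestor: allowed
        if (t.rollbackTo(3)) throw new AssertionError();  // 3 is not an ancestor of 1
        t.setCurrentSnapshot(3);                  // proposed escape hatch
        System.out.println("current=" + t.current());
    }
}
```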

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26926) SHOW PARTITIONS for a non partitioned table should just throw execution error instead of full stack trace.

2023-01-10 Thread Dharmik Thakkar (Jira)
Dharmik Thakkar created HIVE-26926:
--

 Summary: SHOW PARTITIONS for a non partitioned table should just 
throw execution error instead of full stack trace.
 Key: HIVE-26926
 URL: https://issues.apache.org/jira/browse/HIVE-26926
 Project: Hive
  Issue Type: Bug
Reporter: Dharmik Thakkar


SHOW PARTITIONS for a non-partitioned table should just throw an execution
error instead of a full stack trace.

STR:
 # create table test (id int);
 # show partitions test;

Actual Output
{code:java}
0: jdbc:hive2://hs2-qe-vw-dwx-hive-nnbm.dw-dw> create table test (id int);
INFO  : Compiling 
command(queryId=hive_20230110210715_637ef126-bb53-4624-9a72-d36f13f98a93): 
create table test (id int)
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling 
command(queryId=hive_20230110210715_637ef126-bb53-4624-9a72-d36f13f98a93); Time 
taken: 0.036 seconds
INFO  : Executing 
command(queryId=hive_20230110210715_637ef126-bb53-4624-9a72-d36f13f98a93): 
create table test (id int)
INFO  : Starting task [Stage-0:DDL] in serial mode
INFO  : Completed executing 
command(queryId=hive_20230110210715_637ef126-bb53-4624-9a72-d36f13f98a93); Time 
taken: 0.507 seconds
INFO  : OK
No rows affected (0.809 seconds)
0: jdbc:hive2://hs2-qe-vw-dwx-hive-nnbm.dw-dw> show partitions test;
INFO  : Compiling 
command(queryId=hive_20230110210721_d1f38a5b-fe4e-4847-a3c2-5a85a95c29eb): show 
partitions test
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:partition, 
type:string, comment:from deserializer)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20230110210721_d1f38a5b-fe4e-4847-a3c2-5a85a95c29eb); Time 
taken: 0.03 seconds
INFO  : Executing 
command(queryId=hive_20230110210721_d1f38a5b-fe4e-4847-a3c2-5a85a95c29eb): show 
partitions test
INFO  : Starting task [Stage-0:DDL] in serial mode
ERROR : Failed
org.apache.hadoop.hive.ql.metadata.HiveException: Table test is not a 
partitioned table
    at 
org.apache.hadoop.hive.ql.ddl.table.partition.show.ShowPartitionsOperation.execute(ShowPartitionsOperation.java:44)
 ~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:360) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:333) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:250) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:111) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:809) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:547) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:541) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
~[hive-exec-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:232)
 ~[hive-service-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at 
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:89)
 ~[hive-service-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:338)
 ~[hive-service-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
    at javax.security.auth.Subject.doAs(Subject.java:423) ~[?:?]
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
 ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
    at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:358)
 ~[hive-service-3.1.3000.2022.0.13.0-72.jar:3.1.3000.2022.0.13.0-72]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) 
~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at 

[jira] [Created] (HIVE-26925) MV with iceberg storage format fails when contains 'PARTITIONED ON' clause due to column number/types difference.

2023-01-10 Thread Dharmik Thakkar (Jira)
Dharmik Thakkar created HIVE-26925:
--

 Summary: MV with iceberg storage format fails when contains 
'PARTITIONED ON' clause due to column number/types difference.
 Key: HIVE-26925
 URL: https://issues.apache.org/jira/browse/HIVE-26925
 Project: Hive
  Issue Type: Bug
  Components: Iceberg integration
Reporter: Dharmik Thakkar


An MV with the Iceberg storage format fails when it contains a 'PARTITIONED ON'
clause, due to a column number/types difference.
{code:java}
!!! annotations iceberg
>>> use iceberg_test_db_hive;
No rows affected
>>> set hive.exec.max.dynamic.partitions=2000;
>>> set hive.exec.max.dynamic.partitions.pernode=2000;
>>> drop materialized view if exists mv_agg_gby_col_partitioned;
>>> create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) 
>>> stored by iceberg stored as orc tblproperties ('format-version'='1') as 
>>> select b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t;
>>> analyze table mv_agg_gby_col_partitioned compute statistics for columns;
>>> set hive.explain.user=false;

>>> explain select b,f,sum(b) from all100k where t=93 group by c,v,f,b;
!!! match row_contains
  alias: iceberg_test_db_hive.mv_agg_gby_col_partitioned

>>> drop materialized view mv_agg_gby_col_partitioned;
 {code}
Error
{code:java}
2023-01-10T20:31:17,514 INFO  [pool-5-thread-1] jdbc.TestDriver: Query: create 
materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored by 
iceberg stored as orc tblproperties ('format-version'='1') as select 
b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t
2023-01-10T20:31:18,099 INFO  [Thread-21] jdbc.TestDriver: INFO  : Compiling 
command(queryId=hive_20230110203117_6c333b6a-1642-40e7-80bc-e78dede47980): 
create materialized view mv_agg_gby_col_partitioned PARTITIONED ON (t) stored 
by iceberg stored as orc tblproperties ('format-version'='1') as select 
b,f,sum(b), sum(f),t from all100k group by b,f,v,c,t
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: INFO  : No Stats for 
iceberg_test_db_hive@all100k, Columns: b, c, t, f, v
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: ERROR : FAILED: 
SemanticException Line 0:-1 Cannot insert into target table because column 
number/types are different 'TOK_TMP_FILE': Table insclause-0 has 6 columns, but 
query has 5 columns.
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver: 
org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Cannot insert into 
target table because column number/types are different 'TOK_TMP_FILE': Table 
insclause-0 has 6 columns, but query has 5 columns.
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:8905)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:8114)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:11583)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11455)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12424)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12290)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:13038)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:756)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13154)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:472)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:313)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:222)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
2023-01-10T20:31:18,100 INFO  [Thread-21] jdbc.TestDriver:     at 
org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:657)
2023-01-10T20:31:18,100 INFO  [Thread-21] 

[jira] [Created] (HIVE-26924) alter materialized view enable rewrite throws SemanticException for source iceberg table

2023-01-10 Thread Dharmik Thakkar (Jira)
Dharmik Thakkar created HIVE-26924:
--

 Summary: alter materialized view enable rewrite throws 
SemanticException for source iceberg table
 Key: HIVE-26924
 URL: https://issues.apache.org/jira/browse/HIVE-26924
 Project: Hive
  Issue Type: Bug
Reporter: Dharmik Thakkar


alter materialized view enable rewrite throws SemanticException for source 
iceberg table

SQL test
{code:java}
>>> create materialized view mv_rewrite as select t, si from all100k where 
>>> t>115;

>>> analyze table mv_rewrite compute statistics for columns;

>>> set hive.explain.user=false;

>>> explain select si,t from all100k where t>116 and t<120;
!!! match row_contains
  alias: iceberg_test_db_hive.mv_rewrite

>>> alter materialized view mv_rewrite disable rewrite;

>>> explain select si,t from all100k where t>116 and t<120;
!!! match row_contains
  alias: all100k

>>> alter materialized view mv_rewrite enable rewrite;

>>> explain select si,t from all100k where t>116 and t<120;
!!! match row_contains
  alias: iceberg_test_db_hive.mv_rewrite

>>> drop materialized view mv_rewrite; {code}
 

Error
{code:java}
2023-01-10T18:40:34,303 INFO  [pool-3-thread-1] jdbc.TestDriver: Query: alter 
materialized view mv_rewrite enable rewrite
2023-01-10T18:40:34,365 INFO  [Thread-10] jdbc.TestDriver: INFO  : Compiling 
command(queryId=hive_20230110184034_f557b4a6-40a0-42ba-8e67-2f273f50af36): 
alter materialized view mv_rewrite enable rewrite
2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver: ERROR : FAILED: 
SemanticException Automatic rewriting for materialized view cannot be enabled 
if the materialized view uses non-transactional tables
2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver: 
org.apache.hadoop.hive.ql.parse.SemanticException: Automatic rewriting for 
materialized view cannot be enabled if the materialized view uses 
non-transactional tables
2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rewrite.AlterMaterializedViewRewriteAnalyzer.analyzeInternal(AlterMaterializedViewRewriteAnalyzer.java:75)
2023-01-10T18:40:34,426 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:313)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:222)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:657)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:603)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:597)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127)
2023-01-10T18:40:34,427 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/java.security.AccessController.doPrivileged(Native Method)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/javax.security.auth.Subject.doAs(Subject.java:423)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:358)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2023-01-10T18:40:34,428 INFO  [Thread-10] jdbc.TestDriver:      at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2023-01-10T18:40:34,429 INFO  

[jira] [Created] (HIVE-26923) Create Table As fails for iceberg when the columns are not specified.

2023-01-10 Thread Dharmik Thakkar (Jira)
Dharmik Thakkar created HIVE-26923:
--

 Summary: Create Table As fails for iceberg when the columns are 
not specified.
 Key: HIVE-26923
 URL: https://issues.apache.org/jira/browse/HIVE-26923
 Project: Hive
  Issue Type: Bug
  Components: Iceberg integration
Reporter: Dharmik Thakkar


Create Table As fails for Iceberg when the columns are not specified.

Create table statement
{code:java}
create table xyz stored by iceberg stored by iceberg stored as orc 
TBLPROPERTIES ('format-version':'2') AS select name,age from studenttab10k 
group by name,age {code}
Error logs
{code:java}
2023-01-10T12:26:36,003 INFO  [pool-3-thread-1] jdbc.TestDriver: Query: create 
table xyz stored by iceberg stored by iceberg stored as orc TBLPROPERTIES 
('format-version':'2') AS select name,age from studenttab10k group by name,age
2023-01-10T12:26:36,094 INFO  [Thread-5] jdbc.TestDriver: INFO  : Compiling 
command(queryId=hive_20230110122636_8b4d7e37-c7b7-4554-a75a-51dfe37a4716): 
create table xyz stored by iceberg stored by iceberg stored as orc 
TBLPROPERTIES ('format-version':'2') AS select name,age from studenttab10k 
group by name,age
2023-01-10T12:26:36,095 INFO  [Thread-5] jdbc.TestDriver: ERROR : FAILED: 
ParseException line 1:42 mismatched input 'by' expecting AS near 'stored' in 
table file format specification
2023-01-10T12:26:36,095 INFO  [Thread-5] jdbc.TestDriver: 
org.apache.hadoop.hive.ql.parse.ParseException: line 1:42 mismatched input 'by' 
expecting AS near 'stored' in table file format specification
2023-01-10T12:26:36,095 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:127)
2023-01-10T12:26:36,095 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:93)
2023-01-10T12:26:36,095 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:85)
2023-01-10T12:26:36,096 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.Compiler.parse(Compiler.java:174)
2023-01-10T12:26:36,096 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:103)
2023-01-10T12:26:36,096 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:201)
2023-01-10T12:26:36,096 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:657)
2023-01-10T12:26:36,096 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:603)
2023-01-10T12:26:36,097 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:597)
2023-01-10T12:26:36,097 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127)
2023-01-10T12:26:36,097 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
2023-01-10T12:26:36,097 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
2023-01-10T12:26:36,097 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.security.AccessController.doPrivileged(Native Method)
2023-01-10T12:26:36,097 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/javax.security.auth.Subject.doAs(Subject.java:423)
2023-01-10T12:26:36,098 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
2023-01-10T12:26:36,098 INFO  [Thread-5] jdbc.TestDriver:   at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:358)
2023-01-10T12:26:36,098 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
2023-01-10T12:26:36,098 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2023-01-10T12:26:36,099 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
2023-01-10T12:26:36,099 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
2023-01-10T12:26:36,099 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2023-01-10T12:26:36,099 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2023-01-10T12:26:36,099 INFO  [Thread-5] jdbc.TestDriver:   at 
java.base/java.lang.Thread.run(Thread.java:829)
2023-01-10T12:26:36,099 INFO  [Thread-5] jdbc.TestDriver: 
 {code}
However for Hive the below query 

Re: [EXTERNAL] Improving branch-3 Build Times

2023-01-10 Thread Aman Raj
This looks great, Chris. You can raise a PR and we can start discussions on it.

Thanks,
Aman.

From: Chris Nauroth 
Sent: Saturday, January 7, 2023 5:29 AM
To: dev 
Subject: [EXTERNAL] Improving branch-3 Build Times

For those of you working on both master and branch-3, you may have noticed
that it takes a crazy long time to complete a full build on branch-3. In my
environment, running this...

mvn -B -T 8 -Pitests clean install -DskipTests

...it completes in ~8 minutes on master, but ~40 minutes on branch-3. The
long haul is modules like standalone-metastore that use
maven-assembly-plugin to build the artifact. At least part of the root
cause is that branch-3 is using an older version of maven-assembly-plugin
with sub-optimal file system access patterns.
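For anyone who wants to experiment before the PR lands, the change amounts to pinning a newer maven-assembly-plugin in the root pom. A hedged sketch follows; the version number is illustrative and the draft PR may settle on a different one:

```xml
<!-- Sketch only: pin a newer maven-assembly-plugin via pluginManagement
     in the root pom. The exact version is whatever the PR settles on. -->
<pluginManagement>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>3.3.0</version>
    </plugin>
  </plugins>
</pluginManagement>
```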

I have a work in progress patch here:

https://github.com/apache/hive/pull/3924

This is just a draft, not ready for review, but so far I've been able to
cut my branch-3 build times back down to ~13 minutes. I might not be able
to polish this off and request review until late next week, so I'm sharing
it early in case it makes life easier for anyone else on branch-3.

Chris Nauroth


[jira] [Created] (HIVE-26922) Deadlock when rebuilding Materialized view stored by Iceberg

2023-01-10 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-26922:
-

 Summary: Deadlock when rebuilding Materialized view stored by 
Iceberg
 Key: HIVE-26922
 URL: https://issues.apache.org/jira/browse/HIVE-26922
 Project: Hive
  Issue Type: Bug
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa


{code}
create table tbl_ice(a int, b string, c int) stored by iceberg stored as orc 
tblproperties ('format-version'='1');
insert into tbl_ice values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), 
(4, 'four', 53), (5, 'five', 54);

create materialized view mat1 stored by iceberg stored as orc tblproperties 
('format-version'='1') as
select tbl_ice.b, tbl_ice.c from tbl_ice where tbl_ice.c > 52;

insert into tbl_ice values (10, 'ten', 60);

alter materialized view mat1 rebuild;
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26921) Add failover_type, failover_endpoint to replication metrics metadata

2023-01-09 Thread Amit Saonerkar (Jira)
Amit Saonerkar created HIVE-26921:
-

 Summary: Add failover_type, failover_endpoint to replication 
metrics metadata
 Key: HIVE-26921
 URL: https://issues.apache.org/jira/browse/HIVE-26921
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Amit Saonerkar
Assignee: Amit Saonerkar


Corresponding to CDPD-46494



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26920) Add-new-view-in-sys-db-to-capture-failover-and-failback-metrics

2023-01-09 Thread Amit Saonerkar (Jira)
Amit Saonerkar created HIVE-26920:
-

 Summary: 
Add-new-view-in-sys-db-to-capture-failover-and-failback-metrics
 Key: HIVE-26920
 URL: https://issues.apache.org/jira/browse/HIVE-26920
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Reporter: Amit Saonerkar
Assignee: Amit Saonerkar


Corresponding to CDPD-46702



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26919) Fix-test-case-TestReplicationOptimisedBootstrap.testReverseFailoverBeforeOptimizedBootstrap

2023-01-09 Thread Amit Saonerkar (Jira)
Amit Saonerkar created HIVE-26919:
-

 Summary: 
Fix-test-case-TestReplicationOptimisedBootstrap.testReverseFailoverBeforeOptimizedBootstrap
 Key: HIVE-26919
 URL: https://issues.apache.org/jira/browse/HIVE-26919
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Amit Saonerkar
Assignee: Amit Saonerkar


This Jira is related to test case failure corresponding to Jira CDPD-48053



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26918) Upgrade jamon-runtime to 2.4.1

2023-01-09 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created HIVE-26918:


 Summary: Upgrade jamon-runtime to 2.4.1
 Key: HIVE-26918
 URL: https://issues.apache.org/jira/browse/HIVE-26918
 Project: Hive
  Issue Type: Bug
  Components: Web UI
Reporter: Dongjoon Hyun








[jira] [Created] (HIVE-26917) Upgrade parquet to 1.12.3

2023-01-09 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-26917:
---

 Summary: Upgrade parquet to 1.12.3
 Key: HIVE-26917
 URL: https://issues.apache.org/jira/browse/HIVE-26917
 Project: Hive
  Issue Type: Improvement
Reporter: Rajesh Balamohan








[jira] [Created] (HIVE-26916) Disable TestJdbcGenericUDTFGetSplits (Done as part of HIVE-22942)

2023-01-09 Thread Aman Raj (Jira)
Aman Raj created HIVE-26916:
---

 Summary: Disable TestJdbcGenericUDTFGetSplits (Done as part of 
HIVE-22942)
 Key: HIVE-26916
 URL: https://issues.apache.org/jira/browse/HIVE-26916
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj








[jira] [Created] (HIVE-26915) Backport of HIVE-23692 TestCodahaleMetrics.testFileReporting is flaky

2023-01-09 Thread Aman Raj (Jira)
Aman Raj created HIVE-26915:
---

 Summary: Backport of HIVE-23692 
TestCodahaleMetrics.testFileReporting is flaky
 Key: HIVE-26915
 URL: https://issues.apache.org/jira/browse/HIVE-26915
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj








[jira] [Created] (HIVE-26914) Upgrade postgresql to 42.5.1 due to CVE-2022-41946

2023-01-09 Thread Devaspati Krishnatri (Jira)
Devaspati Krishnatri created HIVE-26914:
---

 Summary: Upgrade postgresql to 42.5.1 due to CVE-2022-41946
 Key: HIVE-26914
 URL: https://issues.apache.org/jira/browse/HIVE-26914
 Project: Hive
  Issue Type: Task
Reporter: Devaspati Krishnatri








Re: Proposal: Revamp Apache Hive website.

2023-01-09 Thread Stamatis Zampetakis
Hi everyone,

Simhadri has been working hard to modernize the Hive website (HIVE-26565)
for the past few months and I am quite happy with the results.

I reviewed the respective PR [1] and will commit the changes in 24h unless
there are objections.

Best,
Stamatis

[1] https://github.com/apache/hive-site/pull/2

On Wed, Oct 5, 2022 at 8:46 PM Simhadri G  wrote:

> Thanks for the feedback Stamatis !
>
>- I have updated the PR to include a README.md file with instructions
>to build and view the site locally after making any new changes. This will
>help us preview the changes locally before pushing the commit. (Docker is
>not required here.)
>
>- Github pages was used to share the new website with the community
>and it will most likely not be necessary later on.
>
>- Regarding the role of Github Actions(gh-pages.yml):
>
>- Whenever a PR is merged to the main branch, a GitHub Action is
>   triggered.
>   - The GitHub Action installs Hugo and builds the site with the new
>   changes. Once the build succeeds, Hugo generates a set of static
>   files, and these files are automatically merged to the hive-site/asf-site
>   branch by the GitHub Actions bot.
>   - From here, to publish hive-site/asf-site to the project website
>   sub-domain (hive.apache.org), we need to set up a configuration
>   block called publish in the .asf.yaml file. (
>   
> https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Publishingabranchtoyourprojectwebsite).
>
>   - We will need help from apache infra - gmcdonald
>    or
>   Humbedooh
>    to
>   make sure that we have set this up correctly.
>
>   - I agree with your suggestion to keep the changes around the
>revamp as minimal as possible and not mix the content update with the
>framework change. In this case, we can make the other changes incrementally
>at a later stage.
>
>
> Thanks!
> Simhadri G
>
> On Wed, Oct 5, 2022 at 3:41 PM Stamatis Zampetakis 
> wrote:
>
>> Thanks for staying on top of this Simhadri.
>>
>> I will try to help reviewing the PR once I get some time.
>>
>> What is not yet clear to me from this discussion or by looking at the PR
>> is the workflow for making a change appear on the web (
>> https://hive.apache.org/). Having a README which clearly states what
>> needs to be done is a must.
>>
>> I also think it is quite important to have instructions and possibly
>> docker images for someone to be able to test how the changes look locally
>> before committing a change to the repo.
>>
>> Another point that needs clarification is the role of github pages. I am
>> not sure why it is necessary at the moment and what exactly is the plan
>> going forward. If I understand well, currently it is used to preview the
>> changes but from my perspective we shouldn't need to commit something to
>> the repo to understand if something breaks or not; preview should happen
>> locally.
>>
>> I would suggest to keep the changes around the revamp as minimal as
>> possible and not mix the content update with the framework change. As
>> usual, smaller changes are easier to review and merge. It is definitely
>> worth updating and improving the content but let's do it incrementally so
>> that changes can get merged faster.
>>
>> The list of committers and PMC members for Hive can be found in the
>> apache phonebook [1]. The list can easily get outdated so maybe we can
>> consider adding links to [1] and/or github and other places instead of
>> duplicating the content. Anyways, let's first deal with the revamp and
>> discuss content changes later in separate JIRAs/PRs.
>>
>> Best,
>> Stamatis
>>
>> [1] https://home.apache.org/phonebook.html?project=hive
>>
>> On Sun, Oct 2, 2022 at 2:41 AM Simhadri G  wrote:
>>
>>> Hello Everyone,
>>>
>>> I have raised the PR for the revamped Hive Website here:
>>>  https://github.com/apache/hive-site/pull/2
>>>
>>> I kindly request that someone help review this PR.
>>>
>>> Until the PR is merged, you can find the updated website here . Please
>>> have a look and any feedback is most welcome :)
>>> https://simhadri-g.github.io/hive-site/
>>>
>>> Few other things to note:
>>>
>>>- We will need help from someone who has write access to hive-site
>>>repo to update the github workflow once PR is merged.
>>>- One more important question, I came across this (
>>>https://hive.apache.org/people.html ) page, while moving the .md
>>>file to the new website, which lists the current PMC members and committers
>>>of Hive. I noticed that this list is not up to date; a lot of people seem
>>>to be missing from it. May I please know where I can find the up-to-date
>>>list of committers and PMC members which I can refer to when updating the page.
>>>- Lastly, 

[jira] [Created] (HIVE-26913) HiveVectorizedReader::parquetRecordReader should reuse footer information

2023-01-09 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-26913:
---

 Summary: HiveVectorizedReader::parquetRecordReader should reuse 
footer information
 Key: HIVE-26913
 URL: https://issues.apache.org/jira/browse/HIVE-26913
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: Rajesh Balamohan


HiveVectorizedReader::parquetRecordReader should reuse details of parquet 
footer, instead of reading it again.

 

It reads parquet footer here:

[https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L230-L232]

It reads the footer again here when constructing the vectorized record reader:

[https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L249]

 

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java#L50]

 

Check the codepath of 
VectorizedParquetRecordReader::setupMetadataAndParquetSplit

[https://github.com/apache/hive/blob/6b0139188aba6a95808c8d1bec63a651ec9e4bdc/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L180]

 

It should be possible to share "ParquetMetadata" in 
VectorizedParquetRecordReader.
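
The intended reuse can be sketched generically: read the footer at most once per 
path and hand the cached metadata to every later consumer. Everything below (the 
Footer stand-in, the cache class) is illustrative and not the actual Hive/Parquet 
code, which would cache ParquetMetadata instead:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FooterCache {
    static int reads = 0;                          // counts actual footer reads

    // Stand-in for org.apache.parquet.hadoop.metadata.ParquetMetadata.
    record Footer(String path) {}

    private final Map<String, Footer> cache = new ConcurrentHashMap<>();

    Footer footerFor(String path) {
        // computeIfAbsent: the expensive footer read happens at most once per
        // path; later callers (e.g. the record reader) get the cached copy.
        return cache.computeIfAbsent(path, p -> { reads++; return new Footer(p); });
    }

    public static void main(String[] args) {
        FooterCache c = new FooterCache();
        c.footerFor("/warehouse/t/part-0.parquet");
        c.footerFor("/warehouse/t/part-0.parquet"); // served from cache
        System.out.println(reads);
    }
}
```

Applied to the ticket, this would mean HiveVectorizedReader passing its 
already-parsed metadata into the vectorized record reader instead of letting it 
re-read the file footer.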

 





[jira] [Created] (HIVE-26912) Publish SBOM artifacts

2023-01-09 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created HIVE-26912:


 Summary: Publish SBOM artifacts
 Key: HIVE-26912
 URL: https://issues.apache.org/jira/browse/HIVE-26912
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun








Improving branch-3 Build Times

2023-01-06 Thread Chris Nauroth
For those of you working on both master and branch-3, you may have noticed
that it takes a crazy long time to complete a full build on branch-3. In my
environment, running this...

mvn -B -T 8 -Pitests clean install -DskipTests

...it completes in ~8 minutes on master, but ~40 minutes on branch-3. The
long haul is modules like standalone-metastore that use
maven-assembly-plugin to build the artifact. At least part of the root
cause is that branch-3 is using an older version of maven-assembly-plugin
with sub-optimal file system access patterns.

I have a work in progress patch here:

https://github.com/apache/hive/pull/3924

This is just a draft, not ready for review, but so far I've been able to
cut my branch-3 build times back down to ~13 minutes. I might not be able
to polish this off and request review until late next week, so I'm sharing
it early in case it makes life easier for anyone else on branch-3.

Chris Nauroth


[jira] [Created] (HIVE-26911) Renaming a translated external table with a specified location fails with 'location already exists' exception

2023-01-05 Thread Sai Hemanth Gantasala (Jira)
Sai Hemanth Gantasala created HIVE-26911:


 Summary: Renaming a translated external table with a specified 
location fails with 'location already exists' exception
 Key: HIVE-26911
 URL: https://issues.apache.org/jira/browse/HIVE-26911
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Sai Hemanth Gantasala
Assignee: Sai Hemanth Gantasala


Renaming a translated external table with a specified location fails with 
'location already exists' exception.
Below are the steps to reproduce:
{code:java}
create database tmp;
use tmp;
create table b(s string) stored as parquet location 
'hdfs://localhost:20500/test-warehouse/tmp.db/some_location';
alter table b rename to bb;
ERROR: InvalidOperationException: New location for this table hive.tmp.bb 
already exists : hdfs://localhost:20500/test-warehouse/tmp.db/some_location 
{code}





[jira] [Created] (HIVE-26910) Backport HIVE-19104: Use independent warehouse directories in test metastores.

2023-01-05 Thread Chris Nauroth (Jira)
Chris Nauroth created HIVE-26910:


 Summary: Backport HIVE-19104: Use independent warehouse 
directories in test metastores.
 Key: HIVE-26910
 URL: https://issues.apache.org/jira/browse/HIVE-26910
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Chris Nauroth
Assignee: Chris Nauroth


{{TestHS2ImpersonationWithRemoteMS}} fails on branch-3. It makes assertions 
about the state of the warehouse directory, but it doesn't account for a part 
of metastore initialization that updates the warehouse directory to 
parameterize it by port number for test isolation.

{{MetaStoreTestUtils#startMetaStoreWithRetry}} sets the warehouse directory as 
the new {{metastore.warehouse.dir}} property. 
{{AbstractHiveService#get/setWareHouseDir}} later works with the deprecated 
{{hive.metastore.warehouse.dir}} property. {{MetastoreConf}} will take care of 
resolving requests for the new property to values under the old property, but 
not vice versa.

On master, HIVE-19104 included an additional line in {{MiniHs2}} to make sure 
these 2 properties would stay in sync for test runs. This issue tracks a 
slightly modified backport of that patch to branch-3.
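
The idea of that one-line fix can be sketched with a plain map standing in for 
the Hive configuration object (the key names are the real property names quoted 
above; the helper itself is hypothetical, not the MiniHs2 code):

```java
import java.util.HashMap;
import java.util.Map;

public class WarehouseSync {
    static final String NEW_KEY = "metastore.warehouse.dir";
    static final String OLD_KEY = "hive.metastore.warehouse.dir";

    // Mirror the new key into the deprecated alias so code reading either
    // property sees the same per-test warehouse directory.
    static void setWarehouseDir(Map<String, String> conf, String dir) {
        conf.put(NEW_KEY, dir);
        conf.put(OLD_KEY, dir);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setWarehouseDir(conf, "/tmp/warehouse_12345"); // port-parameterized dir
        System.out.println(conf.get(OLD_KEY));
    }
}
```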






[jira] [Created] (HIVE-26909) Backport of HIVE-20715: Disable test:  udaf_histogram_numeric

2023-01-04 Thread Aman Raj (Jira)
Aman Raj created HIVE-26909:
---

 Summary: Backport of HIVE-20715: Disable test:  
udaf_histogram_numeric
 Key: HIVE-26909
 URL: https://issues.apache.org/jira/browse/HIVE-26909
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj








[jira] [Created] (HIVE-26908) Disable Initiator on HMS instance at the same time enable Cleaner thread

2023-01-04 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-26908:
-

 Summary: Disable Initiator on HMS instance at the same time enable 
Cleaner thread
 Key: HIVE-26908
 URL: https://issues.apache.org/jira/browse/HIVE-26908
 Project: Hive
  Issue Type: New Feature
  Components: Standalone Metastore
Reporter: Taraka Rama Rao Lethavadla
Assignee: Taraka Rama Rao Lethavadla


In the current implementation, both Initiator and Cleaner are either enabled or 
disabled using the same config 
{noformat}
hive.compactor.initiator.on{noformat}
So there is no way to selectively disable the Initiator while enabling the 
Cleaner, or vice versa.

The proposal is to introduce another config, such as hive.compactor.cleaner.on, 
to control the Cleaner thread alone.
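
A sketch of how two independent flags could gate the two threads. Note that 
hive.compactor.cleaner.on is the proposed (not yet existing) key, and the helper 
below is illustrative, not HMS startup code; it defaults the new key to the 
Initiator flag to preserve today's behavior when the key is unset:

```java
import java.util.Map;

public class CompactorGating {
    // Decide which compactor threads to start from config flags.
    static String threadsToStart(Map<String, String> conf) {
        boolean initiatorOn = Boolean.parseBoolean(
                conf.getOrDefault("hive.compactor.initiator.on", "false"));
        // Proposed key falls back to the initiator flag when unset.
        boolean cleanerOn = Boolean.parseBoolean(
                conf.getOrDefault("hive.compactor.cleaner.on",
                        String.valueOf(initiatorOn)));
        StringBuilder sb = new StringBuilder();
        if (initiatorOn) sb.append("initiator");
        if (cleanerOn) {
            if (sb.length() > 0) sb.append(",");
            sb.append("cleaner");
        }
        return sb.length() == 0 ? "none" : sb.toString();
    }

    public static void main(String[] args) {
        // Cleaner enabled alone, as the ticket requests.
        System.out.println(threadsToStart(
                Map.of("hive.compactor.cleaner.on", "true")));
    }
}
```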





[jira] [Created] (HIVE-26907) Backport of HIVE-20741: Disable udaf_context_ngrams.q and udaf_corr.q tests

2023-01-04 Thread Aman Raj (Jira)
Aman Raj created HIVE-26907:
---

 Summary: Backport of HIVE-20741: Disable udaf_context_ngrams.q and 
udaf_corr.q tests
 Key: HIVE-26907
 URL: https://issues.apache.org/jira/browse/HIVE-26907
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj








[jira] [Created] (HIVE-26906) Backport of HIVE-19313 to branch-3 : TestJdbcWithDBTokenStoreNoDoAs tests are failing

2023-01-04 Thread Aman Raj (Jira)
Aman Raj created HIVE-26906:
---

 Summary: Backport of HIVE-19313 to branch-3 : 
TestJdbcWithDBTokenStoreNoDoAs tests are failing
 Key: HIVE-26906
 URL: https://issues.apache.org/jira/browse/HIVE-26906
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


# org.apache.hive.minikdc.TestJdbcWithDBTokenStoreNoDoAs
ERROR :
{{Error Could not open client transport with JDBC Uri: 
jdbc:hive2://localhost:42959/default;auth=delegationToken: Peer indicated 
failure: DIGEST-MD5: IO error acquiring password Stacktrace 
java.sql.SQLException: Could not open client transport with JDBC Uri: 
jdbc:hive2://localhost:42959/default;auth=delegationToken: Peer indicated 
failure: DIGEST-MD5: IO error acquiring password at 
org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:269) at 
org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107) at 
java.sql.DriverManager.getConnection(DriverManager.java:664) at 
java.sql.DriverManager.getConnection(DriverManager.java:270) at 
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testTokenAuth(TestJdbcWithMiniKdc.java:172)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)}}





[jira] [Created] (HIVE-26905) Backport HIVE-25173 to 3.2.0: Exclude pentaho-aggdesigner-algorithm from upgrade-acid build.

2023-01-04 Thread Chris Nauroth (Jira)
Chris Nauroth created HIVE-26905:


 Summary: Backport HIVE-25173 to 3.2.0: Exclude 
pentaho-aggdesigner-algorithm from upgrade-acid build.
 Key: HIVE-26905
 URL: https://issues.apache.org/jira/browse/HIVE-26905
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Chris Nauroth
Assignee: Chris Nauroth


In the current branch-3, upgrade-acid has a dependency on an old hive-exec 
version that has a transitive dependency to 
org.pentaho:pentaho-aggdesigner-algorithm. This artifact is no longer available 
in commonly supported Maven repositories, which causes a build failure. We can 
safely exclude the dependency, as was originally done in HIVE-25173.
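
For reference, the usual shape of such an exclusion in a pom.xml; the enclosing 
dependency coordinates and version property are illustrative, and only the 
excluded artifact is taken from HIVE-25173:

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive-exec.version}</version>
  <exclusions>
    <exclusion>
      <!-- No longer resolvable from commonly supported Maven repositories. -->
      <groupId>org.pentaho</groupId>
      <artifactId>pentaho-aggdesigner-algorithm</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```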





[jira] [Created] (HIVE-26904) QueryCompactor failed in commitCompaction if the tmp table dir is already removed

2023-01-04 Thread Quanlong Huang (Jira)
Quanlong Huang created HIVE-26904:
-

 Summary: QueryCompactor failed in commitCompaction if the tmp 
table dir is already removed 
 Key: HIVE-26904
 URL: https://issues.apache.org/jira/browse/HIVE-26904
 Project: Hive
  Issue Type: Bug
Reporter: Quanlong Huang
Assignee: Quanlong Huang


commitCompaction() of query-based compactions just removes the dirs of tmp 
tables. It should not fail the compaction if those dirs have already been removed.
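
The fix's idea, sketched with java.nio rather than Hive's FileSystem API (the 
helper name is hypothetical): treat a missing tmp dir as already cleaned up 
instead of letting the exception fail the compaction.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class TolerantCleanup {
    // Returns true if the dir is gone afterwards, whether or not we deleted it.
    static boolean removeTmpDirIfPresent(Path dir) throws IOException {
        try {
            Files.delete(dir);
            return true;
        } catch (NoSuchFileException e) {
            // Someone else (e.g. a session cleanup) removed it first - fine.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path d = Files.createTempDirectory("tmp_space");
        System.out.println(removeTmpDirIfPresent(d)); // deletes it
        System.out.println(removeTmpDirIfPresent(d)); // already gone, still ok
    }
}
```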

We've seen such a failure in Impala's test (IMPALA-11756):
{noformat}
2023-01-02T02:09:26,306  INFO [HiveServer2-Background-Pool: Thread-695] 
ql.Driver: Executing 
command(queryId=jenkins_20230102020926_69112755-b783-4214-89e5-1c7111dfe15f): 
alter table partial_catalog_info_test.insert_only_partitioned partition 
(part=1) compact 'minor' and wait
2023-01-02T02:09:26,306  INFO [HiveServer2-Background-Pool: Thread-695] 
ql.Driver: Starting task [Stage-0:DDL] in serial mode
2023-01-02T02:09:26,317  INFO [HiveServer2-Background-Pool: Thread-695] 
exec.Task: Compaction enqueued with id 15
...
2023-01-02T02:12:55,849 ERROR 
[impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] 
compactor.Worker: Caught exception while trying to compact 
id:15,dbname:partial_catalog_info_test,tableName:insert_only_partitioned,partName:part=1,state:^@,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:jenkins,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId:
 null,initiatorId: null,retryRetention0. Marking failed to avoid repeated 
failures
java.io.FileNotFoundException: File 
hdfs://localhost:20500/tmp/hive/jenkins/092b533a-81c8-4b95-88e4-9472cf6f365d/_tmp_space.db/62ec04fb-e2d2-4a99-a454-ae709a3cccfe
 does not exist.
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 ~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
 ~[hadoop-hdfs-client-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144) 
~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
        at org.apache.hadoop.fs.FileSystem$5.(FileSystem.java:2302) 
~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
        at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:2299) 
~[hadoop-common-3.1.1.7.2.15.4-6.jar:?]
        at 
org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor$Util.cleanupEmptyDir(QueryCompactor.java:261)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
        at 
org.apache.hadoop.hive.ql.txn.compactor.MmMinorQueryCompactor.commitCompaction(MmMinorQueryCompactor.java:72)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
        at 
org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:146)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
        at 
org.apache.hadoop.hive.ql.txn.compactor.MmMinorQueryCompactor.runCompaction(MmMinorQueryCompactor.java:63)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
        at 
org.apache.hadoop.hive.ql.txn.compactor.Worker.findNextCompactionAndExecute(Worker.java:435)
 ~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
        at 
org.apache.hadoop.hive.ql.txn.compactor.Worker.lambda$run$0(Worker.java:115) 
~[hive-exec-3.1.3000.2022.0.13.0-60.jar:3.1.3000.2022.0.13.0-60]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_261]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_261]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_261]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
2023-01-02T02:12:55,858  INFO 
[impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] 
compactor.Worker: Deleting result directories created by the 
compactor:2023-01-02T02:12:55,858  INFO 
[impala-ec2-centos79-m6i-4xlarge-ondemand-1428.vpc.cloudera.com-48_executor] 
compactor.Worker: 
hdfs://localhost:20500/test-warehouse/managed/partial_catalog_info_test.db/insert_only_partitioned/part=1/delta_001_003_v0001827
2023-01-02T02:12:55,859  INFO 

[jira] [Created] (HIVE-26903) Compactor threads should gracefully shutdown

2023-01-03 Thread Jira
László Végh created HIVE-26903:
--

 Summary: Compactor threads should gracefully shutdown
 Key: HIVE-26903
 URL: https://issues.apache.org/jira/browse/HIVE-26903
 Project: Hive
  Issue Type: Improvement
Reporter: László Végh


Currently the compactor threads are daemon threads, which means the JVM will 
not wait for these threads to finish. (see: 
[https://github.com/apache/hive/blob/431e7d9e5431a808106d8db81e11aea74f040da5/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorThread.java#L81)]
 As a result during system shutdown, JVM may close all daemon threads abruptly 
(JVM won't wait for a thread to sleep/wait, and no InterruptedException is 
thrown), so the threads don't have any chance to shutdown gracefully. This can 
lead to inconsistent/corrupted state in the Metastore or on the File system.

Make the compactor threads user threads, and handle shutdown accordingly. Make 
sure interrupts and InterruptedException are handled properly.
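
The proposed pattern, in a minimal self-contained sketch (names are illustrative, 
not the actual CompactorThread code): a non-daemon worker that exits its loop on 
interrupt, so the cleanup code after the loop actually runs before the JVM exits.

```java
public class GracefulWorker {
    static int processed = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                processed++;            // one unit of (stand-in) compaction work
                try {
                    Thread.sleep(10);   // poll interval
                } catch (InterruptedException e) {
                    // Restore the flag and fall out of the loop cleanly.
                    Thread.currentThread().interrupt();
                }
            }
            // Cleanup that an abruptly-killed daemon thread would never reach.
            System.out.println("worker shut down after " + processed + " units");
        });
        worker.setDaemon(false);        // user thread: the JVM waits for it
        worker.start();
        Thread.sleep(50);
        worker.interrupt();             // graceful stop request
        worker.join();                  // wait for the thread to finish
    }
}
```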





[jira] [Created] (HIVE-26902) Failed to close AbstractFileMergeOperator

2023-01-03 Thread zhenkuan_zhang (Jira)
zhenkuan_zhang created HIVE-26902:
-

 Summary: Failed to close AbstractFileMergeOperator
 Key: HIVE-26902
 URL: https://issues.apache.org/jira/browse/HIVE-26902
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 3.1.2
 Environment: hadoop:3.2.1
hive:3.1.2
spark:2.4.6
hive on spark
Reporter: zhenkuan_zhang


When I set hive.merge.sparkfiles to true, an error is sometimes reported while 
SQL is running. The error log is as follows:

org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close 
AbstractFileMergeOperator
at 
org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:115)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
at 
org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:96)
at 
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
at scala.collection.Iterator$class.foreach(Iterator.scala:891)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
at 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
at 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
at 
org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:2212)
at 
org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:2212)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close 
AbstractFileMergeOperator
at 
org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:315)
at 
org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:265)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:113)
... 17 more
Caused by: java.io.IOException: Unable to rename 
hdfs://olapCluster/user/hive/warehouse/bi_dw.db/kpy_sfc_fyd_parts_d74_hour_temp/.hive-staging_hive_2023-01-03_13-15-16_144_4347904191947316325-50073/_task_tmp.-ext-1/_tmp.03_0
 to 
hdfs://olapCluster/user/hive/warehouse/bi_dw.db/sfc__temp/.hive-staging_hive_2023-01-03_13-15-16_144_4347904191947316325-50073/_tmp.-ext-1/03_0
at 
org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:254)
... 19 more










[jira] [Created] (HIVE-26901) Add metrics on transactions in replication metrics table

2023-01-02 Thread Amit Saonerkar (Jira)
Amit Saonerkar created HIVE-26901:
-

 Summary: Add metrics on transactions in replication metrics table 
 Key: HIVE-26901
 URL: https://issues.apache.org/jira/browse/HIVE-26901
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Amit Saonerkar
Assignee: Amit Saonerkar


This is related to the corresponding 
[https://jira.cloudera.com/browse/CDPD-17985?filter=-1]

We need to enhance the replication metrics table by adding information 
related to transactions during REPL DUMP/LOAD operations. The basic idea is 
to give the user a picture of how transactions are progressing during dump 
and load operations.





[jira] [Created] (HIVE-26900) Error message not representing the correct line number with a syntax error in a HQL File

2023-01-02 Thread Vikram Ahuja (Jira)
Vikram Ahuja created HIVE-26900:
---

 Summary: Error message not representing the correct line number 
with a syntax error in a HQL File
 Key: HIVE-26900
 URL: https://issues.apache.org/jira/browse/HIVE-26900
 Project: Hive
  Issue Type: Bug
Reporter: Vikram Ahuja


When wrong syntax is present in an HQL file, the error thrown by Beeline while 
running the file reports the wrong line number. Both the line number and the 
position are incorrect. It seems the parser does not account for spaces and new 
lines, and always reports the error on line 1 irrespective of which line the 
error is actually on in the HQL file.





[jira] [Created] (HIVE-26899) Upgrade arrow to 0.11.0 in branch-3

2023-01-02 Thread Aman Raj (Jira)
Aman Raj created HIVE-26899:
---

 Summary: Upgrade arrow to 0.11.0 in branch-3
 Key: HIVE-26899
 URL: https://issues.apache.org/jira/browse/HIVE-26899
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj








[jira] [Created] (HIVE-26898) Split Notification logging so that busy clusters can have better performance

2023-01-02 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-26898:
-

 Summary: Split Notification logging so that busy clusters can have 
better performance
 Key: HIVE-26898
 URL: https://issues.apache.org/jira/browse/HIVE-26898
 Project: Hive
  Issue Type: New Feature
Reporter: Taraka Rama Rao Lethavadla


DDL & DML events are logged into the notification log table, and those get 
cleaned as soon as their TTL expires.

In most busy clusters, the notification log keeps growing even though the 
cleaner is running and continuously cleaning events. This means the rate of 
Hive DB operations is very high compared to the rate at which cleaning happens.

As a result, any query on this table becomes a bottleneck at the backend DB, 
causing slow responses.

The proposal is to split the notification log table into multiple tables, such as:

notification_log_dml - for all DML queries

notification_log_insert - for all insert queries

..

etc.

 

This way, load on that single table gets reduced, improving the performance of 
the backend DB as well as Hive.





[jira] [Created] (HIVE-26897) Provide a command/tool to recover data in ACID table when table data got corrupted with invalid/junk delta/delete_delta folders

2023-01-02 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-26897:
-

 Summary: Provide a command/tool to recover data in ACID table when 
table data got corrupted with invalid/junk delta/delete_delta folders 
 Key: HIVE-26897
 URL: https://issues.apache.org/jira/browse/HIVE-26897
 Project: Hive
  Issue Type: New Feature
Reporter: Taraka Rama Rao Lethavadla


Example: A table has below directories
{noformat}
drwx-- - hive hive 0 2022-11-05 19:43 
/data/warehouse/tbl/delete_delta_0080483_0087704_v0973185
drwx-- - hive pdl_prod_nosh_jsin 0 2022-12-05 00:18 
/data/warehouse/tbl/delete_delta_0080483_0088384_v507{noformat}
When we read data from this table, we get below errors
{noformat}
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: 
Duplicate key null (attempted merging values 
org.apache.hadoop.hive.ql.io.AcidInputFormat$DeltaFileMetaData@41776cd9 and 
org.apache.hadoop.hive.ql.io.AcidInputFormat$DeltaFileMetaData@1404a054){noformat}
delete_delta_0080483_0087704_v0973185 and delete_delta_0080483_0088384_v507 were 
created as part of minor compaction. In general, once a minor compaction has 
completed, the next minor compaction picks a min_writeId greater than the 
max_writeId of the previous compaction. In this case, however, both 
minor-compacted directories have the same min_writeId (i.e. 0080483).

To mitigate the issue, we had to remove those directories manually from HDFS, 
create a fresh table from the remaining data, drop the actual table, and rename 
the fresh table to the actual table name.

*Proposal*

Create a tool/command that reads the data from the corrupted ACID table and 
recovers it before we make any changes to the underlying data, so that we can 
work around the problem by creating another table with the same data.
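
A first building block for such a tool could be detecting the corruption pattern 
described above. This is a hypothetical sketch, not Hive code: it parses 
delta-directory names of the form delete_delta_<min>_<max>_v<visibility> and 
reports min writeIds that appear more than once.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DeltaOverlapCheck {
    // Returns min-writeIds shared by more than one delta directory name.
    static Set<Long> duplicateMinWriteIds(List<String> dirNames) {
        Map<Long, Integer> counts = new HashMap<>();
        for (String name : dirNames) {
            // Names look like delete_delta_0080483_0087704_v0973185,
            // so the min writeId is the third underscore-separated field.
            String[] parts = name.split("_");
            long min = Long.parseLong(parts[2]);
            counts.merge(min, 1, Integer::sum);
        }
        Set<Long> dups = new TreeSet<>();
        counts.forEach((k, v) -> { if (v > 1) dups.add(k); });
        return dups;
    }

    public static void main(String[] args) {
        System.out.println(duplicateMinWriteIds(List.of(
                "delete_delta_0080483_0087704_v0973185",
                "delete_delta_0080483_0088384_v507"))); // prints [80483]
    }
}
```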





[jira] [Created] (HIVE-26896) Backport of Test fixes for lineage3.q and load_static_ptn_into_bucketed_table.q

2023-01-02 Thread Aman Raj (Jira)
Aman Raj created HIVE-26896:
---

 Summary: Backport of Test fixes for lineage3.q and 
load_static_ptn_into_bucketed_table.q
 Key: HIVE-26896
 URL: https://issues.apache.org/jira/browse/HIVE-26896
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


These tests were fixed in branch-3.1, so we are backporting the fixes to branch-3.





[jira] [Created] (HIVE-26895) Backport of HIVE-22899: Make sure qtests clean up copied files from test directories

2023-01-01 Thread Aman Raj (Jira)
Aman Raj created HIVE-26895:
---

 Summary: Backport of HIVE-22899: Make sure qtests clean up copied 
files from test directories
 Key: HIVE-26895
 URL: https://issues.apache.org/jira/browse/HIVE-26895
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


Tests (like avrotblsjoin.q) are failing due to the following errors:
{code:java}
Begin query: avrotblsjoin.qTRACE StatusLogger Log4jLoggerFactory.getContext() 
found anchor class org.apache.hadoop.hive.cli.CliDriverTRACE StatusLogger 
Log4jLoggerFactory.getContext() found anchor class 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzerTRACE StatusLogger 
Log4jLoggerFactory.getContext() found anchor class 
org.apache.curator.RetryLoopTRACE StatusLogger Log4jLoggerFactory.getContext() 
found anchor class org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzercp: 
`/home/jenkins/agent/workspace/hive-precommit_PR-3859/itests/qtest/target/tmp/table1.avsc':
 File existsDone query avrotblsjoin.q. succeeded=false, skipped=false. 
ElapsedTime(ms)=41TRACE StatusLogger Log4jLoggerFactory.getContext() found 
anchor class org.apache.curator.RetryLoop {code}





[jira] [Created] (HIVE-26894) After using scratchdir for staging final job, CTAS and IOW on ACID tables are failing with wrongFS exception

2022-12-30 Thread Sruthi M (Jira)
Sruthi M created HIVE-26894:
---

 Summary: After using scratchdir for staging final job, CTAS and 
IOW on ACID tables are failing with wrongFS exception
 Key: HIVE-26894
 URL: https://issues.apache.org/jira/browse/HIVE-26894
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Sruthi M


ERROR : Failed with exception Wrong FS: 
abfs:///hive/warehouse/managed/tpcds_orc.db/test_sales/delta_001_001_,
 expected: hdfs://mycluster





[jira] [Created] (HIVE-26893) Extend get partitions APIs to ignore partition schemas

2022-12-28 Thread Quanlong Huang (Jira)
Quanlong Huang created HIVE-26893:
-

 Summary: Extend get partitions APIs to ignore partition schemas
 Key: HIVE-26893
 URL: https://issues.apache.org/jira/browse/HIVE-26893
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Quanlong Huang


There are several HMS APIs that return a list of partitions, e.g. 
get_partitions_ps(), get_partitions_by_names(), add_partitions_req() with 
needResult=true, etc. Each partition instance will have a unique list of 
FieldSchemas as the partition schema:
{code:java}
org.apache.hadoop.hive.metastore.api.Partition
-> org.apache.hadoop.hive.metastore.api.StorageDescriptor
   ->  cols: list {code}
This can create a large memory footprint for wide tables (e.g. with 2k cols). 
See the heap histogram in IMPALA-11812 as an example.

Some engines, like Impala, don't actually use/respect the partition-level 
schema, so it's a waste of network/serde resources to transmit it. It would be 
nice if these APIs provided an optional boolean flag to ignore partition 
schemas, so HMS clients (e.g. Impala) don't need to clear them afterwards (to 
save memory).
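To see why the footprint grows so quickly: every returned partition carries its own copy of the full column list, so the number of duplicated FieldSchema objects is roughly partitions × columns. A toy illustration of that arithmetic (the numbers are hypothetical):

```java
public class PartitionSchemaFootprint {
    // Rough count of duplicated FieldSchema objects returned by one
    // get-partitions call: a full column list per partition.
    static long duplicatedFieldSchemas(long partitions, long columns) {
        return partitions * columns;
    }

    public static void main(String[] args) {
        // e.g. fetching 10k partitions of a 2k-column table materializes
        // about 20 million FieldSchema objects on the client.
        System.out.println(duplicatedFieldSchemas(10_000, 2_000));
    }
}
```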





[jira] [Created] (HIVE-26892) Backport HIVE-25243: Handle nested values in null struct.

2022-12-28 Thread Chris Nauroth (Jira)
Chris Nauroth created HIVE-26892:


 Summary: Backport HIVE-25243: Handle nested values in null struct.
 Key: HIVE-26892
 URL: https://issues.apache.org/jira/browse/HIVE-26892
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers
Reporter: Chris Nauroth
Assignee: Chris Nauroth


On branch-3, we've seen a failure in {{TestArrowColumnarBatchSerDe}} while 
trying to serialize a row of null values. It fails while trying to serialize 
the fields of a null struct. This was fixed in 4.0 by HIVE-25243. This issue 
tracks a backport to branch-3.





[jira] [Created] (HIVE-26891) Fix TestArrowColumnarBatchSerDe test failures in branch-3

2022-12-27 Thread Raghav Aggarwal (Jira)
Raghav Aggarwal created HIVE-26891:
--

 Summary: Fix TestArrowColumnarBatchSerDe test failures in branch-3
 Key: HIVE-26891
 URL: https://issues.apache.org/jira/browse/HIVE-26891
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal








[jira] [Created] (HIVE-26890) Disable TestSSL (Done as part of HIVE-21456 in oss/master)

2022-12-27 Thread Aman Raj (Jira)
Aman Raj created HIVE-26890:
---

 Summary: Disable TestSSL (Done as part of HIVE-21456 in oss/master)
 Key: HIVE-26890
 URL: https://issues.apache.org/jira/browse/HIVE-26890
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


TestSSL fails with the following error (this happens in the Hive-3.1.3 release 
also, so disabling this test) :
{code:java}
[ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 23.143 
s <<< FAILURE! - in org.apache.hive.jdbc.TestSSL
[ERROR] testConnectionWrongCertCN(org.apache.hive.jdbc.TestSSL)  Time elapsed: 
0.64 s  <<< FAILURE!
java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:86)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at org.junit.Assert.assertTrue(Assert.java:52)
        at 
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN(TestSSL.java:408)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
        at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
        at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
        at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26889) Implement array_join udf to concatenate the elements of an array with a specified delimiter

2022-12-24 Thread Taraka Rama Rao Lethavadla (Jira)
Taraka Rama Rao Lethavadla created HIVE-26889:
-

 Summary: Implement array_join udf to concatenate the elements of 
an array with a specified delimiter
 Key: HIVE-26889
 URL: https://issues.apache.org/jira/browse/HIVE-26889
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Taraka Rama Rao Lethavadla








[jira] [Created] (HIVE-26888) Hive gives empty results with partition column filter for hive parquet table when data loaded through spark dataframe

2022-12-23 Thread Indhumathi Muthumurugesh (Jira)
Indhumathi Muthumurugesh created HIVE-26888:
---

 Summary: Hive gives empty results with partition column filter for 
hive parquet table when data loaded through spark dataframe
 Key: HIVE-26888
 URL: https://issues.apache.org/jira/browse/HIVE-26888
 Project: Hive
  Issue Type: Bug
Reporter: Indhumathi Muthumurugesh








[jira] [Created] (HIVE-26887) Make sure cacheDirPath in QueryResultsCache has the correct permissions

2022-12-23 Thread Zhang Dongsheng (Jira)
Zhang Dongsheng created HIVE-26887:
--

 Summary: Make sure cacheDirPath in QueryResultsCache has the 
correct permissions
 Key: HIVE-26887
 URL: https://issues.apache.org/jira/browse/HIVE-26887
 Project: Hive
  Issue Type: Improvement
Reporter: Zhang Dongsheng








[jira] [Created] (HIVE-26886) Backport of HIVE-23621 Enforce ASF headers on source files

2022-12-22 Thread Aman Raj (Jira)
Aman Raj created HIVE-26886:
---

 Summary: Backport of HIVE-23621 Enforce ASF headers on source files
 Key: HIVE-26886
 URL: https://issues.apache.org/jira/browse/HIVE-26886
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


Cherry pick this commit to branch-3





[jira] [Created] (HIVE-26885) Iceberg: Parquet Vectorized V2 reads fails with NPE

2022-12-22 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-26885:
---

 Summary: Iceberg: Parquet Vectorized V2 reads fails with NPE
 Key: HIVE-26885
 URL: https://issues.apache.org/jira/browse/HIVE-26885
 Project: Hive
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


If the Iceberg Parquet table ends up with an empty batch, fetching the row 
number used for filtering leads to an NPE.

The row-number-to-block mapping is only built when the parquetSplit isn't null:
{code:java}
if (parquetInputSplit != null) {
  initialize(parquetInputSplit, conf);
} {code}
Otherwise the row numbers aren't initialized, so we should skip fetching them later.
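A minimal sketch of the proposed guard, with hypothetical field and method names (the real reader's structure differs): return early instead of dereferencing the uninitialized row-number mapping.

```java
public class RowNumberGuard {
    private long[] rowNumbers; // stays null when no Parquet split was initialized

    // Stand-in for the conditional initialize(parquetInputSplit, conf) call
    // quoted above: the mapping is only built when a split is present.
    void initialize(boolean haveParquetSplit) {
        if (haveParquetSplit) {
            rowNumbers = new long[] {0L}; // placeholder for the real mapping
        }
    }

    // Guarded fetch: skip row-number lookup when nothing was initialized,
    // instead of hitting the NPE described in this issue.
    Long nextRowNumber() {
        if (rowNumbers == null) return null;
        return rowNumbers[0];
    }

    public static void main(String[] args) {
        RowNumberGuard reader = new RowNumberGuard();
        reader.initialize(false);             // empty batch, no split
        System.out.println(reader.nextRowNumber()); // null, no NPE
    }
}
```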





[jira] [Created] (HIVE-26884) Iceberg: V2 Vectorization returns wrong results with deletes

2022-12-21 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-26884:
---

 Summary: Iceberg: V2 Vectorization returns wrong results with 
deletes
 Key: HIVE-26884
 URL: https://issues.apache.org/jira/browse/HIVE-26884
 Project: Hive
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


In Iceberg V2 reads, if delete files are present and some Parquet blocks are 
skipped, the row-number calculation goes wrong, which leads to a mismatch with 
the delete filter's row positions and hence to wrong results.





Re: Lock branch-3 in order for PR build to run successfully.

2022-12-21 Thread Stamatis Zampetakis
Hello,

I don't believe a lock is necessary. I think that people with write access
to the repository already know the processes and how to behave.
If someone decides to push a commit to the repo without running pre-commit
tests there should be a good reason to do so.
I am hoping that circumventing the usual workflow should be a rather rare
event.

Best,
Stamatis

On Tue, Dec 20, 2022 at 8:50 AM Aman Raj 
wrote:

> Hi community,
>
> I see a couple of commits that went in directly to branch-3 before setting
> up the Jenkins pipeline for branch-3. To prevent this, can we lock the
> branch-3 of Hive in order to provide PR's the only way to merge commits in
> branch-3.
>
> Can someone help me in locking branch-3 so that we have a clean release
> process. I do not have the access to do it.
>
> Thanks,
> Aman.
> 
> From: Aman Raj 
> Sent: Friday, December 9, 2022 9:33 AM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> pipeline
>
> Thanks Pravin for your support. Can someone please help me merge this PR
> to branch-3 HIVE-26816 : Add Jenkins file for branch-3 by amanraj2520 ·
> Pull Request #3841 · apache/hive (github.com)<
> https://github.com/apache/hive/pull/3841>.
> I do not have access to do that. Then we will start development on it.
>
> Thanks,
> Aman.
>
>
> 
> From: Pravin Sinha 
> Sent: Friday, December 9, 2022 1:55 AM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> pipeline
>
>
> Hi Aman,
>  I also think that we can merge the PR to enable the test pipeline if the
> change looks fine and subsequently we can fix the tests to bring it to
> green state (hopefully by cherry picking a few commits from branch-3.1
> which is already in green state) . Looks like currently the tests are
> broken in branch-3.
>
> Thanks,
> Pravin
>
> On Thu, Dec 8, 2022 at 3:59 PM Aman Raj 
> wrote:
>
> > Hi team,
> >
> > For the addition of Jenkins file for branch-3, branch-3 has some existing
> > tests failing which was because Jenkins was not running on branch-3. We
> are
> > planning to merge this Jenkins file irrespective of this PR having test
> > failures, since this does not change the code. We will create separate
> > tasks for ensuring that branch-3 has a green build.
> >
> > Link to the PR :
> https://github.com/apache/hive/pull/3841
> >
> > Fyi, branch-3.1 has a green build.
> >
> > Thanks,
> > Aman.
> > 
> > From: Aman Raj 
> > Sent: Wednesday, December 7, 2022 3:19 PM
> > To: dev@hive.apache.org 
> > Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> > pipeline
> >
> > Hi Ayush,
> >
> > Thanks for clarifying. Will wait for it to turn green.
> >
> > Thanks,
> > Aman.
> > 
> > From: Ayush Saxena 
> > Sent: Wednesday, December 7, 2022 3:11 PM
> > To: dev@hive.apache.org 
> > Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> > pipeline
> >
> > Hi Aman,
> > The build is already running for your PR:
> >
> >
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3841/1/pipeline
> >
> > The JenkinsFile is picked from the PR while running rather than the
> target
> > branch.
> >
> > -Ayush
> >
> > > On 07-Dec-2022, at 3:03 PM, Aman Raj 
> > wrote:
> > >
> > > Hi Stamatis,
> > >
> > > How can we ensure that unless the PR is merged. Please suggest.
> > > I was thinking of merging this and raising a sample PR on branch-3 to
> > check whether it works or not. Is there some other way?
> > >
> > > Thanks,
> > > Aman.
> > > 
> > > From: Stamatis Zampetakis 
> > > Sent: Wednesday, December 7, 2022 2:51 PM
> > > To: 

Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-21 Thread Pravin Sinha
Congratulations, Ayush ! Well deserved.

-Pravin

On Wed, Dec 21, 2022 at 10:18 AM Kirti Ruge  wrote:

> Congratulations Ayush.
>
> On Wed, 21 Dec 2022 at 12:15 AM, Chris Nauroth 
> wrote:
>
>> Congratulations, Ayush!
>>
>> Chris Nauroth
>>
>>
>> On Tue, Dec 20, 2022 at 10:02 AM Sai Hemanth Gantasala <
>> saihema...@cloudera.com> wrote:
>>
>> > Congratulations Ayush, Very well deserved!!.
>> >
>> > On Mon, Dec 19, 2022 at 5:12 PM Naveen Gangam 
>> > wrote:
>> >
>> >> Hello Hive Community,
>> >> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted
>> the
>> >> Apache Hive PMC's invitation to become PMC Member, and is now our
>> newest
>> >> PMC member. Many thanks to Ayush for all the contributions he has made
>> and
>> >> looking forward to many more future contributions in the expanded role.
>> >>
>> >> Please join me in congratulating Ayush !!!
>> >>
>> >> Cheers,
>> >> Naveen (on behalf of Hive PMC)
>> >>
>> >>
>> >
>>
>


[jira] [Created] (HIVE-26883) Revert "HIVE-21872 Bucketed tables that load data from data/files/auto_sortmerge_join should be tagged as 'bucketing_version'='1'"

2022-12-21 Thread Aman Raj (Jira)
Aman Raj created HIVE-26883:
---

 Summary: Revert "HIVE-21872 Bucketed tables that load data from 
data/files/auto_sortmerge_join should be tagged as 'bucketing_version'='1'"
 Key: HIVE-26883
 URL: https://issues.apache.org/jira/browse/HIVE-26883
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


Tests are failing with the following errors:
Client Execution succeeded but contained differences (error code = 1) after 
executing auto_sortmerge_join_12.q 
3d2
< TBLPROPERTIES('bucketing_version'='1')
9d7
< TBLPROPERTIES('bucketing_version'='1')
31d28
< TBLPROPERTIES('bucketing_version'='1')
36d32
< TBLPROPERTIES('bucketing_version'='1')
108d103
< TBLPROPERTIES('bucketing_version'='1')
114d108
< TBLPROPERTIES('bucketing_version'='1')
249c243
< bucketing_version 1
---
> bucketing_version 2
299c293
< bucketing_version 1
---
> bucketing_version 2
378c372
< bucketing_version 1
---
> bucketing_version 2
456c450
< bucketing_version 1
---
> bucketing_version 2
 
 
Since this commit was already reverted in branch-3.1, we can revert it in 
branch-3 to make the branch 3.1-compatible.





[jira] [Created] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2022-12-21 Thread Peter Vary (Jira)
Peter Vary created HIVE-26882:
-

 Summary: Allow transactional check of Table parameter before 
altering the Table
 Key: HIVE-26882
 URL: https://issues.apache.org/jira/browse/HIVE-26882
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Peter Vary


We should add the ability to transactionally check whether a Table parameter 
has changed before altering the table in the HMS.

This would provide an alternative, less error-prone and faster way to commit an 
Iceberg table, as an Iceberg commit currently needs to:
- Create an exclusive lock
- Get the table metadata to check if the current snapshot is not changed
- Update the table metadata
- Release the lock

After the change, these four HMS calls could be replaced with a single alter 
table call. We could also avoid cases where locks are left hanging by failed 
processes.
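The proposed check is essentially a compare-and-set on a table parameter. A sketch of the idea with an in-memory map standing in for HMS table state (parameter values are illustrative, and in the HMS the comparison and update would run inside one metastore transaction rather than on a plain map):

```java
import java.util.HashMap;
import java.util.Map;

public class AlterTableCas {
    // Alter succeeds only if the watched parameter still holds the value the
    // caller last observed; otherwise another writer committed in between.
    static boolean alterIfParamUnchanged(Map<String, String> tableParams,
                                         String key, String expected,
                                         Map<String, String> newParams) {
        if (!java.util.Objects.equals(tableParams.get(key), expected)) {
            return false; // conflicting commit detected, caller must retry
        }
        tableParams.putAll(newParams);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("metadata_location", "s3://bucket/meta/v1.json");

        Map<String, String> update = new HashMap<>();
        update.put("metadata_location", "s3://bucket/meta/v2.json");

        // First commit wins; replaying the same expected value then fails.
        System.out.println(alterIfParamUnchanged(params, "metadata_location",
                "s3://bucket/meta/v1.json", update)); // true
        System.out.println(alterIfParamUnchanged(params, "metadata_location",
                "s3://bucket/meta/v1.json", update)); // false
    }
}
```

This replaces the lock/read/update/unlock sequence with one conditional call, and a crashed client leaves no lock behind.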





Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread Kirti Ruge
Congratulations Ayush.

On Wed, 21 Dec 2022 at 12:15 AM, Chris Nauroth  wrote:

> Congratulations, Ayush!
>
> Chris Nauroth
>
>
> On Tue, Dec 20, 2022 at 10:02 AM Sai Hemanth Gantasala <
> saihema...@cloudera.com> wrote:
>
> > Congratulations Ayush, Very well deserved!!.
> >
> > On Mon, Dec 19, 2022 at 5:12 PM Naveen Gangam 
> > wrote:
> >
> >> Hello Hive Community,
> >> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted
> the
> >> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> >> PMC member. Many thanks to Ayush for all the contributions he has made
> and
> >> looking forward to many more future contributions in the expanded role.
> >>
> >> Please join me in congratulating Ayush !!!
> >>
> >> Cheers,
> >> Naveen (on behalf of Hive PMC)
> >>
> >>
> >
>


[jira] [Created] (HIVE-26881) tblproperties are not cleaned from jobConf for TezTask

2022-12-20 Thread Yi Zhang (Jira)
Yi Zhang created HIVE-26881:
---

 Summary: tblproperties are not cleaned from jobConf for TezTask
 Key: HIVE-26881
 URL: https://issues.apache.org/jira/browse/HIVE-26881
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.3
Reporter: Yi Zhang


Users use xmlserde to read XML files; when they create a table they specify 
tblproperties, and the input format reads them from jobConf:

[https://github.com/VenkataNU/hivexmlserde/blob/main/src/main/java/com/ibm/spss/hive/serde2/xml/XmlInputFormat.java#L72]

In Tez mode, after the first query runs, those tblproperties become part of 
Driver.conf, so when the next query runs, the Tez job's jobConf contains the 
tblproperties from the previous run. This causes wrong results, since the next 
query should not use the last query's tblproperties.
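A minimal model of the leak, with plain maps standing in for HiveConf/JobConf ("xmlinput.start" is the start-tag property used by hivexmlserde; the helper name is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class JobConfLeak {
    // Query 2's Tez job conf is derived from the shared driver conf, so any
    // tblproperty query 1 copied in is inherited unless it is cleaned first.
    static Map<String, String> buildTezJobConf(Map<String, String> driverConf) {
        return new HashMap<>(driverConf);
    }

    public static void main(String[] args) {
        Map<String, String> driverConf = new HashMap<>();

        // Query 1 over an xmlserde table puts its tblproperties into the conf.
        driverConf.put("xmlinput.start", "<row>");

        // Query 2 over a different table inherits the stale property.
        Map<String, String> tezJobConf = buildTezJobConf(driverConf);
        System.out.println(tezJobConf.get("xmlinput.start")); // <row>
    }
}
```

The fix would be to remove table-scoped properties from the driver conf (or rebuild the job conf) between queries.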





[jira] [Created] (HIVE-26880) Upgrade Apache Directory Server to 1.5.7 for release 3.2.

2022-12-20 Thread Chris Nauroth (Jira)
Chris Nauroth created HIVE-26880:


 Summary: Upgrade Apache Directory Server to 1.5.7 for release 3.2.
 Key: HIVE-26880
 URL: https://issues.apache.org/jira/browse/HIVE-26880
 Project: Hive
  Issue Type: Improvement
  Components: Test
Reporter: Chris Nauroth
Assignee: Chris Nauroth


branch-3 uses Apache Directory Server in some tests. It currently uses version 
1.5.6. This version has a transitive dependency to a SNAPSHOT, making it 
awkward to build and release. We can upgrade to 1.5.7 to remove the SNAPSHOT 
dependency.





[jira] [Created] (HIVE-26879) Backport of HIVE-23323 Add qsplits profile

2022-12-20 Thread Aman Raj (Jira)
Aman Raj created HIVE-26879:
---

 Summary: Backport of HIVE-23323 Add qsplits profile
 Key: HIVE-26879
 URL: https://issues.apache.org/jira/browse/HIVE-26879
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj








Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread Chris Nauroth
Congratulations, Ayush!

Chris Nauroth


On Tue, Dec 20, 2022 at 10:02 AM Sai Hemanth Gantasala <
saihema...@cloudera.com> wrote:

> Congratulations Ayush, Very well deserved!!.
>
> On Mon, Dec 19, 2022 at 5:12 PM Naveen Gangam 
> wrote:
>
>> Hello Hive Community,
>> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
>> Apache Hive PMC's invitation to become PMC Member, and is now our newest
>> PMC member. Many thanks to Ayush for all the contributions he has made and
>> looking forward to many more future contributions in the expanded role.
>>
>> Please join me in congratulating Ayush !!!
>>
>> Cheers,
>> Naveen (on behalf of Hive PMC)
>>
>>
>


Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread Sai Hemanth Gantasala
Congratulations Ayush, Very well deserved!!.

On Mon, Dec 19, 2022 at 5:12 PM Naveen Gangam  wrote:

> Hello Hive Community,
> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> PMC member. Many thanks to Ayush for all the contributions he has made and
> looking forward to many more future contributions in the expanded role.
>
> Please join me in congratulating Ayush !!!
>
> Cheers,
> Naveen (on behalf of Hive PMC)
>
>


[jira] [Created] (HIVE-26878) Disable TestJdbcWithMiniLlapVectorArrow in branch-3

2022-12-20 Thread Aman Raj (Jira)
Aman Raj created HIVE-26878:
---

 Summary: Disable TestJdbcWithMiniLlapVectorArrow in branch-3
 Key: HIVE-26878
 URL: https://issues.apache.org/jira/browse/HIVE-26878
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


The following issue comes while running the test :
h4. Error
Failed during createSources processLine with code=1
h4. Stacktrace
java.lang.AssertionError: Failed during createSources processLine with code=1
 at org.junit.Assert.fail(Assert.java:88)
 at org.apache.hadoop.hive.ql.QTestUtil.initFromScript(QTestUtil.java:1210)
 at org.apache.hadoop.hive.ql.QTestUtil.createSources(QTestUtil.java:1192)
 at org.apache.hadoop.hive.ql.QTestUtil.createSources(QTestUtil.java:1179)
 at 
org.apache.hadoop.hive.cli.control.CoreCliDriver$3.invokeInternal(CoreCliDriver.java:83)
 at 
org.apache.hadoop.hive.cli.control.CoreCliDriver$3.invokeInternal(CoreCliDriver.java:80)
 at 
org.apache.hadoop.hive.util.ElapsedTimeLoggingWrapper.invoke(ElapsedTimeLoggingWrapper.java:33)
 at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.beforeClass(CoreCliDriver.java:86)
 at 
org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:71)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
 at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
 at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
 at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
 at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
 
This test is ignored in Hive master, so we are backporting the same fix. Commit ID: 
77187b3e





[jira] [Created] (HIVE-26877) Parquet CTAS with JOIN on decimals with different precision/scale fail

2022-12-20 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-26877:
--

 Summary: Parquet CTAS with JOIN on decimals with different 
precision/scale fail
 Key: HIVE-26877
 URL: https://issues.apache.org/jira/browse/HIVE-26877
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0-alpha-2
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
 Attachments: ctas_parquet_join.q

Creating a Parquet table using CREATE TABLE AS SELECT (CTAS) syntax leads to a 
runtime error when the SELECT statement joins decimal columns with different 
precision/scale.

Steps to reproduce:
{code:sql}
CREATE TABLE table_a (col_dec decimal(5,0));
CREATE TABLE table_b(col_dec decimal(38,10));

INSERT INTO table_a VALUES (1);
INSERT INTO table_b VALUES (1.00);

set hive.default.fileformat=parquet;

create table target as
select table_a.col_dec
from table_a
left outer join table_b on
table_a.col_dec = table_b.col_dec;
{code}
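The "Fixed Binary size 16 does not match field type length 3" message in the stacktrace below matches how Parquet sizes decimals: a decimal of precision p is stored in the smallest fixed_len_byte_array that can hold its signed unscaled value, which works out to 3 bytes for decimal(5,0) and 16 bytes for decimal(38,10). A sketch of that sizing rule (the formula mirrors the one commonly used by Parquet writers; treat it as illustrative):

```java
public class ParquetDecimalBytes {
    // Minimum fixed_len_byte_array size able to hold a signed unscaled
    // decimal of the given precision: bits for 10^p - 1, plus a sign bit,
    // rounded up to whole bytes.
    static int bytesForPrecision(int precision) {
        double bits = Math.log(Math.pow(10, precision) - 1) / Math.log(2) + 1;
        return (int) Math.ceil(bits / 8);
    }

    public static void main(String[] args) {
        System.out.println(bytesForPrecision(5));  // 3  -> decimal(5,0)
        System.out.println(bytesForPrecision(38)); // 16 -> decimal(38,10)
    }
}
```

So the writer is handed a 16-byte value (the join result widened to decimal(38,10)) for a column declared with a 3-byte fixed length, which is exactly the mismatch the error reports.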

Stacktrace:

{noformat}
2022-12-20T07:02:52,237  INFO [2dfbd95a-7553-467b-b9d0-629100785502 Listener at 
0.0.0.0/46609] reexec.ReExecuteLostAMQueryPlugin: Got exception message: Vertex 
failed, vertexName=Reducer 2, vertexId=vertex_1671548565336_0001_3_02, 
diagnostics=[Task failed, taskId=task_1671548565336_0001_3_02_00, 
diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
failure ) : 
attempt_1671548565336_0001_3_02_00_0:java.lang.RuntimeException: 
java.lang.RuntimeException: Hive Runtime Error while closing operators: Fixed 
Binary size 16 does not match field type length 3
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: Fixed Binary size 16 does not match field type length 3
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:379)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:310)
... 15 more
Caused by: java.lang.IllegalArgumentException: Fixed Binary size 16 does not match field type length 3
at org.apache.parquet.column.values.plain.FixedLenByteArrayPlainValuesWriter.writeBytes(FixedLenByteArrayPlainValuesWriter.java:56)
at org.apache.parquet.column.impl.ColumnWriterBase.write(ColumnWriterBase.java:174)
at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.addBinary(MessageColumnIO.java:476)
at org.apache.parquet.io.RecordConsumerLoggingWrapper.addBinary(RecordConsumerLoggingWrapper.java:116)
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter$DecimalDataWriter.write(DataWritableWriter.java:571)
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter$GroupDataWriter.write(DataWritableWriter.java:228)
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter$MessageDataWriter.write(DataWritableWriter.java:251)
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:115)
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:76)
at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:35)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:128)
at org.apache.parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:182)
   
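For context on the "Fixed Binary size 16 does not match field type length 3" error: Parquet stores decimals in a FIXED_LEN_BYTE_ARRAY whose length is fixed by the column's declared precision, and the plain writer rejects values whose encoded size differs from that length. The message therefore suggests a 16-byte value (e.g. decimal precision 38) being written into a column declared with a 3-byte array (precision at most 6). A quick sketch of the standard size rule (my own illustration, not Hive code):

```python
def parquet_decimal_bytes(precision):
    """Smallest FIXED_LEN_BYTE_ARRAY length whose signed two's-complement
    range covers 10**precision - 1 (the Parquet decimal size rule)."""
    n = 1
    while (1 << (8 * n - 1)) - 1 < 10 ** precision - 1:
        n += 1
    return n

# A decimal with precision 38 needs 16 bytes; a column declared with
# length 3 can hold at most precision 6.
assert parquet_decimal_bytes(38) == 16
assert parquet_decimal_bytes(6) == 3
```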

Re: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread László Bodor
this is super cool and very well deserved! :) nowadays, in the case of a
changing/evolving community like Hive it's crucial to have folks
actively contributing, being responsive on the mailing list, etc... Ayush is
an example to be followed, congratulations !!

Butao Zhang wrote (on Tue, Dec 20, 2022, 11:54):

> Congratulations Ayush !
>
>
>  Replied Message 
> | From | Jan Fili |
> | Date | 12/20/2022 18:38 |
> | To |  |
> | Subject | Re: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena |
> gratz!
>
> On Tue, Dec 20, 2022 at 11:09, Aasha <
> aasha.medhi2...@gmail.com> wrote:
>
> Congratulations Ayush. Very well deserved
>
> On 20-Dec-2022, at 3:36 PM, Pau Tallada  wrote:
>
> 
> Congratulations!
>
> Message from Mayank Singh on Tue, Dec 20,
> 2022 at 9:01:
>
> Congratulations Ayush
>
> On Tue, 20 Dec, 2022, 1:19 pm Stamatis Zampetakis, 
> wrote:
>
> Congrats Ayush! Very well deserved!
>
> Thanks for all the hard work that you are putting for the project and
> always being there when people ask for help.
>
> Best,
> Stamatis
>
> On Tue, Dec 20, 2022 at 7:51 AM Sankar Hariappan via user <
> u...@hive.apache.org> wrote:
>
> Congrats Ayush!
>
>
>
> Thanks,
>
> Sankar
>
>
>
> From: Simhadri G 
> Sent: Tuesday, December 20, 2022 12:16 PM
> To: u...@hive.apache.org
> Cc: dev ; ayushsax...@apache.org
> Subject: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena
>
>
>
> Congratulations Ayush
>
>
>
> On Tue, 20 Dec 2022, 06:42 Naveen Gangam,  wrote:
>
> Hello Hive Community,
>
> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> PMC member. Many thanks to Ayush for all the contributions he has made and
> looking forward to many more future contributions in the expanded role.
>
>
>
> Please join me in congratulating Ayush !!!
>
>
>
> Cheers,
>
> Naveen (on behalf of Hive PMC)
>
>
>
>
>
> --
> --
> Pau Tallada Crespí
> Departament de Serveis
> Port d'Informació Científica (PIC)
> Tel: +34 93 170 2729
> --
>
>


Re: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread Butao Zhang
Congratulations Ayush !


 Replied Message 
| From | Jan Fili |
| Date | 12/20/2022 18:38 |
| To |  |
| Subject | Re: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena |
gratz!

On Tue, Dec 20, 2022 at 11:09, Aasha  wrote:

Congratulations Ayush. Very well deserved

On 20-Dec-2022, at 3:36 PM, Pau Tallada  wrote:


Congratulations!

Message from Mayank Singh on Tue, Dec 20, 2022 at 9:01:

Congratulations Ayush

On Tue, 20 Dec, 2022, 1:19 pm Stamatis Zampetakis,  wrote:

Congrats Ayush! Very well deserved!

Thanks for all the hard work that you are putting for the project and always 
being there when people ask for help.

Best,
Stamatis

On Tue, Dec 20, 2022 at 7:51 AM Sankar Hariappan via user 
 wrote:

Congrats Ayush!



Thanks,

Sankar



From: Simhadri G 
Sent: Tuesday, December 20, 2022 12:16 PM
To: u...@hive.apache.org
Cc: dev ; ayushsax...@apache.org
Subject: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena



Congratulations Ayush



On Tue, 20 Dec 2022, 06:42 Naveen Gangam,  wrote:

Hello Hive Community,

Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the 
Apache Hive PMC's invitation to become PMC Member, and is now our newest PMC 
member. Many thanks to Ayush for all the contributions he has made and looking 
forward to many more future contributions in the expanded role.



Please join me in congratulating Ayush !!!



Cheers,

Naveen (on behalf of Hive PMC)





--
--
Pau Tallada Crespí
Departament de Serveis
Port d'Informació Científica (PIC)
Tel: +34 93 170 2729
--



Re: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-20 Thread Mayank Singh
Congratulations Ayush

On Tue, 20 Dec, 2022, 1:19 pm Stamatis Zampetakis, 
wrote:

> Congrats Ayush! Very well deserved!
>
> Thanks for all the hard work that you are putting for the project and
> always being there when people ask for help.
>
> Best,
> Stamatis
>
> On Tue, Dec 20, 2022 at 7:51 AM Sankar Hariappan via user <
> u...@hive.apache.org> wrote:
>
>> Congrats Ayush!
>>
>>
>>
>> Thanks,
>>
>> Sankar
>>
>>
>>
>> *From:* Simhadri G 
>> *Sent:* Tuesday, December 20, 2022 12:16 PM
>> *To:* u...@hive.apache.org
>> *Cc:* dev ; ayushsax...@apache.org
>> *Subject:* [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena
>>
>>
>>
>> Congratulations Ayush
>>
>>
>>
>> On Tue, 20 Dec 2022, 06:42 Naveen Gangam,  wrote:
>>
>> Hello Hive Community,
>>
>> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
>> Apache Hive PMC's invitation to become PMC Member, and is now our newest
>> PMC member. Many thanks to Ayush for all the contributions he has made and
>> looking forward to many more future contributions in the expanded role.
>>
>>
>>
>> Please join me in congratulating Ayush !!!
>>
>>
>>
>> Cheers,
>>
>> Naveen (on behalf of Hive PMC)
>>
>>
>>
>>


Re: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-19 Thread Stamatis Zampetakis
Congrats Ayush! Very well deserved!

Thanks for all the hard work that you are putting for the project and
always being there when people ask for help.

Best,
Stamatis

On Tue, Dec 20, 2022 at 7:51 AM Sankar Hariappan via user <
u...@hive.apache.org> wrote:

> Congrats Ayush!
>
>
>
> Thanks,
>
> Sankar
>
>
>
> *From:* Simhadri G 
> *Sent:* Tuesday, December 20, 2022 12:16 PM
> *To:* u...@hive.apache.org
> *Cc:* dev ; ayushsax...@apache.org
> *Subject:* [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena
>
>
>
> Congratulations Ayush
>
>
>
> On Tue, 20 Dec 2022, 06:42 Naveen Gangam,  wrote:
>
> Hello Hive Community,
>
> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> PMC member. Many thanks to Ayush for all the contributions he has made and
> looking forward to many more future contributions in the expanded role.
>
>
>
> Please join me in congratulating Ayush !!!
>
>
>
> Cheers,
>
> Naveen (on behalf of Hive PMC)
>
>
>
>


[jira] [Created] (HIVE-26876) Backport of BUG-108109 Disable flaky Spark Tests in branch-3

2022-12-19 Thread Aman Raj (Jira)
Aman Raj created HIVE-26876:
---

 Summary: Backport of BUG-108109 Disable flaky Spark Tests in 
branch-3
 Key: HIVE-26876
 URL: https://issues.apache.org/jira/browse/HIVE-26876
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


BUG-108109 : Disable flaky spark Tests

Change-Id: I3adf35d6b4d1ee64ad79fe418a8cc57a6b9a4e6c

 

Commit Id : 2b578be2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


RE: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-19 Thread Sankar Hariappan
Congrats Ayush!

Thanks,
Sankar

From: Simhadri G 
Sent: Tuesday, December 20, 2022 12:16 PM
To: u...@hive.apache.org
Cc: dev ; ayushsax...@apache.org
Subject: [EXTERNAL] Re: [ANNOUNCE] New PMC Member: Ayush Saxena

Congratulations Ayush

On Tue, 20 Dec 2022, 06:42 Naveen Gangam, 
mailto:ngan...@cloudera.com>> wrote:
Hello Hive Community,
Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the 
Apache Hive PMC's invitation to become PMC Member, and is now our newest PMC 
member. Many thanks to Ayush for all the contributions he has made and looking 
forward to many more future contributions in the expanded role.

Please join me in congratulating Ayush !!!

Cheers,
Naveen (on behalf of Hive PMC)



Lock branch-3 in order for PR build to run successfully.

2022-12-19 Thread Aman Raj
Hi community,

I see a couple of commits that went in directly to branch-3 before the Jenkins
pipeline for branch-3 was set up. To prevent this, can we lock branch-3 so that
pull requests become the only way to merge commits into it?

Can someone help me lock branch-3 so that we have a clean release process? I do
not have access to do it.

Thanks,
Aman.

From: Aman Raj 
Sent: Friday, December 9, 2022 9:33 AM
To: dev@hive.apache.org 
Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0 pipeline

Thanks Pravin for your support. Can someone please help me merge this PR to
branch-3: HIVE-26816: Add Jenkins file for branch-3 by amanraj2520, Pull
Request #3841, apache/hive (github.com)? I do not have access to do that. Then
we will start development on it.

Thanks,
Aman.



From: Pravin Sinha 
Sent: Friday, December 9, 2022 1:55 AM
To: dev@hive.apache.org 
Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0 pipeline


Hi Aman,
 I also think that we can merge the PR to enable the test pipeline if the
change looks fine, and subsequently we can fix the tests to bring the branch
to a green state (hopefully by cherry-picking a few commits from branch-3.1,
which is already green). It looks like the tests are currently broken in
branch-3.

Thanks,
Pravin

On Thu, Dec 8, 2022 at 3:59 PM Aman Raj 
wrote:

> Hi team,
>
> For the addition of Jenkins file for branch-3, branch-3 has some existing
> tests failing which was because Jenkins was not running on branch-3. We are
> planning to merge this Jenkins file irrespective of this PR having test
> failures, since this does not change the code. We will create separate
> tasks for ensuring that branch-3 has a green build.
>
> Link to the PR : 
> https://github.com/apache/hive/pull/3841
>
> Fyi, branch-3.1 has a green build.
>
> Thanks,
> Aman.
> 
> From: Aman Raj 
> Sent: Wednesday, December 7, 2022 3:19 PM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> pipeline
>
> Hi Ayush,
>
> Thanks for clarifying. Will wait for it to turn green.
>
> Thanks,
> Aman.
> 
> From: Ayush Saxena 
> Sent: Wednesday, December 7, 2022 3:11 PM
> To: dev@hive.apache.org 
> Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> pipeline
>
> Hi Aman,
> The build is already running for your PR:
>
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3841/1/pipeline
>
> The JenkinsFile is picked from the PR while running rather than the target
> branch.
>
> -Ayush
>
> > On 07-Dec-2022, at 3:03 PM, Aman Raj 
> wrote:
> >
> > Hi Stamatis,
> >
> > How can we ensure that unless the PR is merged. Please suggest.
> > I was thinking of merging this and raising a sample PR on branch-3 to
> check whether it works or not. Is there some other way?
> >
> > Thanks,
> > Aman.
> > 
> > From: Stamatis Zampetakis 
> > Sent: Wednesday, December 7, 2022 2:51 PM
> > To: dev@hive.apache.org 
> > Subject: Re: [EXTERNAL] Re: Sync of Branch-3 & Branch-3.1 for 3.2.0
> pipeline
> >
> > Hey Aman,
> >
> > Before checking in the PR we should ensure that it works as expected;
> i.e.,
> > having a green run in a reasonable time.
> >
> > Best,
> > Stamatis
> >
> >> On Wed, Dec 7, 2022 at 9:29 AM Aman Raj 
> >> wrote:
> >>
> >> Hi Stamatis,
> >>
> >> I have raised a Pull Request for the same -
> >>
> 

Re: [ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-19 Thread Simhadri G
Congratulations Ayush

On Tue, 20 Dec 2022, 06:42 Naveen Gangam,  wrote:

> Hello Hive Community,
> Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
> Apache Hive PMC's invitation to become PMC Member, and is now our newest
> PMC member. Many thanks to Ayush for all the contributions he has made and
> looking forward to many more future contributions in the expanded role.
>
> Please join me in congratulating Ayush !!!
>
> Cheers,
> Naveen (on behalf of Hive PMC)
>
>


[ANNOUNCE] New PMC Member: Ayush Saxena

2022-12-19 Thread Naveen Gangam
Hello Hive Community,
Apache Hive PMC is pleased to announce that Ayush Saxena has accepted the
Apache Hive PMC's invitation to become PMC Member, and is now our newest
PMC member. Many thanks to Ayush for all the contributions he has made and
looking forward to many more future contributions in the expanded role.

Please join me in congratulating Ayush !!!

Cheers,
Naveen (on behalf of Hive PMC)


[jira] [Created] (HIVE-26875) Transaction conflict retry loop only executes once

2022-12-19 Thread John Sherman (Jira)
John Sherman created HIVE-26875:
---

 Summary: Transaction conflict retry loop only executes once
 Key: HIVE-26875
 URL: https://issues.apache.org/jira/browse/HIVE-26875
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: John Sherman
Assignee: John Sherman


Currently the "conflict retry loop" only executes once.
[https://github.com/apache/hive/blob/ab4c53de82d4aaa33706510441167f2df55df15e/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L264]

The intent of this loop is to detect whether a conflicting transaction committed 
while we were waiting to acquire locks. If there is a conflicting transaction, 
it invalidates the snapshot, rolls back the transaction, opens a new 
transaction, and tries to re-acquire locks (and then recompile). It then checks 
again whether a conflicting transaction has committed and, if so, repeats the 
above steps, up to HIVE_TXN_MAX_RETRYSNAPSHOT_COUNT times.

However - isValidTxnState relies on getNonSharedLockedTable():
[https://github.com/apache/hive/blob/ab4c53de82d4aaa33706510441167f2df55df15e/ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java#L422]
which does:
{code:java}
  private Set getNonSharedLockedTables() {
if (CollectionUtils.isEmpty(driver.getContext().getHiveLocks())) {
  return Collections.emptySet(); // Nothing to check
}{code}
getHiveLocks gets populated by lockAndRespond. HOWEVER, 
compileInternal ends up calling compile, which calls preparForCompile, 
which calls prepareContext, which destroys the context information that 
lockAndRespond populated. So when the loop executes after all of this, it will 
never detect a second conflict, because isValidTxnState will always return true 
(it thinks there are no locked objects).

This manifests as duplicate records being created during concurrent UPDATEs if 
a transaction gets conflicted twice.
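The failure mode can be modeled in a few lines. This is an illustrative Python sketch (not Hive's actual Java code; all names are simplified stand-ins) showing why the conflict check passes vacuously once the lock list is wiped:

```python
class Context:
    """Minimal stand-in for Hive's driver context."""
    def __init__(self):
        self.hive_locks = []

def lock_and_respond(ctx):
    # Acquiring locks records which tables we hold non-shared locks on.
    ctx.hive_locks = ["target_table"]

def prepare_context(ctx):
    # Recompilation rebuilds the context, dropping the lock info
    # (the bug described above).
    ctx.hive_locks = []

def is_valid_txn_state(ctx, conflicting_tables):
    if not ctx.hive_locks:
        return True  # "nothing to check" -- vacuously valid
    return not any(t in conflicting_tables for t in ctx.hive_locks)

ctx = Context()
lock_and_respond(ctx)
# First pass: the conflict is detected, so the loop retries.
assert not is_valid_txn_state(ctx, {"target_table"})
prepare_context(ctx)  # the retry path recompiles, wiping the locks
# Second pass: the same conflict is now invisible, so the loop exits.
assert is_valid_txn_state(ctx, {"target_table"})
```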



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26874) Iceberg: Positional delete files are not cached

2022-12-19 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-26874:
---

 Summary: Iceberg: Positional delete files are not cached 
 Key: HIVE-26874
 URL: https://issues.apache.org/jira/browse/HIVE-26874
 Project: Hive
  Issue Type: Improvement
Reporter: Rajesh Balamohan


With Iceberg v2 (MOR mode), "positional delete" files are not cached, causing 
runtime delays.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26873) Whitelist iceberg configs for sql std authorization

2022-12-19 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-26873:
-

 Summary: Whitelist iceberg configs for sql std authorization 
 Key: HIVE-26873
 URL: https://issues.apache.org/jira/browse/HIVE-26873
 Project: Hive
  Issue Type: Task
Reporter: Denys Kuzmenko






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26872) INSERT UNION with LATERAL VIEW does not produce result

2022-12-19 Thread FangBO (Jira)
FangBO created HIVE-26872:
-

 Summary: INSERT UNION with LATERAL VIEW does not produce result
 Key: HIVE-26872
 URL: https://issues.apache.org/jira/browse/HIVE-26872
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 3.1.2, 2.3.9
Reporter: FangBO


{code:java}
DROP TABLE union_test;
CREATE TABLE union_test(id INT) PARTITIONED BY (`dt` STRING);

DROP TABLE json_src;
CREATE TABLE json_src(message STRING) PARTITIONED BY (`dt` STRING);

INSERT OVERWRITE TABLE json_src PARTITION(dt='1219') VALUES('{"id":1}');
INSERT OVERWRITE TABLE json_src PARTITION(dt='1220') VALUES('{"id":2}');


INSERT OVERWRITE TABLE union_test PARTITION (dt='1221')
SELECT id FROM json_src LATERAL VIEW json_tuple(message, 'id') b AS id WHERE dt='1219'
UNION ALL
SELECT id FROM json_src LATERAL VIEW json_tuple(message, 'id') b AS id WHERE dt='1220'
; {code}
The script above does not produce data in partition dt='1221'



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26871) TestCrudCompactorOnTez is flaky after HIVE-26479

2022-12-18 Thread Sourabh Badhya (Jira)
Sourabh Badhya created HIVE-26871:
-

 Summary: TestCrudCompactorOnTez is flaky after HIVE-26479
 Key: HIVE-26871
 URL: https://issues.apache.org/jira/browse/HIVE-26871
 Project: Hive
  Issue Type: Bug
  Components: Test
Reporter: Sourabh Badhya
Assignee: Sourabh Badhya


The three tests in TestCrudCompactorOnTez that use the ProtoLoggingHook run at 
different times. As described in the logs, they ran at the following times - 
Test 1 - 
{code:java}
2022-12-15T16:57:44,294  INFO [main] compactor.TestCrudCompactorOnTez: Current 
time: 2022-12-15T23:57:44 {code}
Test 2 - 
{code:java}
2022-12-15T17:00:32,452  INFO [main] compactor.TestCrudCompactorOnTez: Current 
time: 2022-12-16T00:00:32 {code}
Test 3 - 
{code:java}
2022-12-15T17:04:12,895  INFO [main] compactor.TestCrudCompactorOnTez: Current 
time: 2022-12-16T00:04:12 {code}
As we can see, the tests run on two different dates. HiveProtoLoggingHook 
therefore generates a separate event-log file for each unique date. This is the 
behaviour of HiveProtoLoggingHook.
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/HiveProtoLoggingHook.java#L296-L310]

However, the expectation on the test side, when generating the log readers, is 
that there is a single file in the defined log folder.
[https://github.com/apache/hive/blob/master/ql/src/test/org/apache/hadoop/hive/ql/hooks/TestHiveProtoLoggingHook.java#L310]
 

Unfortunately, since two files are generated (as the logs above show), the 
following tests fail - 
{code:java}
2022-12-15T17:04:14,837  INFO [main] hooks.TestHiveProtoLoggingHook: List of 
paths: 
2022-12-15T17:04:14,837  INFO [main] hooks.TestHiveProtoLoggingHook: 
file:/home/jenkins/agent/workspace/internal-hive-flaky-check/itests/hive-unit/target/tmp/junit441259831997042392/junit3438435196942546140/date=2022-12-15
2022-12-15T17:04:14,837  INFO [main] hooks.TestHiveProtoLoggingHook: 
file:/home/jenkins/agent/workspace/internal-hive-flaky-check/itests/hive-unit/target/tmp/junit441259831997042392/junit3438435196942546140/date=2022-12-16
 {code}
The solution is to make _getTestReader()_ in _TestHiveProtoLoggingHook_ 
compatible with the multiple-event-log-file scenario: generate readers for all 
files present in the folder instead of fixating on a single file.
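That fix can be sketched briefly. An illustrative Python version (not the actual Java test code; the function name, the `open_reader` hook, and the directory layout are assumptions) that walks the log folder and opens one reader per event-log file:

```python
import os

def get_test_readers(base_dir, open_reader=open):
    """Build one reader per event-log file found anywhere under base_dir
    (e.g. one per date=YYYY-MM-DD directory), instead of asserting that
    the folder contains exactly one file."""
    paths = sorted(
        os.path.join(root, name)
        for root, _dirs, files in os.walk(base_dir)
        for name in files
    )
    return [open_reader(p) for p in paths]
```

With two `date=...` directories, as in the failing run, this yields two readers rather than tripping a single-file assertion.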



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26870) Backport of HIVE-25266: Fix TestWarehouseExternalDir

2022-12-18 Thread Aman Raj (Jira)
Aman Raj created HIVE-26870:
---

 Summary: Backport of HIVE-25266: Fix TestWarehouseExternalDir
 Key: HIVE-26870
 URL: https://issues.apache.org/jira/browse/HIVE-26870
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26869) Backport of HIVE-25250 Branch-3: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation

2022-12-18 Thread Aman Raj (Jira)
Aman Raj created HIVE-26869:
---

 Summary: Backport of HIVE-25250 Branch-3: Fix 
TestHS2ImpersonationWithRemoteMS.testImpersonation
 Key: HIVE-26869
 URL: https://issues.apache.org/jira/browse/HIVE-26869
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26868) Iceberg: Allow IOW on empty table with Partition Evolution

2022-12-17 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-26868:
---

 Summary: Iceberg: Allow IOW on empty table with Partition Evolution
 Key: HIVE-26868
 URL: https://issues.apache.org/jira/browse/HIVE-26868
 Project: Hive
  Issue Type: Improvement
Reporter: Ayush Saxena
Assignee: Ayush Saxena


If an Iceberg table has gone through partition evolution, we don't allow an IOW 
(insert overwrite) operation on it.

But if the table is empty, we can allow an IOW, since there is no data that the 
overwrite could corrupt.

This helps to compact data and merge the delete files into data files

via

Truncate -> IOW with Snapshot ID before Truncate.

The same flow is used by Impala for compacting Iceberg tables.
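The proposed relaxation reduces to a small predicate. An illustrative Python sketch (the function name and arguments are my own, not Hive or Iceberg API):

```python
def iow_allowed(has_partition_evolution, record_count):
    # Block insert-overwrite after partition evolution unless the table is
    # empty: with zero records, no existing data can be corrupted.
    return (not has_partition_evolution) or record_count == 0
```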



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26867) Backport of HIVE-24816

2022-12-17 Thread Aman Raj (Jira)
Aman Raj created HIVE-26867:
---

 Summary: Backport of HIVE-24816
 Key: HIVE-26867
 URL: https://issues.apache.org/jira/browse/HIVE-26867
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


This commit is needed for upgrading the Jackson version and also fixes flaky tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26866) Fix TestOperationLoggingLayout test failures in branch-3

2022-12-16 Thread Aman Raj (Jira)
Aman Raj created HIVE-26866:
---

 Summary: Fix TestOperationLoggingLayout test failures in branch-3
 Key: HIVE-26866
 URL: https://issues.apache.org/jira/browse/HIVE-26866
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


The tests are failing due to the following error:
h4. Error
appenders
h4. Stacktrace
java.lang.NoSuchFieldException: appenders
 at java.lang.Class.getDeclaredField(Class.java:2070)
 at 
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.checkAppenderState(TestOperationLoggingLayout.java:191)
 at 
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testHushableRandomAccessFileAppender(TestOperationLoggingLayout.java:167)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
 at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
 at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
 at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
 at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26865) Fix TestSQL11ReservedKeyWordsNegative test in branch-3

2022-12-16 Thread Aman Raj (Jira)
Aman Raj created HIVE-26865:
---

 Summary: Fix TestSQL11ReservedKeyWordsNegative test in branch-3
 Key: HIVE-26865
 URL: https://issues.apache.org/jira/browse/HIVE-26865
 Project: Hive
  Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj


Due to HIVE-21293 (Fix ambiguity in grammar warnings at compilation time (II)), 
the test cases are failing with the following error:

java.lang.AssertionError: Expected ParseException
        at org.junit.Assert.fail(Assert.java:88)
        at 
org.apache.hadoop.hive.ql.parse.TestSQL11ReservedKeyWordsNegative$TestSQL11ReservedKeyWordsNegativeParametrized.testNegative(TestSQL11ReservedKeyWordsNegative.java:105)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at org.junit.runners.Suite.runChild(Suite.java:127)
        at org.junit.runners.Suite.runChild(Suite.java:26)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at org.junit.runners.Suite.runChild(Suite.java:127)
        at org.junit.runners.Suite.runChild(Suite.java:26)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
        at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
        at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
        at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26864) Incremental rebuild of non-transactional materialized view fails

2022-12-16 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-26864:
-

 Summary: Incremental rebuild of non-transactional materialized view fails
 Key: HIVE-26864
 URL: https://issues.apache.org/jira/browse/HIVE-26864
 Project: Hive
  Issue Type: Bug
  Components: CBO, Materialized views
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa


{code}
create table t1 (a int, b int) stored as orc TBLPROPERTIES 
('transactional'='true');

insert into t1 values (1,1), (2,1), (3,3);

create materialized view mv1 as
select a, b from t1 where b = 1;

delete from t1 where a = 2;

explain
alter materialized view mv1 rebuild;
{code}

{code}
org.apache.hadoop.hive.ql.parse.SemanticException: Attempt to do update or 
delete on table mv1 that is not transactional
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2400)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2176)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:2168)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:630)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12790)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:464)
at 
org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer.analyzeInternal(AlterMaterializedViewRebuildAnalyzer.java:132)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:326)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:326)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:106)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:522)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:474)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:439)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:433)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257)
at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
at 
org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:727)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:697)
at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:114)
at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
at 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at 

[jira] [Created] (HIVE-26863) Fix test failures in branch-3.1 related to Arrow because of Jackson Version Upgrade

2022-12-15 Thread Raghav Aggarwal (Jira)
Raghav Aggarwal created HIVE-26863:
--

 Summary: Fix test failures in branch-3.1 related to Arrow because 
of Jackson Version Upgrade 
 Key: HIVE-26863
 URL: https://issues.apache.org/jira/browse/HIVE-26863
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.3
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal


Because of the _*Jackson*_ version upgrade to {_}2.12.0{_}, there are unit test 
failures related to {_}*arrow*{_}.

 

*Stack Trace:*

 
{code:java}
[ERROR] 
testMapDTI(org.apache.hadoop.hive.ql.io.arrow.TestArrowColumnarBatchSerDe)  
Time elapsed: 0.037 s  <<< ERROR!
java.lang.IllegalStateException: Cannot serialize array list to JSON string
at 
org.apache.arrow.vector.util.JsonStringArrayList.toString(JsonStringArrayList.java:47)
at java.lang.String.valueOf(String.java:2994)
at java.lang.StringBuilder.append(StringBuilder.java:137)
at 
org.apache.arrow.vector.VectorSchemaRoot.printRow(VectorSchemaRoot.java:128)
at 
org.apache.arrow.vector.VectorSchemaRoot.contentToTSVString(VectorSchemaRoot.java:145)
at 
org.apache.hadoop.hive.ql.io.arrow.TestArrowColumnarBatchSerDe.serializeAndDeserialize(TestArrowColumnarBatchSerDe.java:242)
at 
org.apache.hadoop.hive.ql.io.arrow.TestArrowColumnarBatchSerDe.initAndSerializeAndDeserialize(TestArrowColumnarBatchSerDe.java:204)
at 
org.apache.hadoop.hive.ql.io.arrow.TestArrowColumnarBatchSerDe.testMapDTI(TestArrowColumnarBatchSerDe.java:750)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Joda 
date/time type `org.joda.time.Period` not supported by default: add Module 
"com.fasterxml.jackson.datatype:jackson-datatype-joda" to enable handling 
(through reference chain: 
org.apache.arrow.vector.util.JsonStringArrayList[0]->org.apache.arrow.vector.util.JsonStringHashMap["values"])
at 
com.fasterxml.jackson.databind.exc.InvalidDefinitionException.from(InvalidDefinitionException.java:77)
at 
com.fasterxml.jackson.databind.SerializerProvider.reportBadDefinition(SerializerProvider.java:1276)
at 
com.fasterxml.jackson.databind.ser.impl.UnsupportedTypeSerializer.serialize(UnsupportedTypeSerializer.java:35)
at 
com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:808)
at 
com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeWithoutTypeInfo(MapSerializer.java:764)
at 
com.fasterxml.jackson.databind.ser.std.MapSerializer.serialize(MapSerializer.java:720)
at 

[jira] [Created] (HIVE-26862) IndexOutOfBoundsException occurred in stats task during dynamic partition table load when user data for partition column is case sensitive. And few rows are missed in the

2022-12-15 Thread Venugopal Reddy K (Jira)
Venugopal Reddy K created HIVE-26862:


 Summary: IndexOutOfBoundsException occurred in stats task during 
dynamic partition table load when user data for partition column is case 
sensitive. And few rows are missed in the partition as well.
 Key: HIVE-26862
 URL: https://issues.apache.org/jira/browse/HIVE-26862
 Project: Hive
  Issue Type: Bug
Reporter: Venugopal Reddy K
 Attachments: data, hive.log

*[Description]* 

java.lang.IndexOutOfBoundsException occurred in stats task during dynamic 
partition table load. This happens when user data for partition column is case 
sensitive. And few rows are missed in the partition as well.
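The core of the problem can be sketched outside Hive: if the loader groups rows by the partition column case-sensitively but a later step (stats, directory listing) normalizes the names, the two sides disagree on how many partitions exist. This is an illustrative Python sketch, not Hive's actual code.

```python
from collections import defaultdict

# The stage-table rows from the repro; note the uppercase "Vegetable" in row 7.
rows = [
    (1, "apple", "Fruit"), (2, "banana", "Fruit"), (3, "carrot", "vegetable"),
    (4, "cherry", "Fruit"), (5, "potato", "vegetable"), (6, "mango", "Fruit"),
    (7, "tomato", "Vegetable"),
]

# Case-sensitive grouping: "vegetable" and "Vegetable" are distinct keys,
# so the load produces three partitions.
exact = defaultdict(list)
for num, name, category in rows:
    exact[category].append(num)

# A component that lower-cases partition names sees only two partitions;
# indexing its list by the loader's three keys runs off the end.
normalized = {c.lower() for c in exact}

print(sorted(exact))       # three distinct partition keys
print(sorted(normalized))  # only two after normalization
```

The mismatch between the two counts is the kind of inconsistency that surfaces as an IndexOutOfBoundsException in the stats task and as rows landing in the wrong partition.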

 

 

*[Steps to reproduce]*

1. Create stage table, load some data into stage table, create partition table 
and load data into that table from the stage table. data file is attached below.
{code:java}
0: jdbc:hive2://localhost:1> create database mydb;
0: jdbc:hive2://localhost:1> use mydb;
{code}
{code:java}
0: jdbc:hive2://localhost:1> create table stage(num int, name string, 
category string) row format delimited fields terminated by ',' stored as 
textfile;
{code}
{code:java}
0: jdbc:hive2://localhost:1> load data local inpath 'data' into table 
stage;{code}
 
{code:java}
0: jdbc:hive2://localhost:1> select * from stage;
+------------+-------------+---------------+
| stage.num  | stage.name  | stage.category|
+------------+-------------+---------------+
| 1          | apple       | Fruit         |
| 2          | banana      | Fruit         |
| 3          | carrot      | vegetable     |
| 4          | cherry      | Fruit         |
| 5          | potato      | vegetable     |
| 6          | mango       | Fruit         |
| 7          | tomato      | Vegetable     |=>V in vegetable is uppercase here
+------------+-------------+---------------+
7 rows selected (12.979 seconds)
{code}
 

 
{code:java}
0: jdbc:hive2://localhost:1> create table dynpart(num int, name string) 
partitioned by (category string) row format delimited fields terminated by ',' 
stored as textfile;{code}
 

 
{code:java}
0: jdbc:hive2://localhost:1> insert into dynpart select * from stage;
INFO  : Compiling 
command(queryId=kvenureddy_20221215192112_ae2e55b5-6b1f-402d-b79f-874261a27b72):
 insert into dynpart select * from stage
INFO  : No Stats for mydb@stage, Columns: num, name, category
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:stage.num, 
type:int, comment:null), FieldSchema(name:stage.name, type:string, 
comment:null), FieldSchema(name:stage.category, type:string, comment:null)], 
properties:null)
INFO  : Completed compiling 
command(queryId=kvenureddy_20221215192112_ae2e55b5-6b1f-402d-b79f-874261a27b72);
 Time taken: 2.967 seconds
INFO  : Operation QUERY obtained 0 locks
INFO  : Executing 
command(queryId=kvenureddy_20221215192112_ae2e55b5-6b1f-402d-b79f-874261a27b72):
 insert into dynpart select * from stage
WARN  : Hive-on-MR is deprecated in Hive 2 and may not be available in the 
future versions. Consider using a different execution engine (i.e. tez) or 
using Hive 1.X releases.
INFO  : Query ID = 
kvenureddy_20221215192112_ae2e55b5-6b1f-402d-b79f-874261a27b72
INFO  : Total jobs = 2
INFO  : Launching Job 1 out of 2
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_local729224564_0001
INFO  : Executing with tokens: []
INFO  : The url to track the job: http://localhost:8080/
INFO  : Job running in-process (local Hadoop)
INFO  : 2022-12-15 19:21:27,285 Stage-1 map = 0%,  reduce = 0%
INFO  : 2022-12-15 19:21:28,321 Stage-1 map = 100%,  reduce = 0%
INFO  : 2022-12-15 19:21:29,359 Stage-1 map = 100%,  reduce = 100%
INFO  : Ended Job = job_local729224564_0001
INFO  : Starting task [Stage-0:MOVE] in serial mode
INFO  : Loading data to table mydb.dynpart partition (category=null) from 
file:/tmp/warehouse/external/mydb.db/dynpart/.hive-staging_hive_2022-12-15_19-21-12_997_3457134057632526413-1/-ext-1
INFO  : 


INFO  :  Time taken to load dynamic partitions: 33.657 seconds
INFO  :  Time taken for adding to write entity : 0.003 seconds
INFO  : Launching Job 2 out of 2
INFO  : Starting task [Stage-3:MAPRED] in serial mode
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set 

[jira] [Created] (HIVE-26861) Skewed column table load do not work as expected if the user data for skewed column is not in lowercase.

2022-12-15 Thread Venugopal Reddy K (Jira)
Venugopal Reddy K created HIVE-26861:


 Summary: Skewed column table load do not work as expected if the 
user data for skewed column is not in lowercase.
 Key: HIVE-26861
 URL: https://issues.apache.org/jira/browse/HIVE-26861
 Project: Hive
  Issue Type: Bug
Reporter: Venugopal Reddy K
 Attachments: data

*[Description]*

A skewed table with case-sensitive data in the skewed column does not work as 
expected. Skewed values are stored in lower case, and the comparison against 
user data is case sensitive, so it expects the user data to be in lower case 
as well. Otherwise it does not work.
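The described behavior can be sketched in a few lines of Python (an illustration of the comparison, not Hive's implementation): the DDL literals are stored lower-cased, but the incoming value is compared verbatim, so 'Fruit' never matches.

```python
# Skewed values from "skewed by(category) on ('Fruit','Vegetable')",
# stored lower-cased as described above.
skewed_values = ["fruit", "vegetable"]

def skew_dir(category: str) -> str:
    # Case-sensitive membership test: 'Fruit' != 'fruit', so fruit rows
    # fall through to the default list-bucketing directory.
    if category in skewed_values:
        return f"category={category}"
    return "HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME"

print(skew_dir("Fruit"))      # default directory, no category=Fruit
print(skew_dir("vegetable"))  # matches the stored lower-case value
```

Lower-casing both sides of the comparison (or storing the literals verbatim) would make the directory layout match the DDL.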

*[Steps to reproduce]* 

1. Create stage table, load some data into stage table, create table with a 
skewed column and load data into that table from the stage table. data file is 
attached below.

 
{code:java}
0: jdbc:hive2://localhost:1> create database mydb;
0: jdbc:hive2://localhost:1> use mydb;
{code}
 

 
{code:java}
0: jdbc:hive2://localhost:1> create table stage(num int, name string, 
category string) row format delimited fields terminated by ',' stored as 
textfile;{code}
 

 
{code:java}
0: jdbc:hive2://localhost:1> load data local inpath 'data' into table 
stage;{code}
 

 
{code:java}
0: jdbc:hive2://localhost:1> select * from stage;
+------------+-------------+-----------------+
| stage.num  | stage.name  | stage.category  |
+------------+-------------+-----------------+
| 1          | apple       | Fruit           |
| 2          | banana      | Fruit           |
| 3          | carrot      | vegetable       |
| 4          | cherry      | Fruit           |
| 5          | potato      | vegetable       |
| 6          | mango       | Fruit           |
| 7          | tomato      | vegetable       |
+------------+-------------+-----------------+
7 rows selected (2.688 seconds)
{code}
 
{code:java}
0: jdbc:hive2://localhost:1> create table skew(num int, name string, 
category string) skewed by(category) on ('Fruit','Vegetable') stored as 
directories row format delimited fields terminated by ',' stored as 
textfile;{code}
 

 
{code:java}
0: jdbc:hive2://localhost:1> insert into skew select * from stage;{code}
 

2. Check the skew table data in the warehouse directory. The table was created 
with the {*}skewed by(category) on ('Fruit','Vegetable'){*} clause, *but there 
is no directory created for category=fruit.* Data for the fruit category sits 
in the HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME directory itself.

Internally, skewed values are stored in lower case and compared 
case-sensitively against the user data, so the directory for fruit is not 
created.

 
{code:java}
kvenureddy@192 mydb.db % cd skew 
kvenureddy@192 skew % ls
kvenureddy@192 skew % ls
HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME category=vegetable
kvenureddy@192 skew % pwd
/tmp/warehouse/external/mydb.db/skew
kvenureddy@192 skew % cd HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME 
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % ls
00_0
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % cat 00_0 
1,apple,Fruit
2,banana,Fruit
4,cherry,Fruit
6,mango,Fruit
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % cd ../
kvenureddy@192 skew % cd category=vegetable 
kvenureddy@192 category=vegetable % ls
00_0
kvenureddy@192 category=vegetable % cat 00_0 
3,carrot,vegetable
5,potato,vegetable
7,tomato,vegetable
kvenureddy@192 category=vegetable % 
{code}


--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26860) Appropriate rows in HMS datastore tables(SDS, SERDES, SERDE_PARAMS, SKEWED_COL_NAMES, SKEWED_COL_VALUE_LOC_MAP, SKEWED_VALUES) are not deleted upon drop partition table w

2022-12-15 Thread Venugopal Reddy K (Jira)
Venugopal Reddy K created HIVE-26860:


 Summary: Appropriate rows in HMS datastore tables(SDS, SERDES, 
SERDE_PARAMS, SKEWED_COL_NAMES, SKEWED_COL_VALUE_LOC_MAP, SKEWED_VALUES) are 
not deleted upon drop partition table with skewed columns
 Key: HIVE-26860
 URL: https://issues.apache.org/jira/browse/HIVE-26860
 Project: Hive
  Issue Type: Bug
  Components: Hive, Metastore
Reporter: Venugopal Reddy K
 Attachments: image-2022-12-15-14-26-26-131.png, 
image-2022-12-15-14-31-23-854.png, image-2022-12-15-14-32-55-280.png, partdata3

*[Description]*

Appropriate rows in HMS backing datastore tables(SDS, SERDES, SERDE_PARAMS, 
SKEWED_COL_NAMES, SKEWED_COL_VALUE_LOC_MAP, SKEWED_VALUES) are not deleted upon 
drop partition table with skewed columns.
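The expected cleanup is ordinary referential integrity: deleting a storage-descriptor row should take its dependent skew metadata with it. A minimal sqlite sketch with a hypothetical two-table miniature of the schema (not Hive's real metastore DDL) shows the behavior the report says is missing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # sqlite enforces FKs per connection

# Hypothetical miniature of SDS and SKEWED_COL_NAMES.
conn.execute("CREATE TABLE sds (sd_id INTEGER PRIMARY KEY, location TEXT)")
conn.execute("""CREATE TABLE skewed_col_names (
    sd_id INTEGER REFERENCES sds(sd_id) ON DELETE CASCADE,
    col_name TEXT)""")

conn.execute("INSERT INTO sds VALUES (3, 'file:/tmp/warehouse/skewpart/category=vegetable')")
conn.execute("INSERT INTO skewed_col_names VALUES (3, 'num')")

# Analogue of dropping the partition table: with the cascade in place,
# the dependent skew row disappears instead of being orphaned.
conn.execute("DELETE FROM sds WHERE sd_id = 3")
orphans = conn.execute("SELECT COUNT(*) FROM skewed_col_names").fetchone()[0]
print(orphans)  # 0: no orphaned metadata rows remain
```

The bug is that the metastore's drop path leaves the equivalent of `orphans > 0` in the tables listed above.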

*[Steps to reproduce]*

1. Create stage table, load some data into stage table, create partition table 
with skewed columns and load data into that table from the stage table. 
partdata3 file is attached. It has 2 partitions.

 
{code:java}
create table stage(num int, name string, category string) row format delimited 
fields terminated by ',' stored as textfile;
{code}
 

 
{code:java}
load data local inpath 'partdata3' into table stage;{code}
 

 
{code:java}
create table skewpart(num int, name string) partitioned by (category string) 
skewed by(num) on (3,4) stored as directories row format delimited fields 
terminated by ',' stored as textfile;{code}
 

 
{code:java}
insert into skewpart select * from stage;{code}
 

 

2. Verify warehouse directory table data is correct

 
{code:java}
kvenureddy@192 category=fruit % ls   
HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME    num=4
kvenureddy@192 category=fruit % pwd
/private/tmp/warehouse/external/mydb.db/skewpart/category=fruit
kvenureddy@192 category=fruit % cd num=4 
kvenureddy@192 num=4 % pwd
/private/tmp/warehouse/external/mydb.db/skewpart/category=fruit/num=4
kvenureddy@192 num=4 % cat 00_0 
4,cherry
kvenureddy@192 num=4 % cd ../
kvenureddy@192 category=fruit % cd HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME 
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % cat 00_0 
1,apple
2,banana
6,mango
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % cd../../
zsh: no such file or directory: cd../../
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % cd ../../
kvenureddy@192 skewpart % pwd
/private/tmp/warehouse/external/mydb.db/skewpart
kvenureddy@192 skewpart % ls
category=fruit        category=vegetable
kvenureddy@192 skewpart % cd category=vegetable 
kvenureddy@192 category=vegetable % pwd
/private/tmp/warehouse/external/mydb.db/skewpart/category=vegetable
kvenureddy@192 category=vegetable % ls
HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME    num=3
kvenureddy@192 category=vegetable % cd num=3 
kvenureddy@192 num=3 % cat 00_0 
3,carrot
kvenureddy@192 num=3 % cd ../
kvenureddy@192 category=vegetable % cd HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME 
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % cat 00_0 
5,potato
7,tomato
kvenureddy@192 HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME % {code}
 

 

3. Verify HMS backing datastore tables after creating and loading into 
partition+skewed table.

Note: Tables having issue(SDS, SERDES, SERDE_PARAMS, SKEWED_COL_NAMES, 
SKEWED_COL_VALUE_LOC_MAP, SKEWED_VALUES) are shown below.

The SD_ID=2 row is added during create table. The SD_ID=3 and 4 rows were 
added during data load, since [^partdata3] has 2 partitions.
||SDS||
||SD_ID||CD_ID||INPUT_FORMAT||IS_COMPRESSED||IS_STOREDASSUBDIRECTORIES||LOCATION||NUM_BUCKETS||OUTPUT_FORMAT||SERDE_ID||
|1|1|org.apache.hadoop.mapred.TextInputFormat|0|0|file:/tmp/warehouse/external/mydb.db/stage|-1|org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat|1|
|{color:#4c9aff}2{color}|{color:#4c9aff}2{color}|{color:#4c9aff}org.apache.hadoop.mapred.TextInputFormat{color}|{color:#4c9aff}0{color}|{color:#4c9aff}1{color}|{color:#4c9aff}file:/tmp/warehouse/external/mydb.db/skewpart{color}|{color:#4c9aff}-1{color}|{color:#4c9aff}org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat{color}|{color:#4c9aff}2{color}|
|{color:#4c9aff}3{color}|{color:#4c9aff}2{color}|{color:#4c9aff}org.apache.hadoop.mapred.TextInputFormat{color}|{color:#4c9aff}0{color}|{color:#4c9aff}1{color}|{color:#4c9aff}file:/tmp/warehouse/external/mydb.db/skewpart/category=vegetable{color}|{color:#4c9aff}-1{color}|{color:#4c9aff}org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat{color}|{color:#4c9aff}3{color}|
|{color:#4c9aff}4{color}|{color:#4c9aff}2{color}|{color:#4c9aff}org.apache.hadoop.mapred.TextInputFormat{color}|{color:#4c9aff}0{color}|{color:#4c9aff}1{color}|{color:#4c9aff}file:/tmp/warehouse/external/mydb.db/skewpart/category=fruit{color}|{color:#4c9aff}-1{color}|{color:#4c9aff}org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat{color}|{color:#4c9aff}4{color}|

||SERDES||
