[jira] [Work logged] (HIVE-25230) add position and occurrence to instr()

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25230?focusedWorklogId=770460&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770460
 ]

ASF GitHub Bot logged work on HIVE-25230:
-

Author: ASF GitHub Bot
Created on: 14/May/22 02:28
Start Date: 14/May/22 02:28
Worklog Time Spent: 10m 
  Work Description: stiga-huang commented on PR #2378:
URL: https://github.com/apache/hive/pull/2378#issuecomment-1126616115

   Thanks for your review, @dengzhhu653 !




Issue Time Tracking
---

Worklog Id: (was: 770460)
Time Spent: 1h 50m  (was: 1h 40m)

> add position and occurrence to instr()
> --
>
> Key: HIVE-25230
> URL: https://issues.apache.org/jira/browse/HIVE-25230
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 4.0.0-alpha-2
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Current instr() only supports two arguments:
> {code:java}
> instr(str, substr) - Returns the index of the first occurance of substr in str
> {code}
> Other systems (Vertica, Oracle, Impala etc) support additional position and 
> occurrence arguments:
> {code:java}
> instr(str, substr[, pos[, occurrence]])
> {code}
> Oracle doc: 
> [https://docs.oracle.com/database/121/SQLRF/functions089.htm#SQLRF00651]
> It'd be nice to support this as well. Otherwise, it's a SQL difference 
> between Impala and Hive.
>  Impala supports this in IMPALA-3973
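
For illustration, here is a minimal Java sketch of the extended 1-based semantics described above (an assumption-laden illustration, not the actual Hive GenericUDFInstr code; negative positions and NULL handling are left out):

{code:java}
// Sketch only: instr(str, substr, pos, occurrence) with 1-based positions.
public class InstrSketch {
  public static int instr(String str, String substr, int pos, int occurrence) {
    if (str == null || substr == null || pos <= 0 || occurrence <= 0) {
      return 0;                        // no match / invalid arguments
    }
    int idx = pos - 1;                 // convert the 1-based start to a 0-based offset
    for (int i = 0; i < occurrence; i++) {
      idx = str.indexOf(substr, idx);
      if (idx < 0) {
        return 0;                      // the n-th occurrence does not exist
      }
      idx++;                           // next search starts after this match
    }
    return idx;                        // idx is already 1-based at this point
  }

  public static void main(String[] args) {
    System.out.println(instr("banana", "an", 1, 2));   // prints 4
  }
}
{code}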



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25335) Unreasonable setting reduce number, when join big size table(but small row count) and small size table

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25335?focusedWorklogId=770442&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770442
 ]

ASF GitHub Bot logged work on HIVE-25335:
-

Author: ASF GitHub Bot
Created on: 14/May/22 00:25
Start Date: 14/May/22 00:25
Worklog Time Spent: 10m 
  Work Description: zhengchenyu opened a new pull request, #3292:
URL: https://github.com/apache/hive/pull/3292

   I found an application that runs slowly in our cluster because each reducer 
processes a huge number of bytes, yet only two reducers are used.
   While debugging I found the reason: in this SQL, one table is big in size 
(about 30 GB) but has a small row count (about 3.5 M), while the other table is 
small in size (about 100 MB) but has a larger row count (about 3.6 M). So 
JoinStatsRule.process uses only the 100 MB to estimate the number of reducers, 
but in fact 30 GB of data must be processed.
   
   https://issues.apache.org/jira/browse/HIVE-25335
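
A rough back-of-the-envelope sketch of the effect (the table sizes are the ones quoted above, the bytes-per-reducer value is assumed, and the real JoinStatsRule logic is more involved):

```java
// Sketch: estimating reducers from the smaller side's bytes under-provisions the job.
public class ReducerEstimateSketch {
  public static void main(String[] args) {
    long bigTableBytes   = 30L * 1024 * 1024 * 1024; // ~30 GB, ~3.5M rows
    long smallTableBytes = 100L * 1024 * 1024;       // ~100 MB, ~3.6M rows
    long bytesPerReducer = 64L * 1024 * 1024;        // assumed bytes-per-reducer setting

    long fromSmallSide = Math.max(1, (smallTableBytes + bytesPerReducer - 1) / bytesPerReducer);
    long fromBigSide   = Math.max(1, (bigTableBytes + bytesPerReducer - 1) / bytesPerReducer);
    System.out.println("reducers from small side: " + fromSmallSide); // 2
    System.out.println("reducers from big side:   " + fromBigSide);   // 480
  }
}
```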




Issue Time Tracking
---

Worklog Id: (was: 770442)
Time Spent: 2h 40m  (was: 2.5h)

> Unreasonable setting reduce number, when join big size table(but small row 
> count) and small size table
> --
>
> Key: HIVE-25335
> URL: https://issues.apache.org/jira/browse/HIVE-25335
> Project: Hive
>  Issue Type: Improvement
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25335.001.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I found an application that runs slowly in our cluster because each reducer 
> processes a huge number of bytes, yet only two reducers are used.
> While debugging I found the reason: in this SQL, one table is big in size 
> (about 30 GB) but has a small row count (about 3.5 M), while the other table 
> is small in size (about 100 MB) but has a larger row count (about 3.6 M). So 
> JoinStatsRule.process uses only the 100 MB to estimate the number of 
> reducers, but in fact 30 GB of data must be processed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25971) Tez task shutdown getting delayed due to cached thread pool not closed

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25971?focusedWorklogId=770441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770441
 ]

ASF GitHub Bot logged work on HIVE-25971:
-

Author: ASF GitHub Bot
Created on: 14/May/22 00:24
Start Date: 14/May/22 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3046: 
HIVE-25971: Closing the thread pool created for async cache
URL: https://github.com/apache/hive/pull/3046




Issue Time Tracking
---

Worklog Id: (was: 770441)
Time Spent: 2h 50m  (was: 2h 40m)

> Tez task shutdown getting delayed due to cached thread pool not closed
> --
>
> Key: HIVE-25971
> URL: https://issues.apache.org/jira/browse/HIVE-25971
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.4.0, 3.1.2
>Reporter: Shailesh Gupta
>Assignee: Shailesh Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0, 4.0.0-alpha-1
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> We are using a 
> [CachedThreadPool|https://github.com/apache/hive/blob/branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ObjectCache.java]
>  but never closing it. The CachedThreadPool creates non-daemon threads, which 
> delays Tez task JVM shutdown by up to 1 minute, since the default idle 
> timeout is 1 minute.
>  
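
A minimal sketch of the usual fix (not the actual Hive patch): give the pool daemon threads so it cannot hold the JVM open, and shut it down explicitly when the cache is closed.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only; class and method names are illustrative.
public class AsyncCachePoolSketch implements AutoCloseable {
  private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
    Thread t = new Thread(r, "async-cache");
    t.setDaemon(true);                 // daemon threads do not delay JVM shutdown
    return t;
  });

  public void submit(Runnable task) {
    pool.execute(task);
  }

  @Override
  public void close() throws InterruptedException {
    pool.shutdown();                   // stop accepting new work
    pool.awaitTermination(10, TimeUnit.SECONDS);
  }
}
{code}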



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26219) Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy

2022-05-13 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-26219.
-
Resolution: Fixed

> Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy
> --
>
> Key: HIVE-26219
> URL: https://issues.apache.org/jira/browse/HIVE-26219
> Project: Hive
>  Issue Type: Task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  Encapsulate the API change for FileUtils.isActionPermittedForFileHierarchy 
> so other services like Ranger can still use the old API
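
A hypothetical Java sketch of the encapsulation pattern (the parameter added by the API change and the method bodies are assumptions; the real Hive FileUtils signatures may differ): keep the old signature as a thin wrapper that delegates to the new one.

{code:java}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.permission.FsAction;

// Sketch only: preserve the old overload for external callers such as the Ranger plugin.
public final class FileUtilsCompatSketch {

  /** Old signature, kept for source/binary compatibility. */
  public static boolean isActionPermittedForFileHierarchy(
      FileSystem fs, FileStatus fileStatus, String userName, FsAction action)
      throws Exception {
    // delegate with a default value for the newly introduced parameter
    return isActionPermittedForFileHierarchy(fs, fileStatus, userName, action, true);
  }

  /** New signature with an additional flag (illustrative). */
  public static boolean isActionPermittedForFileHierarchy(
      FileSystem fs, FileStatus fileStatus, String userName, FsAction action,
      boolean recurse) throws Exception {
    // the actual permission-walking logic lives in Hive's FileUtils
    return true;
  }

  private FileUtilsCompatSketch() {
  }
}
{code}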



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26079) Upgrade protobuf to 3.16.1

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26079?focusedWorklogId=770270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770270
 ]

ASF GitHub Bot logged work on HIVE-26079:
-

Author: ASF GitHub Bot
Created on: 13/May/22 16:20
Start Date: 13/May/22 16:20
Worklog Time Spent: 10m 
  Work Description: Noremac201 opened a new pull request, #3291:
URL: https://github.com/apache/hive/pull/3291

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 770270)
Time Spent: 40m  (was: 0.5h)

> Upgrade protobuf to 3.16.1
> --
>
> Key: HIVE-26079
> URL: https://issues.apache.org/jira/browse/HIVE-26079
> Project: Hive
>  Issue Type: Task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Upgrade com.google.protobuf:protobuf-java from 2.5.0 to 3.16.1 to fix 
> CVE-2021-22569



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)

2022-05-13 Thread Sylwester Lachiewicz (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536745#comment-17536745
 ] 

Sylwester Lachiewicz commented on HIVE-26226:
-

Thanks for the output - now I can also reproduce it with Maven 3.8.1 and Java 8.

So, only hbase-annotations 1.1.1 has a direct dependency on jdk.tools 1.7. All 
the others define the dependency in a profile for Java 7 or 8.

Because we compile with Java 11, the dependency isn't required.

It's safe to exclude the one from hive-metastore (2.3.3) because here, inside 
upgrade-acid, we also have a second instance that comes from hadoop-common 
(2.7.2) via hadoop-annotations, and then jdk.tools with two profiles (7, 8).

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> 
>
> Key: HIVE-26226
> URL: https://issues.apache.org/jira/browse/HIVE-26226
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Sylwester Lachiewicz
>Priority: Minor
> Attachments: jdktools_deps_master.txt
>
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary 
> dependency - that blocks the possibility to compile with newer java versions 
> > 8



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)

2022-05-13 Thread Alessandro Solimando (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536713#comment-17536713
 ] 

Alessandro Solimando commented on HIVE-26226:
-

I get similar results to [~zabetak], and I am using the same Maven version.

It is also true that compiling with JDK 11 did not trigger other errors, as 
[~slachiewicz] says.

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> 
>
> Key: HIVE-26226
> URL: https://issues.apache.org/jira/browse/HIVE-26226
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Sylwester Lachiewicz
>Priority: Minor
> Attachments: jdktools_deps_master.txt
>
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary 
> dependency - that blocks the possibility to compile with newer java versions 
> > 8



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26227) Add support of catalog related statements for Hive ql

2022-05-13 Thread Wechar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536689#comment-17536689
 ] 

Wechar commented on HIVE-26227:
---

Sure, [~zabetak]. We plan to provide a unified metadata management service 
through the Hive metastore, which means the metadata of various systems is 
stored in the Hive metastore and divided by catalog.
Currently we want to manage metadata from Hive, HBase, Kafka, JDBC, etc., 
so that computing engines like Hive, Spark, Presto, and Flink can join data 
from different systems based on the metadata in the Hive metastore.

> Add support of catalog related statements for Hive ql
> -
>
> Key: HIVE-26227
> URL: https://issues.apache.org/jira/browse/HIVE-26227
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The catalog concept was introduced in Hive 3.0 to allow different systems to 
> connect to different catalogs in the metastore. But so far we cannot query 
> catalogs through Hive QL; this task aims to implement the DDL statements 
> related to catalogs.
> *Create Catalog*
> {code:sql}
> CREATE CATALOG [IF NOT EXISTS] catalog_name
> LOCATION hdfs_path
> [COMMENT catalog_comment];
> {code}
> LOCATION is required for creating a new catalog now.
> *Alter Catalog*
> {code:sql}
> ALTER CATALOG catalog_name SET LOCATION hdfs_path;
> {code}
> Only location metadata can be altered for catalog.
> *Drop Catalog*
> {code:sql}
> DROP CATALOG [IF EXISTS] catalog_name;
> {code}
> DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there 
> are non-default databases in the catalog.
> *Show Catalogs*
> {code:sql}
> SHOW CATALOGS [LIKE 'identifier_with_wildcards'];
> {code}
> SHOW CATALOGS lists all of the catalogs defined in the metastore.
> The optional LIKE clause allows the list of catalogs to be filtered using a 
> regular expression.
> *Describe Catalog*
> {code:sql}
> DESC[RIBE] CATALOG [EXTENDED] cat_name;
> {code}
> DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been 
> set), and its root location on the filesystem.
> EXTENDED also shows the create time.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)

2022-05-13 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536662#comment-17536662
 ] 

Stamatis Zampetakis commented on HIVE-26226:


I don't think anybody is blind but I guess we have a different environment. I 
am using Apache Maven 3.6.3. 

 
{noformat}
mvn dependency:tree -Dincludes=jdk.tools:jdk.tools > jdktools_deps_master.txt
{noformat}
The results can be found here: [^jdktools_deps_master.txt]

 

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> 
>
> Key: HIVE-26226
> URL: https://issues.apache.org/jira/browse/HIVE-26226
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Sylwester Lachiewicz
>Priority: Minor
> Attachments: jdktools_deps_master.txt
>
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary 
> dependency - that blocks the possibility to compile with newer java versions 
> > 8



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)

2022-05-13 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26226:
---
Attachment: jdktools_deps_master.txt

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> 
>
> Key: HIVE-26226
> URL: https://issues.apache.org/jira/browse/HIVE-26226
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Sylwester Lachiewicz
>Priority: Minor
> Attachments: jdktools_deps_master.txt
>
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary 
> dependency - that blocks the possibility to compile with newer java versions 
> > 8



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)

2022-05-13 Thread Sylwester Lachiewicz (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536649#comment-17536649
 ] 

Sylwester Lachiewicz commented on HIVE-26226:
-

Either I'm blind or I have an issue with my env - but I found only one 
occurrence and fixed it in PR #3284. Would you be able to add more commits 
where this dependency exists?

Also - an alternative way to fix this may be to update the HBase version used 
from 1.1.1 to 1.2.0, but then the impact may be bigger.

 

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> 
>
> Key: HIVE-26226
> URL: https://issues.apache.org/jira/browse/HIVE-26226
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Sylwester Lachiewicz
>Priority: Minor
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary 
> dependency - that blocks the possibility to compile with newer java versions 
> > 8



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HIVE-26214) Hive 3.1.3 Release Notes

2022-05-13 Thread Anmol Sundaram (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536615#comment-17536615
 ] 

Anmol Sundaram edited comment on HIVE-26214 at 5/13/22 12:35 PM:
-

Hello [~zabetak], yes, that's correct. The Release Notes include some JIRAs 
that are unresolved (HIVE-25567) while some of the commits are missing, which 
leads to confusion when relying on the release notes.


was (Author: JIRAUSER288438):
Hello [~zabetak] , yes thats correct. The Release Notes include some of the 
JIIRAs that are unresolved ( HIVE-25567 ) while some of the commits are 
missing, which leads to confusion 

> Hive 3.1.3 Release Notes
> 
>
> Key: HIVE-26214
> URL: https://issues.apache.org/jira/browse/HIVE-26214
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation, Hive
>Affects Versions: 3.1.3
>Reporter: Anmol Sundaram
>Priority: Minor
>
> The Hive Release Notes mentioned 
> [here|https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346277&styleName=Html&projectId=12310843]
>  do not seem to be accurate when compared with the [commit 
> logs|https://github.com/apache/hive/commits/rel/release-3.1.3]. 
> Can we please get this updated, if applicable?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26214) Hive 3.1.3 Release Notes

2022-05-13 Thread Anmol Sundaram (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536615#comment-17536615
 ] 

Anmol Sundaram commented on HIVE-26214:
---

Hello [~zabetak], yes, that's correct. The Release Notes include some JIRAs 
that are unresolved (HIVE-25567) while some of the commits are missing, which 
leads to confusion.

> Hive 3.1.3 Release Notes
> 
>
> Key: HIVE-26214
> URL: https://issues.apache.org/jira/browse/HIVE-26214
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation, Hive
>Affects Versions: 3.1.3
>Reporter: Anmol Sundaram
>Priority: Minor
>
> The Hive Release Notes mentioned 
> [here|https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346277&styleName=Html&projectId=12310843]
>  do not seem to be accurate when compared with the [commit 
> logs|https://github.com/apache/hive/commits/rel/release-3.1.3]. 
> Can we please get this updated, if applicable?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25444) Make tables based on storage handlers authorization (HIVE-24705) configurable.

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25444?focusedWorklogId=770162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770162
 ]

ASF GitHub Bot logged work on HIVE-25444:
-

Author: ASF GitHub Bot
Created on: 13/May/22 12:24
Start Date: 13/May/22 12:24
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request, #3290:
URL: https://github.com/apache/hive/pull/3290

   Resurrecting https://github.com/apache/hive/pull/2583 :
   Make tables based on storage handlers authorization (HIVE-24705) 
configurable.
   cc: @saihemanth-cloudera 




Issue Time Tracking
---

Worklog Id: (was: 770162)
Time Spent: 1h  (was: 50m)

> Make tables based on storage handlers authorization (HIVE-24705) configurable.
> --
>
> Key: HIVE-25444
> URL: https://issues.apache.org/jira/browse/HIVE-25444
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Using a config "hive.security.authorization.tables.on.storagehandlers" with a 
> default of true, we'll enable authorization on storage handlers by default. 
> Authorization is disabled if this config is set to false.
> Background: Previously, whenever a user tried to create a table based on a 
> storage handler, the end user seen in the external storage system (e.g. 
> HBase, Kafka, and Druid) was 'hive', so we could not really enforce the 
> condition in Ranger on the end user.
> https://issues.apache.org/jira/browse/HIVE-24705 solved this security issue 
> by enforcing a check in Apache Ranger for the Hive service. That patch had 
> changes in both Hive and Ranger (the Ranger client depends on the Hive 
> changes). The reason why we want to make this feature configurable is that 
> users can update the Hive code but not the Ranger code. In that case, users 
> see a permission-denied error when executing a statement like {{CREATE TABLE 
> hive_table_0(key int, value string) STORED BY 
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'}}, but the user/admin 
> cannot add a Ranger policy for Hive because the Ranger code is not updated. 
> By making this feature configurable, we'll unblock users from creating tables 
> based on storage handlers as they were previously doing.
> Users can turn this config off if they have not updated the Ranger code.
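
A hedged sketch of how such a boolean flag typically gates the behaviour; the property name is taken from the description above, while the surrounding code is illustrative and not the actual Hive/Ranger patch.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch only: read the flag and decide whether to send the storage-handler URI
// (e.g. an HBase/Kafka/Druid resource) to the authorizer.
public class StorageHandlerAuthzGateSketch {
  public static boolean shouldAuthorizeStorageHandlerUri(Configuration conf) {
    // default true: authorize storage-handler-backed tables unless explicitly disabled
    return conf.getBoolean("hive.security.authorization.tables.on.storagehandlers", true);
  }
}
{code}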



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26205) Remove the incorrect org.slf4j dependency in kafka-handler

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26205?focusedWorklogId=770154&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770154
 ]

ASF GitHub Bot logged work on HIVE-26205:
-

Author: ASF GitHub Bot
Created on: 13/May/22 12:08
Start Date: 13/May/22 12:08
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on PR #3272:
URL: https://github.com/apache/hive/pull/3272#issuecomment-1125988614

   Yes, these exclusions of slf4j-api are also defined in the parent pom, so 
they can also be removed.




Issue Time Tracking
---

Worklog Id: (was: 770154)
Time Spent: 0.5h  (was: 20m)

> Remove the incorrect org.slf4j dependency in kafka-handler
> --
>
> Key: HIVE-26205
> URL: https://issues.apache.org/jira/browse/HIVE-26205
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Get a compile error while executing:
> {code:bash}
> mvn clean install -DskipTests
> {code}
> The error message is:
> {code:bash}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile 
> (default-compile) on project kafka-handler: Compilation failure: Compilation 
> failure: 
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[53,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[54,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[73,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaStorageHandler
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[37,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[47,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class 
> org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaJsonSerDe.java:[63,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[35,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[50,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.SimpleKafkaWriter
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[34,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[43,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaOutputFormat
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[24,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[34,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.RetryUtils
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[51,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[65,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaScanTrimmer
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[45,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[65,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class 
> 

[jira] [Updated] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25976:
--
Labels: pull-request-available  (was: )

> Cleaner may remove files being accessed from a fetch-task-converted reader
> --
>
> Key: HIVE-25976
> URL: https://issues.apache.org/jira/browse/HIVE-25976
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
> Attachments: fetch_task_conv_compactor_test.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a nutshell, the following happens:
> * the query is compiled in fetch-task-converted mode
> * no real execution happens, but the locks are released
> * HS2 is communicating with the client and uses the fetch task to get the 
> rows - which in this case will directly read files from the table's 
> directory
> * the client sleeps between reads - so there is ample time for other events...
> * the cleaner wakes up and removes some files
> * in the next read the fetch task encounters a read error...



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25976?focusedWorklogId=770124&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770124
 ]

ASF GitHub Bot logged work on HIVE-25976:
-

Author: ASF GitHub Bot
Created on: 13/May/22 10:51
Start Date: 13/May/22 10:51
Worklog Time Spent: 10m 
  Work Description: veghlaci05 opened a new pull request, #3289:
URL: https://github.com/apache/hive/pull/3289

   
   ### What changes were proposed in this pull request?
   This PR changes the commit time of the Fetch tasks. From now on these tasks 
are committed only upon driver close.
   
   ### Why are the changes needed?
   Fetch tasks were committed inside the 
org.apache.hadoop.hive.ql.Driver#run(java.lang.String) call, which was too 
early. Reading can occur only after this point, which can cause issues if 
the table changes during the read.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Manually, and through unit tests
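
A hedged sketch of the timing change described above (class and method names are illustrative, not the actual Hive Driver code): the commit backing a fetch-task-converted query is deferred from run() to close(), so files are not eligible for cleaning while rows are still being fetched.

```java
// Sketch only.
public class FetchDriverSketch implements AutoCloseable {
  private boolean fetchTaskPending;

  public void run(String query) {
    // compile the query; if it is fetch-task-converted, do NOT commit here
    fetchTaskPending = true;
  }

  public Object fetchNextRow() {
    // reads files directly from the table directory while the txn is still open
    return null;
  }

  @Override
  public void close() {
    if (fetchTaskPending) {
      commitTransaction();   // commit only now, after all rows have been fetched
    }
  }

  private void commitTransaction() {
    // release locks and mark the transaction committed
  }
}
```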




Issue Time Tracking
---

Worklog Id: (was: 770124)
Remaining Estimate: 0h
Time Spent: 10m

> Cleaner may remove files being accessed from a fetch-task-converted reader
> --
>
> Key: HIVE-25976
> URL: https://issues.apache.org/jira/browse/HIVE-25976
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: László Végh
>Priority: Major
> Attachments: fetch_task_conv_compactor_test.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a nutshell, the following happens:
> * the query is compiled in fetch-task-converted mode
> * no real execution happens, but the locks are released
> * HS2 is communicating with the client and uses the fetch task to get the 
> rows - which in this case will directly read files from the table's 
> directory
> * the client sleeps between reads - so there is ample time for other events...
> * the cleaner wakes up and removes some files
> * in the next read the fetch task encounters a read error...



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-25993) Query-based compaction doesn't work when partition column type is boolean

2022-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Végh resolved HIVE-25993.

Resolution: Fixed

> Query-based compaction doesn't work when partition column type is boolean
> -
>
> Key: HIVE-25993
> URL: https://issues.apache.org/jira/browse/HIVE-25993
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Query based compaction fails on tables with boolean partition column.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26026) Use the new "REFUSED" compaction state where it makes sense

2022-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Végh resolved HIVE-26026.

Resolution: Fixed

> Use the new "REFUSED" compaction state where it makes sense
> ---
>
> Key: HIVE-26026
> URL: https://issues.apache.org/jira/browse/HIVE-26026
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> The 
> org.apache.hadoop.hive.ql.txn.compactor.Worker#findNextCompactionAndExecute 
> method does several checks (the table/partition exists, is not sorted, there 
> are enough files to compact, etc.) before it actually executes the compaction 
> request. If the compaction request fails any of these checks, it is put into 
> the "SUCCEEDED" state, which is often misleading for users: SHOW COMPACTIONS 
> will show these requests as succeeded without an error, while the table is 
> not compacted at all.
> For these cases, the state should be "REFUSED" instead of "SUCCEEDED", along 
> with the appropriate error message.
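
An illustrative pre-check flow (the flags and names below are assumptions, not the actual Worker code): a request that fails the pre-checks is marked REFUSED with a message instead of being silently marked SUCCEEDED.

{code:java}
// Sketch only.
public class CompactionPreCheckSketch {
  enum State { SUCCEEDED, REFUSED }

  static State stateAfterPreChecks(boolean tableExists, boolean isSorted, boolean enoughFilesToCompact) {
    if (!tableExists || isSorted || !enoughFilesToCompact) {
      // surface the reason to SHOW COMPACTIONS instead of reporting success
      return State.REFUSED;
    }
    return State.SUCCEEDED;
  }
}
{code}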



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26059) Eventually clean compactions in "refused" state from compaction history

2022-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Végh resolved HIVE-26059.

Resolution: Fixed

> Eventually clean compactions in "refused" state from compaction history
> ---
>
> Key: HIVE-26059
> URL: https://issues.apache.org/jira/browse/HIVE-26059
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Compactions in states succeeded, failed, and did not initiate have a 
> retention threshold (example: 
> metastore.compactor.history.retention.succeeded) and are purged from 
> COMPLETED_COMPACTIONS if the number of compactions in this state per 
> partition/unpartitioned table passes the threshold. This keeps the size of 
> COMPLETED_COMPACTIONS in check.
> We should also purge refused compactions from COMPLETED_COMPACTIONS.
> See:
> CompactionTxnHandler#purgeCompactionHistory
> ! Also: REFUSED_RESPONSE should be added to 
> org.apache.hadoop.hive.metastore.txn.TxnStore#COMPACTION_STATES so that 
> metrics will be collected about it.
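
A rough sketch of the threshold-based purging described above (names are assumed; the real logic lives in CompactionTxnHandler#purgeCompactionHistory): per partition/unpartitioned table, keep the newest N entries of a terminal state and purge the rest.

{code:java}
import java.util.Collections;
import java.util.List;

// Sketch only.
public class CompactionHistoryPurgeSketch {
  static <T> List<T> entriesToPurge(List<T> newestFirst, int retentionThreshold) {
    if (newestFirst.size() <= retentionThreshold) {
      return Collections.emptyList();    // nothing to purge for this table/partition
    }
    // everything beyond the newest retentionThreshold entries gets purged
    return newestFirst.subList(retentionThreshold, newestFirst.size());
  }
}
{code}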



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26226) Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)

2022-05-13 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536549#comment-17536549
 ] 

Stamatis Zampetakis commented on HIVE-26226:


If you check the current master you will see multiple usages of jdk.tools.

> Remove outdated dependency to jdk.tools:jdk.tools:jar:1.7 (upgrade-acid)
> 
>
> Key: HIVE-26226
> URL: https://issues.apache.org/jira/browse/HIVE-26226
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Sylwester Lachiewicz
>Priority: Minor
>
> The hive-metastore 2.3.3 used in upgrade-acid tests includes unnecessary 
> dependency - that blocks the possibility to compile with newer java versions 
> > 8



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=770111&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770111
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 13/May/22 10:10
Start Date: 13/May/22 10:10
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r872219527


##
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java:
##
@@ -315,18 +320,19 @@ public void testOutputFormat() throws Throwable {
 
 // Check permisssion on partition dirs and files created
 for (int i = 0; i < tableNames.length; i++) {
-  Path partitionFile = new Path(warehousedir + "/" + tableNames[i]
-+ "/ds=1/cluster=ag/part-m-0");
-  FileSystem fs = partitionFile.getFileSystem(mrConf);
-  Assert.assertEquals("File permissions of table " + tableNames[i] + " is 
not correct",
-fs.getFileStatus(partitionFile).getPermission(),
-new FsPermission(tablePerms[i]));
-  Assert.assertEquals("File permissions of table " + tableNames[i] + " is 
not correct",
-fs.getFileStatus(partitionFile.getParent()).getPermission(),
-new FsPermission(tablePerms[i]));
-  Assert.assertEquals("File permissions of table " + tableNames[i] + " is 
not correct",
-
fs.getFileStatus(partitionFile.getParent().getParent()).getPermission(),
-new FsPermission(tablePerms[i]));
+  final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] + 
"/ds=1/cluster=ag/part-m-0");
+  final Path grandParentOfPartitionFile = partitionFile.getParent();

Review Comment:
   Changed. I picked it as is from the previous PR, when I saw this test 
failing :-) 





Issue Time Tracking
---

Worklog Id: (was: 770111)
Time Spent: 9h 53m  (was: 9h 43m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 53m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=770110&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770110
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 13/May/22 10:09
Start Date: 13/May/22 10:09
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on PR #3279:
URL: https://github.com/apache/hive/pull/3279#issuecomment-1125879377

   The only test failure here is intermittent. I have answered/addressed all 
the comments.
   One test I have disabled: it was not failing itself but was corrupting the 
XML. It wasn't functional, just some test infra stuff relying on the hadoop ls 
command, whose output was intermittently changing. Not a good test to have 
either.
   For the record, it is this one:
   
http://ci.hive.apache.org/job/hive-precommit/job/PR-3279/7/testReport/junit/TEST-org.apache.hadoop.hive.cli.split0.TestMiniLlapLocalCliDriver/xml/_failed_to_read_/
   
   I can't decode the failure reason there; it was that broken test which was 
causing this.
   If everything else is good here and only this test blocks, I will file a 
follow-up jira and figure this test out with the original author of the test.
   
   I have tried basic stuff with Hive-on-MR.




Issue Time Tracking
---

Worklog Id: (was: 770110)
Time Spent: 9h 43m  (was: 9.55h)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 43m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=770105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770105
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 13/May/22 10:04
Start Date: 13/May/22 10:04
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r872214761


##
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##
@@ -123,57 +122,24 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
-List dumpWithClause = Arrays.asList(
-"'hive.repl.add.raw.reserved.namespace'='true'",
-"'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + 
"'='"
-+ replica.externalTableWarehouseRoot + "'",
-"'distcp.options.skipcrccheck'=''",
-"'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname + 
"'='false'",
-"'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
-+ UserGroupInformation.getCurrentUser().getUserName() 
+"'");
-WarehouseInstance.Tuple tuple =
-primary.run("use " + primaryDbName)
-.run("create table encrypted_table (id int, value string)")
-.run("insert into table encrypted_table values 
(1,'value1')")
-.run("insert into table encrypted_table values 
(2,'value2')")
-.dump(primaryDbName, dumpWithClause);
-
-replica
-.run("repl load " + primaryDbName + " into " + replicatedDbName
-+ " with('hive.repl.add.raw.reserved.namespace'='true', "
-+ "'hive.repl.replica.external.table.base.dir'='" + 
replica.externalTableWarehouseRoot + "', "
-+ "'hive.exec.copyfile.maxsize'='0', 
'distcp.options.skipcrccheck'='')")
-.run("use " + replicatedDbName)
-.run("repl status " + replicatedDbName)
-.verifyResult(tuple.lastReplicationId);
-
-try {
-  replica
-  .run("select value from encrypted_table")
-  .verifyResults(new String[] { "value1", "value2" });
-  Assert.fail("Src EZKey shouldn't be present on target");
-} catch (IOException e) {
-  Assert.assertTrue(e.getCause().getMessage().contains("KeyVersion name 
'test_key@0' does not exist"));
-}
-
 //read should pass without raw-byte distcp
-dumpWithClause = Arrays.asList( "'" + 
HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
+List dumpWithClause = Arrays.asList( "'" + 
HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
 + replica.externalTableWarehouseRoot + "'");
-tuple = primary.run("use " + primaryDbName)
+WarehouseInstance.Tuple tuple =
+primary.run("use " + primaryDbName)
 .run("create external table encrypted_table2 (id int, value 
string)")
 .run("insert into table encrypted_table2 values (1,'value1')")
 .run("insert into table encrypted_table2 values (2,'value2')")
 .dump(primaryDbName, dumpWithClause);
 
 replica
-.run("repl load " + primaryDbName + " into " + replicatedDbName
-+ " with('hive.repl.replica.external.table.base.dir'='" + 
replica.externalTableWarehouseRoot + "', "
-+ "'hive.exec.copyfile.maxsize'='0', 
'distcp.options.skipcrccheck'='')")
-.run("use " + replicatedDbName)
-.run("repl status " + replicatedDbName)
-.verifyResult(tuple.lastReplicationId)

Review Comment:
   DistCp itself fails: it is running with hive.repl.add.raw.reserved.namespace 
and you can't copy if the key is not present on the target cluster. Earlier I 
converted this to a failure-case test, but then the next iteration (the one 
without hive.repl.add.raw.reserved.namespace) fails because the last load 
wasn't successful, so I kept the success case.



##
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##
@@ -123,57 +122,24 @@ public void 
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
   put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
 }}, "test_key123");
 
-List dumpWithClause = Arrays.asList(

Review Comment:
   Same as above:
   DistCp itself fails, It is running with hive.repl.add.raw.reserved.namespace 
and you can't copy if the key is not present on target cluster. Earlier I 
converted this to a failure case test, but then the next iteration fails which 
is without hive.repl.add.raw.reserved.namespace because the 

[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=770104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770104
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 13/May/22 10:04
Start Date: 13/May/22 10:04
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r872214538


##
streaming/src/test/org/apache/hive/streaming/TestStreaming.java:
##
@@ -1317,6 +1318,11 @@ public void testTransactionBatchEmptyCommit() throws 
Exception {
 connection.close();
   }
 
+  /**
+   * Starting with HDFS 3.3.1, the underlying system NOW SUPPORTS hflush so 
this
+   * test fails.

Review Comment:
   Sure, I have removed the exception assertion. Kept the reason as is.
   Just for code context, this is why HFlush support gets rid of the exception:
   ```
   if (!out.hasCapability(StreamCapabilities.HFLUSH)) {
 throw new ConnectionError(
 "The backing filesystem only supports transaction batch 
sizes of 1, but " + transactionBatchSize
 + " was requested.");
   }
   ```



##
common/pom.xml:
##
@@ -195,6 +194,11 @@
   tez-api
   ${tez.version}
 
+
+  org.fusesource.jansi
+  jansi
+  2.3.4

Review Comment:
   Done





Issue Time Tracking
---

Worklog Id: (was: 770104)
Time Spent: 9h 23m  (was: 9h 13m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 9h 23m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26205) Remove the incorrect org.slf4j dependency in kafka-handler

2022-05-13 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536538#comment-17536538
 ] 

Stamatis Zampetakis commented on HIVE-26205:


I am using Maven 3.6.3 and the compilation does not fail. Any idea why it 
fails only with Maven 3.8.5?

> Remove the incorrect org.slf4j dependency in kafka-handler
> --
>
> Key: HIVE-26205
> URL: https://issues.apache.org/jira/browse/HIVE-26205
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Get a compile error while executing:
> {code:bash}
> mvn clean install -DskipTests
> {code}
> The error message is:
> {code:bash}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile 
> (default-compile) on project kafka-handler: Compilation failure: Compilation 
> failure: 
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[53,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[54,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[73,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaStorageHandler
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[37,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[47,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class 
> org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaJsonSerDe.java:[63,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[35,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[50,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.SimpleKafkaWriter
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[34,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[43,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaOutputFormat
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[24,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[34,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.RetryUtils
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[51,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[65,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaScanTrimmer
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[45,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[65,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class 
> org.apache.hadoop.hive.kafka.TransactionalKafkaWriter
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/HiveKafkaProducer.java:[37,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/HiveKafkaProducer.java:[59,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> 

[jira] [Commented] (HIVE-26227) Add support of catalog related statements for Hive ql

2022-05-13 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536533#comment-17536533
 ] 

Stamatis Zampetakis commented on HIVE-26227:


Thanks for working on this, [~wechar]! Out of curiosity, can you provide a few 
more details on how you plan to use this feature?

> Add support of catalog related statements for Hive ql
> -
>
> Key: HIVE-26227
> URL: https://issues.apache.org/jira/browse/HIVE-26227
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Wechar
>Assignee: Wechar
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The catalog concept was introduced in Hive 3.0 to allow different systems to 
> connect to different catalogs in the metastore. But so far we cannot query 
> catalogs through Hive QL; this task aims to implement the DDL statements 
> related to catalogs.
> *Create Catalog*
> {code:sql}
> CREATE CATALOG [IF NOT EXISTS] catalog_name
> LOCATION hdfs_path
> [COMMENT catalog_comment];
> {code}
> LOCATION is required for creating a new catalog now.
> *Alter Catalog*
> {code:sql}
> ALTER CATALOG catalog_name SET LOCATION hdfs_path;
> {code}
> Only location metadata can be altered for catalog.
> *Drop Catalog*
> {code:sql}
> DROP CATALOG [IF EXISTS] catalog_name;
> {code}
> DROP CATALOG is always RESTRICT, which means DROP CATALOG will fail if there 
> are non-default databases in the catalog.
> *Show Catalogs*
> {code:sql}
> SHOW CATALOGS [LIKE 'identifier_with_wildcards'];
> {code}
> SHOW CATALOGS lists all of the catalogs defined in the metastore.
> The optional LIKE clause allows the list of catalogs to be filtered using a 
> regular expression.
> *Describe Catalog*
> {code:sql}
> DESC[RIBE] CATALOG [EXTENDED] cat_name;
> {code}
> DESCRIBE CATALOG shows the name of the catalog, its comment (if one has been 
> set), and its root location on the filesystem.
> EXTENDED also shows the create time.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location

2022-05-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-26157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536462#comment-17536462
 ] 

László Pintér commented on HIVE-26157:
--

Merged into master. Thanks, [~pvary] for the review!

> Change Iceberg storage handler authz URI to metadata location
> -
>
> Key: HIVE-26157
> URL: https://issues.apache.org/jira/browse/HIVE-26157
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> In HIVE-25964, the authz URI has been changed to "iceberg://db.table".
> It is possible to set the metadata pointers of table A to point to table B, 
> and therefore you could read table B's data via querying table A.
> {code:sql}
> alter table A set tblproperties 
> ('metadata_location'='/path/to/B/snapshot.json', 
> 'previous_metadata_location'='/path/to/B/prev_snapshot.json');  {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location

2022-05-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér resolved HIVE-26157.
--
Resolution: Fixed

> Change Iceberg storage handler authz URI to metadata location
> -
>
> Key: HIVE-26157
> URL: https://issues.apache.org/jira/browse/HIVE-26157
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> In HIVE-25964, the authz URI has been changed to "iceberg://db.table".
> It is possible to set the metadata pointers of table A to point to table B, 
> and therefore you could read table B's data via querying table A.
> {code:sql}
> alter table A set tblproperties 
> ('metadata_location'='/path/to/B/snapshot.json', 
> 'previous_metadata_location'='/path/to/B/prev_snapshot.json');  {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location

2022-05-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26157?focusedWorklogId=770038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770038
 ]

ASF GitHub Bot logged work on HIVE-26157:
-

Author: ASF GitHub Bot
Created on: 13/May/22 06:44
Start Date: 13/May/22 06:44
Worklog Time Spent: 10m 
  Work Description: lcspinter merged PR #3226:
URL: https://github.com/apache/hive/pull/3226




Issue Time Tracking
---

Worklog Id: (was: 770038)
Time Spent: 4h 20m  (was: 4h 10m)

> Change Iceberg storage handler authz URI to metadata location
> -
>
> Key: HIVE-26157
> URL: https://issues.apache.org/jira/browse/HIVE-26157
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> In HIVE-25964, the authz URI has been changed to "iceberg://db.table".
> It is possible to set the metadata pointers of table A to point to table B, 
> and therefore you could read table B's data via querying table A.
> {code:sql}
> alter table A set tblproperties 
> ('metadata_location'='/path/to/B/snapshot.json', 
> 'previous_metadata_location'='/path/to/B/prev_snapshot.json');  {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)