[jira] [Work logged] (HIVE-26201) QueryResultsCache may have wrong permission if umask is too strict

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26201?focusedWorklogId=768322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768322
 ]

ASF GitHub Bot logged work on HIVE-26201:
-

Author: ASF GitHub Bot
Created on: 10/May/22 05:46
Start Date: 10/May/22 05:46
Worklog Time Spent: 10m 
  Work Description: skysiders commented on PR #3267:
URL: https://github.com/apache/hive/pull/3267#issuecomment-1121956685

   Hi @abstractdog, could you please have a look at this? It is the same as 
[TEZ-4412](https://github.com/apache/tez/pull/209).




Issue Time Tracking
---

Worklog Id: (was: 768322)
Time Spent: 20m  (was: 10m)

> QueryResultsCache may have wrong permission if umask is too strict
> --
>
> Key: HIVE-26201
> URL: https://issues.apache.org/jira/browse/HIVE-26201
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, Tez
>Affects Versions: 3.1.3
>Reporter: Zhang Dongsheng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TezSessionState, QueryResultsCache and Context use mkdirs(path, permission) 
> to create directories with specific permissions. But if the umask is too 
> restrictive, the permissions may not come out as expected. So we need to 
> check whether the permission was actually set as requested.
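
A minimal sketch of the check this implies, assuming the Hadoop FileSystem API (the helper name is hypothetical, not the actual patch): mkdirs applies the process umask, so the effective permission has to be verified afterwards and corrected with setPermission, which is not subject to the umask.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Hypothetical helper: create a directory with a desired permission and
// repair it if the umask made the result stricter than requested.
public static void mkdirsWithPermission(FileSystem fs, Path dir, FsPermission perm)
    throws IOException {
  fs.mkdirs(dir, perm);  // effective permission is perm masked by the umask
  FsPermission actual = fs.getFileStatus(dir).getPermission();
  if (!actual.equals(perm)) {
    fs.setPermission(dir, perm);  // setPermission applies perm verbatim
  }
}
{code}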



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26174) disable rename table across dbs when on different filesystem

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26174?focusedWorklogId=768291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768291
 ]

ASF GitHub Bot logged work on HIVE-26174:
-

Author: ASF GitHub Bot
Created on: 10/May/22 02:48
Start Date: 10/May/22 02:48
Worklog Time Spent: 10m 
  Work Description: adrian-wang commented on PR #3240:
URL: https://github.com/apache/hive/pull/3240#issuecomment-1121829345

   Hi @ayushtkn, in our customers' scenario, they want to keep table data in 
HDFS, while some stale tables are archived to cloud storage. They were using 
this command and assumed the data would be placed in cloud storage, but after a 
while they found it was still in HDFS. Besides, currently if we have two dbs 
located on two different HDFS services, the command would fail. Hence I think 
we should not allow a rename when the two dbs are on different storage systems.




Issue Time Tracking
---

Worklog Id: (was: 768291)
Time Spent: 20m  (was: 10m)

> disable rename table across dbs when on different filesystem
> 
>
> Key: HIVE-26174
> URL: https://issues.apache.org/jira/browse/HIVE-26174
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adrian Wang
>Assignee: Adrian Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, if we run 
> ALTER TABLE db1.table1 RENAME TO db2.table2;
> with `db1` and `db2` on different filesystems, for example `db1` at 
> `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at 
> `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` will end up under 
> the location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange.
> The idea is to ban this kind of operation. We already seem to intend to ban 
> it, but the check was done after the filesystem scheme had been changed, so 
> it was always true.
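
A minimal sketch of the kind of guard this implies (hypothetical names, not the actual patch): compare the two database locations before any scheme rewriting, and reject the rename when they resolve to different filesystems.

{code:java}
import java.net.URI;
import java.util.Objects;

// Hypothetical guard: two locations live on the same filesystem only if both
// scheme and authority match (e.g. hdfs://ns1 vs s3://bucket differ in scheme).
public static boolean onSameFileSystem(URI srcDbLocation, URI dstDbLocation) {
  return Objects.equals(srcDbLocation.getScheme(), dstDbLocation.getScheme())
      && Objects.equals(srcDbLocation.getAuthority(), dstDbLocation.getAuthority());
}
{code}

With the example above, the hdfs: and s3: locations differ in scheme, so the rename would be rejected instead of silently producing hdfs:/s3warehouse/db2.db/table2.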



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25963) Temporary table creation with not null constraint gets converted to external table

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25963?focusedWorklogId=768247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768247
 ]

ASF GitHub Bot logged work on HIVE-25963:
-

Author: ASF GitHub Bot
Created on: 10/May/22 00:20
Start Date: 10/May/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on PR #3040:
URL: https://github.com/apache/hive/pull/3040#issuecomment-1121708249

   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 768247)
Time Spent: 2h 50m  (was: 2h 40m)

> Temporary table creation with not null constraint gets converted to external 
> table 
> ---
>
> Key: HIVE-25963
> URL: https://issues.apache.org/jira/browse/HIVE-25963
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When creating a temporary table with a not null constraint, it gets converted 
> to an external table. For example: 
> create temporary table t2 (a int not null);
> table t2's metadata looks like: 
> {code:java}
> +-------------------------------+------------------------------+--------------------------------------------------------------+
> |           col_name            |          data_type           |                           comment                            |
> +-------------------------------+------------------------------+--------------------------------------------------------------+
> | a                             | int                          |                                                              |
> |                               | NULL                         | NULL                                                         |
> | # Detailed Table Information  | NULL                         | NULL                                                         |
> | Database:                     | default                      | NULL                                                         |
> | OwnerType:                    | USER                         | NULL                                                         |
> | Owner:                        | sourabh                      | NULL                                                         |
> | CreateTime:                   | Tue Feb 15 15:20:13 PST 2022 | NULL                                                         |
> | LastAccessTime:               | UNKNOWN                      | NULL                                                         |
> | Retention:                    | 0                            | NULL                                                         |
> | Location:                     | hdfs://localhost:9000/tmp/hive/sourabh/80d374a8-cd7a-4fcf-ae72-51b04ff9c3d8/_tmp_space.db/4574446d-c144-48f9-b4b6-2e9ee0ce5be4 | NULL |
> | Table Type:                   | EXTERNAL_TABLE               | NULL                                                         |
> | Table Parameters:             | NULL                         | NULL                                                         |
> |                               | COLUMN_STATS_ACCURATE        | {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\"}} |
> |                               | EXTERNAL                     | TRUE                                                         |
> |                               | TRANSLATED_TO_EXTERNAL       | TRUE                                                         |
> |                               | bucketing_version            | 2                                                            |
> |                               | external.table.purge         | TRUE                                                         |
> |                               | numFiles                     | 0                                                            |
> |                               | numRows                      | 0                                                            |
> | 

[jira] [Work logged] (HIVE-25998) Build iceberg modules without a flag

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25998?focusedWorklogId=768245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768245
 ]

ASF GitHub Bot logged work on HIVE-25998:
-

Author: ASF GitHub Bot
Created on: 10/May/22 00:20
Start Date: 10/May/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3068: 
HIVE-25998: Build iceberg modules without a flag
URL: https://github.com/apache/hive/pull/3068




Issue Time Tracking
---

Worklog Id: (was: 768245)
Time Spent: 0.5h  (was: 20m)

> Build iceberg modules without a flag
> 
>
> Key: HIVE-25998
> URL: https://issues.apache.org/jira/browse/HIVE-25998
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We originally introduced a -Piceberg flag for building the iceberg modules.
> Since then, the iceberg modules have stabilised and we would like to have a 
> release, so we should remove the flag now.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25495) Upgrade to JLine3

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25495?focusedWorklogId=768244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768244
 ]

ASF GitHub Bot logged work on HIVE-25495:
-

Author: ASF GitHub Bot
Created on: 10/May/22 00:20
Start Date: 10/May/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3069: 
HIVE-25495: Upgrade JLine to version 3
URL: https://github.com/apache/hive/pull/3069




Issue Time Tracking
---

Worklog Id: (was: 768244)
Time Spent: 2h 50m  (was: 2h 40m)

> Upgrade to JLine3
> -
>
> Key: HIVE-25495
> URL: https://issues.apache.org/jira/browse/HIVE-25495
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> JLine 2 was discontinued a long while ago.  Hadoop uses JLine3, so Hive 
> should match.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-13384) Failed to create HiveMetaStoreClient object with proxy user when Kerberos enabled

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13384?focusedWorklogId=768246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768246
 ]

ASF GitHub Bot logged work on HIVE-13384:
-

Author: ASF GitHub Bot
Created on: 10/May/22 00:20
Start Date: 10/May/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3064: 
[HIVE-13384] HiveMetaStoreClient supports proxy
URL: https://github.com/apache/hive/pull/3064




Issue Time Tracking
---

Worklog Id: (was: 768246)
Time Spent: 1h  (was: 50m)

> Failed to create HiveMetaStoreClient object with proxy user when Kerberos 
> enabled
> -
>
> Key: HIVE-13384
> URL: https://issues.apache.org/jira/browse/HIVE-13384
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I wrote a Java client to talk with HiveMetaStore. (Hive 1.2.0)
> But I found that it can't create a HiveMetaStoreClient object successfully via a 
> proxy user in a Kerberos environment.
> ===
> 15/10/13 00:14:38 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> ==
> While debugging Hive, I found that the error came from the open() method in 
> the HiveMetaStoreClient class.
> Around line 406,
>  transport = UserGroupInformation.getCurrentUser().doAs(new 
> PrivilegedExceptionAction() {  //FAILED, because the current user 
> doesn't have the credential
> But it will work if I change the above line to
>  transport = UserGroupInformation.getCurrentUser().getRealUser().doAs(new 
> PrivilegedExceptionAction() {  //PASS
> I found that DRILL-3413 fixes this error on the Drill side as a workaround. But if I 
> submit a mapreduce job via Pig/HCatalog, it runs into the same issue again 
> when initializing the object via HCatalog.
> It would be better to fix this issue on the Hive side.
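
For reference, a sketch of the usual proxy-user pattern with the Hadoop UGI API (an illustration, not the Hive fix): the doAs must run under a proxy UGI whose real user actually holds the Kerberos TGT.

{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUgiSketch {
  // Sketch under the assumption that the login user holds the TGT and
  // "proxyUser" is authorized via the hadoop.proxyuser.* settings.
  public static void runAsProxy() throws Exception {
    UserGroupInformation realUser = UserGroupInformation.getLoginUser();
    UserGroupInformation ugi = UserGroupInformation.createProxyUser("proxyUser", realUser);
    ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
      // open the Thrift transport / construct the HiveMetaStoreClient here
      return null;
    });
  }
}
{code}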



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26071) JWT authentication for Thrift over HTTP in HiveMetaStore

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26071?focusedWorklogId=768226&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768226
 ]

ASF GitHub Bot logged work on HIVE-26071:
-

Author: ASF GitHub Bot
Created on: 09/May/22 23:40
Start Date: 09/May/22 23:40
Worklog Time Spent: 10m 
  Work Description: hsnusonic commented on code in PR #3233:
URL: https://github.com/apache/hive/pull/3233#discussion_r868525627


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -79,8 +79,10 @@
 import org.apache.hadoop.util.ReflectionUtils;
 import org.apache.hadoop.util.StringUtils;
 import org.apache.http.HttpException;
+import org.apache.http.HttpHeaders;
 import org.apache.http.HttpRequest;
 import org.apache.http.HttpRequestInterceptor;
+import org.apache.http.client.utils.HttpClientUtils;

Review Comment:
   nit: unused import?



##
standalone-metastore/metastore-server/pom.xml:
##
@@ -311,6 +311,22 @@
       <artifactId>curator-test</artifactId>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>com.nimbusds</groupId>
+      <artifactId>nimbus-jose-jwt</artifactId>
+      <version>9.20</version>
+    </dependency>
+    <dependency>
+      <groupId>org.pac4j</groupId>
+      <artifactId>pac4j-core</artifactId>
+      <version>4.5.5</version>
+    </dependency>
+    <dependency>
+      <groupId>com.github.tomakehurst</groupId>
+      <artifactId>wiremock-jre8-standalone</artifactId>
+      <version>2.32.0</version>

Review Comment:
   Never mind. This is only used in test, so I am OK with it.





Issue Time Tracking
---

Worklog Id: (was: 768226)
Time Spent: 6h 40m  (was: 6.5h)

> JWT authentication for Thrift over HTTP in HiveMetaStore
> 
>
> Key: HIVE-26071
> URL: https://issues.apache.org/jira/browse/HIVE-26071
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> HIVE-25575 recently added support for JWT authentication in HS2. This Jira 
> aims to add the same feature to HMS.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26158) TRANSLATED_TO_EXTERNAL partition tables cannot query partition data after rename table

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26158?focusedWorklogId=768222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768222
 ]

ASF GitHub Bot logged work on HIVE-26158:
-

Author: ASF GitHub Bot
Created on: 09/May/22 23:32
Start Date: 09/May/22 23:32
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #3255:
URL: https://github.com/apache/hive/pull/3255#discussion_r867627190


##
ql/src/test/results/clientpositive/llap/translated_external_rename3.q.out:
##
@@ -95,15 +64,17 @@ Retention:  0
 #### A masked pattern was here ####
 Table Type:EXTERNAL_TABLE   
 Table Parameters:   
-   COLUMN_STATS_ACCURATE   
{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"a\":\"true\"}}
+   COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}

Review Comment:
   I'm wondering why the column_stats are missing here when we do a describe on 
the table.



##
ql/src/test/queries/clientpositive/translated_external_rename3.q:
##
@@ -1,26 +1,25 @@
 set 
metastore.metadata.transformer.class=org.apache.hadoop.hive.metastore.MetastoreDefaultTransformer;
-set metastore.metadata.transformer.location.mode=force;

Review Comment:
   This is also working in force mode. Can you please give some info about 
why you had to change it to seqprefix?



##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java:
##
@@ -244,6 +244,13 @@ public static boolean isExternalTable(Table table) {
 return isExternal(params);
   }
 
+  public static boolean isTranslatedToExternalTable(Table table) {
+    Map<String, String> p = table.getParameters();

Review Comment:
   Can you change the name of this variable 'p' to something more meaningful 
like tblProperties?





Issue Time Tracking
---

Worklog Id: (was: 768222)
Time Spent: 20m  (was: 10m)

> TRANSLATED_TO_EXTERNAL partition tables cannot query partition data after 
> rename table
> --
>
> Key: HIVE-26158
> URL: https://issues.apache.org/jira/browse/HIVE-26158
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: tanghui
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: metastore_translator, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> After the patch is applied, the partitioned table's location and the HDFS data 
> directory are displayed normally, but the partition locations of the table in 
> the SDS table of the Hive metastore database still point to the location of 
> the old table, resulting in no data when querying a partition.
>  
> in beeline:
> 
> set hive.create.as.external.legacy=true;
> CREATE TABLE part_test(
> c1 string
> ,c2 string
> )PARTITIONED BY (dat string)
> insert into part_test values ("11","th","20220101")
> insert into part_test values ("22","th","20220102")
> alter table part_test rename to part_test11;
> --this result is null.
> select * from part_test11 where dat="20220101";
> ||part_test.c1||part_test.c2||part_test.dat||
> | | | |
> -
> SDS in the Hive metabase:
> select SDS.LOCATION from TBLS,SDS where TBLS.TBL_NAME="part_test11" AND 
> TBLS.TBL_ID=SDS.CD_ID;
> ---
> |*LOCATION*|
> |hdfs://nameservice1/warehouse/tablespace/external/hive/part_test11|
> |hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220101|
> |hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220102|
> ---
>  
> We need to modify the partition locations of the table in SDS to ensure that 
> the query results are correct.
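
A sketch of the repair the description implies (hypothetical helper, not the actual fix): on rename, each partition's storage descriptor location has to be re-pointed from the old table path to the new one, otherwise SDS keeps the stale locations shown above.

{code:java}
import java.util.List;
import org.apache.hadoop.hive.metastore.api.Partition;

// Hypothetical repair sketch: rewrite each partition SD location that still
// starts with the old table path so it points under the renamed table.
public static void relocatePartitions(List<Partition> parts, String oldBase, String newBase) {
  for (Partition p : parts) {
    String loc = p.getSd().getLocation();
    if (loc != null && loc.startsWith(oldBase)) {
      p.getSd().setLocation(newBase + loc.substring(oldBase.length()));
    }
  }
}
{code}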



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26215) Expose the MIN_HISTORY_LEVEL table through Hive sys database

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26215:
--
Labels: pull-request-available  (was: )

>  Expose the MIN_HISTORY_LEVEL table  through Hive sys database 
> ---
>
> Key: HIVE-26215
> URL: https://issues.apache.org/jira/browse/HIVE-26215
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While we still (partially) use MIN_HISTORY_LEVEL for the cleaner, we should 
> expose it as a sys table so we can see what might be blocking the Cleaner 
> thread.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26215) Expose the MIN_HISTORY_LEVEL table through Hive sys database

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26215?focusedWorklogId=768215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768215
 ]

ASF GitHub Bot logged work on HIVE-26215:
-

Author: ASF GitHub Bot
Created on: 09/May/22 23:07
Start Date: 09/May/22 23:07
Worklog Time Spent: 10m 
  Work Description: simhadri-g opened a new pull request, #3275:
URL: https://github.com/apache/hive/pull/3275

   
   
   ### What changes were proposed in this pull request?
   Expose the MIN_HISTORY_LEVEL table through Hive sys database 
   
   
   
   ### Why are the changes needed?
   While we still (partially) use MIN_HISTORY_LEVEL for the cleaner, we should 
expose it as a sys table so we can see what might be blocking the Cleaner 
thread.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, users will be able to check MIN_HISTORY_LEVEL through sys db.
   
   
   
   ### How was this patch tested?
   q test and schema tool.
   
   




Issue Time Tracking
---

Worklog Id: (was: 768215)
Remaining Estimate: 0h
Time Spent: 10m

>  Expose the MIN_HISTORY_LEVEL table  through Hive sys database 
> ---
>
> Key: HIVE-26215
> URL: https://issues.apache.org/jira/browse/HIVE-26215
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While we still (partially) use MIN_HISTORY_LEVEL for the cleaner, we should 
> expose it as a sys table so we can see what might be blocking the Cleaner 
> thread.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26215) Expose the MIN_HISTORY_LEVEL table through Hive sys database

2022-05-09 Thread Simhadri G (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri G reassigned HIVE-26215:
-


>  Expose the MIN_HISTORY_LEVEL table  through Hive sys database 
> ---
>
> Key: HIVE-26215
> URL: https://issues.apache.org/jira/browse/HIVE-26215
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>
> While we still (partially) use MIN_HISTORY_LEVEL for the cleaner, we should 
> expose it as a sys table so we can see what might be blocking the Cleaner 
> thread.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26009) Determine number of buckets for implicitly bucketed ACIDv2 tables

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26009?focusedWorklogId=768207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-768207
 ]

ASF GitHub Bot logged work on HIVE-26009:
-

Author: ASF GitHub Bot
Created on: 09/May/22 22:10
Start Date: 09/May/22 22:10
Worklog Time Spent: 10m 
  Work Description: simhadri-g closed pull request #3224: HIVE-26009: 
Determine number of buckets for implicitly bucketed ACIDv…
URL: https://github.com/apache/hive/pull/3224




Issue Time Tracking
---

Worklog Id: (was: 768207)
Time Spent: 20m  (was: 10m)

> Determine number of buckets for implicitly bucketed ACIDv2 tables 
> --
>
> Key: HIVE-26009
> URL: https://issues.apache.org/jira/browse/HIVE-26009
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive tries to set the number of reducers equal to the number of buckets here: 
> [https://github.com/apache/hive/blob/9857c4e584384f7b0a49c34bc2bdf876c2ea1503/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L6958]
>  
> The numberOfBuckets for implicitly bucketed tables is set to -1 by default. 
> When this is the case, it is left to Hive to estimate the number of reducers 
> required for the job, based on the job input and configuration parameters.
> [https://github.com/apache/hive/blob/9857c4e584384f7b0a49c34bc2bdf876c2ea1503/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L3369]
>  
> This estimate is not optimal in all cases. In the worst case, it can result 
> in a single reducer being launched, which can lead to a significant 
> performance bottleneck.
>  
> Ideally, the number of reducers launched should equal the number of buckets, 
> as is the case for explicitly bucketed tables.
>  
>  
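
As an illustration, a sketch of that intent (hypothetical names, not the actual SemanticAnalyzer code):

{code:java}
// Sketch of the proposed behaviour: prefer the bucket count when it is known,
// and fall back to the size-based estimate only when numBuckets == -1
// (implicitly bucketed). estimateNumberOfReducers is a hypothetical stand-in
// for the logic in Utilities.java linked above.
int numBuckets = table.getNumBuckets();              // -1 for implicitly bucketed tables
int numReducers = numBuckets > 0
    ? numBuckets                                     // one reducer per bucket
    : estimateNumberOfReducers(conf, inputSummary);  // size/config based estimate
{code}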



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26154) Upgrade cron-utils to 9.1.6 for branch-3

2022-05-09 Thread Asif Saleh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Asif Saleh updated HIVE-26154:
--
Fix Version/s: 3.1.3

> Upgrade cron-utils to 9.1.6 for branch-3
> 
>
> Key: HIVE-26154
> URL: https://issues.apache.org/jira/browse/HIVE-26154
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Affects Versions: 3.1.3
>Reporter: Asif Saleh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.3
>
>
> To fix the [CVE-2021-41269|https://nvd.nist.gov/vuln/detail/CVE-2021-41269] issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26154) Upgrade cron-utils to 9.1.6 for branch-3

2022-05-09 Thread Asif Saleh (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534023#comment-17534023
 ] 

Asif Saleh commented on HIVE-26154:
---

[~ngangam] The PR you merged was on the master branch. Can you please make the 
change on the 3.1 branch?

> Upgrade cron-utils to 9.1.6 for branch-3
> 
>
> Key: HIVE-26154
> URL: https://issues.apache.org/jira/browse/HIVE-26154
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Affects Versions: 3.1.3
>Reporter: Asif Saleh
>Priority: Major
>  Labels: pull-request-available
>
> To fix the [CVE-2021-41269|https://nvd.nist.gov/vuln/detail/CVE-2021-41269] issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26213) "hive.limit.pushdown.memory.usage" better not be equal to 1.0, otherwise it will raise an error

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu updated HIVE-26213:
---
Description: 
In hive-default.xml.template
{code:java}

<property>
  <name>hive.limit.pushdown.memory.usage</name>
  <value>0.1</value>
  <description>
    Expects value between 0.0f and 1.0f.
    The fraction of available memory to be used for buffering rows in
    Reducesink operator for limit pushdown optimization.
  </description>
</property>
{code}
Based on the description in hive-default.xml.template, 
hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0. Setting 
hive.limit.pushdown.memory.usage to 1.0 means that all of the available memory 
may be used for buffering rows for the limit pushdown optimization, and 
hiveserver2 still starts successfully.

Then I used the Java API to write a program that establishes a JDBC connection 
as a client to access Hive, using JDBCDemo as an example.
{code:java}
import demo.utils.JDBCUtils;

public class JDBCDemo {
  public static void main(String[] args) throws Exception {
    JDBCUtils.init();
    JDBCUtils.createDatabase();
    JDBCUtils.showDatabases();
    JDBCUtils.createTable();
    JDBCUtils.showTables();
    JDBCUtils.descTable();
    JDBCUtils.loadData();
    JDBCUtils.selectData();
    JDBCUtils.countData();
    JDBCUtils.dropDatabase();
    JDBCUtils.dropTable();
    JDBCUtils.destory();
  }
}
{code}
After running the client program, both the client and the hiveserver throw 
exceptions.
{code:java}
2022-05-09 19:05:36: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 67a6db8d-f957-4d5d-ac18-28403adab7f3
Hive Session ID = f9f8772c-5765-4c3e-bcff-ca605c667be7
OK
OK
OK
OK
OK
OK
OK
Loading data to table default.emp
OK
FAILED: SemanticException Invalid memory usage value 1.0 for 
hive.limit.pushdown.memory.usage{code}
{code:java}
liky@ljq1:~/hive_jdbc_test$ ./startJDBC_0.sh 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/liky/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.1/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/liky/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Running: drop database if exists hive_jdbc_test
Running: create database hive_jdbc_test
Running: show databases
default
hive_jdbc_test
Running: drop table if exists emp
Running: create table emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t'
Running: show tables
emp
Running: desc emp
empno   int
ename   string
job     string
mgr     int
hiredate       string
sal     double
comm   double
deptno int
Running: load data local inpath '/home/liky/hiveJDBCTestData/data.txt' 
overwrite into table emp
Running: select * from emp
Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: Error 
while compiling statement: FAILED: SemanticException Invalid memory usage value 
1.0 for hive.limit.pushdown.memory.usage
      at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:380)
      at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:366)
      at 
org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:354)
      at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:293)
      at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:509)
      at demo.utils.JDBCUtils.selectData(JDBCUtils.java:98)
      at demo.test.JDBCDemo.main(JDBCDemo.java:19){code}
Setting hive.limit.pushdown.memory.usage to 0.0 raises no exception.

So, setting hive.limit.pushdown.memory.usage to 1.0 is not desirable. 
*hive-default.xml.template is not clear enough about the boundaries of the 
value; it would be better to state the valid interval explicitly, namely 
[0.0,1.0).*
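
A validation sketch consistent with the error message observed above (the accessor and placement are assumptions, not taken from the Hive source):

{code:java}
import org.apache.hadoop.hive.ql.parse.SemanticException;

// Hypothetical validation sketch matching the observed error: accept only
// values in [0.0, 1.0). The plain Configuration.getFloat accessor is an
// assumption, not necessarily what Hive uses internally.
float usage = conf.getFloat("hive.limit.pushdown.memory.usage", 0.1f);
if (usage < 0.0f || usage >= 1.0f) {
  throw new SemanticException(
      "Invalid memory usage value " + usage + " for hive.limit.pushdown.memory.usage");
}
{code}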

  was:
In hive-default.xml.template
{code:java}

<property>
  <name>hive.limit.pushdown.memory.usage</name>
  <value>0.1</value>
  <description>
    Expects value between 0.0f and 1.0f.
    The fraction of available memory to be used for buffering rows in
    Reducesink operator for limit pushdown optimization.
  </description>
</property>
{code}
Based on the description of hive-default.xml.template, 
hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0, setting 
hive.limit.pushdown.memory.usage to 1.0 means that it expects the available 
memory of 

[jira] [Updated] (HIVE-26213) "hive.limit.pushdown.memory.usage" better not be equal to 1.0, otherwise it will raise an error

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu updated HIVE-26213:
---
Description: 
In hive-default.xml.template
{code:java}

<property>
  <name>hive.limit.pushdown.memory.usage</name>
  <value>0.1</value>
  <description>
    Expects value between 0.0f and 1.0f.
    The fraction of available memory to be used for buffering rows in
    Reducesink operator for limit pushdown optimization.
  </description>
</property>
{code}
Based on the description in hive-default.xml.template, 
hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0. Setting 
hive.limit.pushdown.memory.usage to 1.0 means that all of the available memory 
may be used for buffering rows for the limit pushdown optimization, and 
hiveserver2 still starts successfully.

 

Then I used the Java API to write a program that establishes a JDBC connection 
as a client to access Hive, using JDBCDemo as an example.
{code:java}
import demo.utils.JDBCUtils;

public class JDBCDemo {
  public static void main(String[] args) throws Exception {
    JDBCUtils.init();
    JDBCUtils.createDatabase();
    JDBCUtils.showDatabases();
    JDBCUtils.createTable();
    JDBCUtils.showTables();
    JDBCUtils.descTable();
    JDBCUtils.loadData();
    JDBCUtils.selectData();
    JDBCUtils.countData();
    JDBCUtils.dropDatabase();
    JDBCUtils.dropTable();
    JDBCUtils.destory();
  }
}
{code}
After running the client program, both the client and the hiveserver throw 
exceptions.
{code:java}
2022-05-09 19:05:36: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 67a6db8d-f957-4d5d-ac18-28403adab7f3
Hive Session ID = f9f8772c-5765-4c3e-bcff-ca605c667be7
OK
OK
OK
OK
OK
OK
OK
Loading data to table default.emp
OK
FAILED: SemanticException Invalid memory usage value 1.0 for 
hive.limit.pushdown.memory.usage{code}
{code:java}
liky@ljq1:~/hive_jdbc_test$ ./startJDBC_0.sh 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/liky/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.1/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/liky/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Running: drop database if exists hive_jdbc_test
Running: create database hive_jdbc_test
Running: show databases
default
hive_jdbc_test
Running: drop table if exists emp
Running: create table emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t'
Running: show tables
emp
Running: desc emp
empno   int
ename   string
job     string
mgr     int
hiredate       string
sal     double
comm   double
deptno int
Running: load data local inpath '/home/liky/hiveJDBCTestData/data.txt' 
overwrite into table emp
Running: select * from emp
Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: Error 
while compiling statement: FAILED: SemanticException Invalid memory usage value 
1.0 for hive.limit.pushdown.memory.usage
      at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:380)
      at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:366)
      at 
org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:354)
      at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:293)
      at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:509)
      at demo.utils.JDBCUtils.selectData(JDBCUtils.java:98)
      at demo.test.JDBCDemo.main(JDBCDemo.java:19){code}
Setting hive.limit.pushdown.memory.usage to 0.0 raises no exception.

So, setting hive.limit.pushdown.memory.usage to 1.0 is 

[jira] [Updated] (HIVE-26213) "hive.limit.pushdown.memory.usage" better not be equal to 1.0, otherwise it will raise an error

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu updated HIVE-26213:
---
Description: 
In hive-default.xml.template

{code:java}
<property>
  <name>hive.limit.pushdown.memory.usage</name>
  <value>0.1</value>
  <description>
    Expects value between 0.0f and 1.0f.
    The fraction of available memory to be used for buffering rows in
    Reducesink operator for limit pushdown optimization.
  </description>
</property>
{code}
Based on the description in hive-default.xml.template, 
hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0. Setting 
hive.limit.pushdown.memory.usage to 1.0 means that all of the available memory 
may be used for buffering rows for the limit pushdown optimization, and 
hiveserver2 still starts successfully.

Then I used the Java API to write a program that establishes a JDBC connection 
as a client to access Hive, using JDBCDemo as an example.
{code:java}
import demo.utils.JDBCUtils;

public class JDBCDemo {
  public static void main(String[] args) throws Exception {
    JDBCUtils.init();
    JDBCUtils.createDatabase();
    JDBCUtils.showDatabases();
    JDBCUtils.createTable();
    JDBCUtils.showTables();
    JDBCUtils.descTable();
    JDBCUtils.loadData();
    JDBCUtils.selectData();
    JDBCUtils.countData();
    JDBCUtils.dropDatabase();
    JDBCUtils.dropTable();
    JDBCUtils.destory();
  }
}
{code}
After running the client program, both the client and the hiveserver throw 
exceptions.
{code:java}
2022-05-09 19:05:36: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 67a6db8d-f957-4d5d-ac18-28403adab7f3
Hive Session ID = f9f8772c-5765-4c3e-bcff-ca605c667be7
OK
OK
OK
OK
OK
OK
OK
Loading data to table default.emp
OK
FAILED: SemanticException Invalid memory usage value 1.0 for 
hive.limit.pushdown.memory.usage
{code}
{code:java}
liky@ljq1:~/hive_jdbc_test$ ./startJDBC_0.sh 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/liky/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.1/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/liky/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Running: drop database if exists hive_jdbc_test
Running: create database hive_jdbc_test
Running: show databases
default
hive_jdbc_test
Running: drop table if exists emp
Running: create table emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t'
Running: show tables
emp
Running: desc emp
empno   int
ename   string
job     string
mgr     int
hiredate        string
sal     double
comm    double
deptno  int
Running: load data local inpath '/home/liky/hiveJDBCTestData/data.txt' 
overwrite into table emp
Running: select * from emp
Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: Error 
while compiling statement: FAILED: SemanticException Invalid memory usage value 
1.0 for hive.limit.pushdown.memory.usage
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:380)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:366)
        at 
org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:354)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:293)
        at 
org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:509)
        at demo.utils.JDBCUtils.selectData(JDBCUtils.java:98)
        at demo.test.JDBCDemo.main(JDBCDemo.java:19)
{code}
Setting hive.limit.pushdown.memory.usage to 0.0 raises no exception.

So, setting hive.limit.pushdown.memory.usage to 1.0 is not desirable. 
*hive-default.xml.template is not clear enough about the boundaries of the 
value; it would be better to state the valid interval explicitly, namely 
[0.0,1.0).*

  was:
In hive-default.xml.template
{code:java}
<property>
  <name>hive.limit.pushdown.memory.usage</name>
  <value>0.1</value>
  <description>
    Expects value between 0.0f and 1.0f.
    The fraction of available memory to be used for buffering rows in
    Reducesink operator for limit pushdown optimization.
  </description>
</property>
{code}

Based on the description of hive-default.xml.template, 
hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0, setting 
hive.limit.pushdown.memory.usage to 1.0 means that it expects the available 
memory of all buffered 

[jira] [Assigned] (HIVE-26213) "hive.limit.pushdown.memory.usage" better not be equal to 1.0, otherwise it will raise an error

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu reassigned HIVE-26213:
--


> "hive.limit.pushdown.memory.usage" better not be equal to 1.0, otherwise it 
> will raise an error
> ---
>
> Key: HIVE-26213
> URL: https://issues.apache.org/jira/browse/HIVE-26213
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
> Environment: Hive 3.1.2
> os.name=Linux
> os.arch=amd64
> os.version=5.4.0-72-generic
> java.version=1.8.0_162
> java.vendor=Oracle Corporation
>Reporter: Jingxuan Fu
>Assignee: Jingxuan Fu
>Priority: Major
>
> In hive-default.xml.template
> {code:java}
> <property>
>   <name>hive.limit.pushdown.memory.usage</name>
>   <value>0.1</value>
>   <description>
>     Expects value between 0.0f and 1.0f.
>     The fraction of available memory to be used for buffering rows in
>     Reducesink operator for limit pushdown optimization.
>   </description>
> </property>
> {code}
>  
>  
> Based on the description in hive-default.xml.template, 
> hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0. Setting 
> hive.limit.pushdown.memory.usage to 1.0 means that all of the available 
> memory may be used for buffering rows for the limit pushdown optimization, 
> and hiveserver2 still starts successfully.
> Then I used the Java API to write a program that establishes a JDBC 
> connection as a client to access Hive, using JDBCDemo as an example.
>  
> {code:java}
> import demo.utils.JDBCUtils;
> 
> public class JDBCDemo {
>   public static void main(String[] args) throws Exception {
>     JDBCUtils.init();
>     JDBCUtils.createDatabase();
>     JDBCUtils.showDatabases();
>     JDBCUtils.createTable();
>     JDBCUtils.showTables();
>     JDBCUtils.descTable();
>     JDBCUtils.loadData();
>     JDBCUtils.selectData();
>     JDBCUtils.countData();
>     JDBCUtils.dropDatabase();
>     JDBCUtils.dropTable();
>     JDBCUtils.destory();
>   }
> }
> {code}
> After running the client program, both the client and the hiveserver throw 
> exceptions.
>  
> {code:java}
> 2022-05-09 19:05:36: Starting HiveServer2
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Hive Session ID = 67a6db8d-f957-4d5d-ac18-28403adab7f3
> Hive Session ID = f9f8772c-5765-4c3e-bcff-ca605c667be7
> OK
> OK
> OK
> OK
> OK
> OK
> OK
> Loading data to table default.emp
> OK
> FAILED: SemanticException Invalid memory usage value 1.0 for 
> hive.limit.pushdown.memory.usage{code}
>  
>  
>  
> {code:java}
> liky@ljq1:~/hive_jdbc_test$ ./startJDBC_0.sh 
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/liky/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.1/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/liky/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Running: drop database if exists hive_jdbc_test
> Running: create database hive_jdbc_test
> Running: show databases
> default
> hive_jdbc_test
> Running: drop table if exists emp
> Running: create table emp(
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate string,
> sal double,
> comm double,
> deptno int
> )
> row format delimited fields terminated by '\t'
> Running: show tables
> emp
> Running: desc emp
> empno   int
> ename   string
> job     string
> mgr     int
> hiredate        string
> sal     double
> comm    double
> deptno  int
> Running: load data local inpath '/home/liky/hiveJDBCTestData/data.txt' overwrite into table emp
> Running: select * from emp
> Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException Invalid memory usage value 1.0 for hive.limit.pushdown.memory.usage
>         at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:380)
>         at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:366)
>         at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:354)
>         at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:293)
>         at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:509)
>         at demo.utils.JDBCUtils.selectData(JDBCUtils.java:98)
>         at demo.test.JDBCDemo.main(JDBCDemo.java:19){code}
>  
>  
> Setting hive.limit.pushdown.memory.usage 

[jira] [Updated] (HIVE-26211) "hive.server2.webui.max.historic.queries" should be avoided to be set too large, otherwise it will cause blocking

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu updated HIVE-26211:
---
Description: 
In hive-default.xml.template
{code:java}

<property>
  <name>hive.server2.webui.max.historic.queries</name>
  <value>25</value>
  <description>The maximum number of past queries to show in HiverSever2 WebUI.</description>
</property>
{code}
Set hive.server2.webui.max.historic.queries to a relatively large value, take 
2000 as an example, and start hiveserver2. It starts normally and logs no 
exceptions.
{code:java}
liky@ljq1:/usr/local/hive/conf$ hiveserver2 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-05-09 20:03:41: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 0b419706-4026-4a8b-80fe-b79fecbccd4f
Hive Session ID = 0f9e28d7-0081-4b2f-a743-4093c38c152d{code}
Next, use beeline as a client to connect to hive and send a request for 
database-related operations, for example querying all the databases. After 
successfully executing "show databases", beeline blocks and no other 
operations can be performed.
{code:java}
liky@ljq1:/opt/hive$ beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:hive2://192.168.1.194:1/default
Connecting to jdbc:hive2://192.168.1.194:1/default
Enter username for jdbc:hive2://192.168.1.194:1/default: hive
Enter password for jdbc:hive2://192.168.1.194:1/default: *
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.1.194:1/default> show databases
. . . . . . . . . . . . . . . . . . . . . .> ;
INFO : Compiling 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show 
databases
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from 
deserializer)], properties:null)
INFO : Completed compiling 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time 
taken: 0.393 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show 
databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time 
taken: 0.109 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager


+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 row selected (1.374 seconds)
{code}
Also, on the hiveserver side, a runtime null pointer exception is thrown, 
while the log shows no warnings or errors.
{code:java}
liky@ljq1:/usr/local/hive/conf$ hiveserver2 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Error: Could not find or load main class 
org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 

[jira] [Updated] (HIVE-26211) "hive.server2.webui.max.historic.queries" should be avoided to be set too large, otherwise it will cause blocking

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu updated HIVE-26211:
---
Description: 
In hive-default.xml.template
{code:java}

<property>
  <name>hive.server2.webui.max.historic.queries</name>
  <value>25</value>
  <description>The maximum number of past queries to show in HiverSever2 WebUI.</description>
</property>
{code}
Set hive.server2.webui.max.historic.queries to a relatively large value, take 
2000 as an example, and start hiveserver2. It starts normally and logs no 
exceptions.
{code:java}
liky@ljq1:/usr/local/hive/conf$ hiveserver2 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-05-09 20:03:41: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 0b419706-4026-4a8b-80fe-b79fecbccd4f
Hive Session ID = 0f9e28d7-0081-4b2f-a743-4093c38c152d{code}
Next, use beeline as a client to connect to hive and send a request for 
database-related operations, for example querying all the databases. After 
successfully executing "show databases", beeline blocks and no other 
operations can be performed.
{code:java}
liky@ljq1:/opt/hive$ beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:hive2://192.168.1.194:1/default
Connecting to jdbc:hive2://192.168.1.194:1/default
Enter username for jdbc:hive2://192.168.1.194:1/default: hive
Enter password for jdbc:hive2://192.168.1.194:1/default: *
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.1.194:1/default> show databases
. . . . . . . . . . . . . . . . . . . . . .> ;
INFO : Compiling 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show 
databases
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from 
deserializer)], properties:null)
INFO : Completed compiling 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time 
taken: 0.393 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show 
databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time 
taken: 0.109 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager


+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 row selected (1.374 seconds)
{code}
Also, on the hiveserver side, a runtime null pointer exception is thrown, 
while the log shows no warnings or errors.
{code:java}
liky@ljq1:/usr/local/hive/conf$ hiveserver2 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Error: Could not find or load main class 
org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 

[jira] [Updated] (HIVE-26211) "hive.server2.webui.max.historic.queries" should be avoided to be set too large, otherwise it will cause blocking

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu updated HIVE-26211:
---
Description: 
In hive-default.xml.template
{code:java}

<property>
  <name>hive.server2.webui.max.historic.queries</name>
  <value>25</value>
  <description>The maximum number of past queries to show in HiverSever2 WebUI.</description>
</property>
{code}
Set hive.server2.webui.max.historic.queries to a relatively large value, take 
2000 as an example, and start hiveserver2. It starts normally and logs no 
exceptions.
{code:java}
liky@ljq1:/usr/local/hive/conf$ hiveserver2 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-05-09 20:03:41: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 0b419706-4026-4a8b-80fe-b79fecbccd4f
Hive Session ID = 0f9e28d7-0081-4b2f-a743-4093c38c152d{code}
 

Next, use beeline as a client to connect to hive and send a request for 
database-related operations, for example querying all the databases. After 
successfully executing "show databases", beeline blocks and no other 
operations can be performed.
{code:java}
liky@ljq1:/opt/hive$ beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:hive2://192.168.1.194:1/default
Connecting to jdbc:hive2://192.168.1.194:1/default
Enter username for jdbc:hive2://192.168.1.194:1/default: hive
Enter password for jdbc:hive2://192.168.1.194:1/default: *
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://192.168.1.194:1/default> show databases
. . . . . . . . . . . . . . . . . . . . . .> ;
INFO : Compiling 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show 
databases
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from 
deserializer)], properties:null)
INFO : Completed compiling 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time 
taken: 0.393 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show 
databases
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time 
taken: 0.109 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager


+----------------+
| database_name  |
+----------------+
| default        |
+----------------+
1 row selected (1.374 seconds)
{code}
Also, a runtime null pointer exception is thrown on the hiveserver side, yet 
observing the log shows no warnings or errors.
{code:java}
liky@ljq1:/usr/local/hive/conf$ hiveserver2 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Error: Could not find or load main class 
org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 

[jira] [Updated] (HIVE-26212) hive fetch data timeout

2022-05-09 Thread royal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

royal updated HIVE-26212:
-
Description: 
When I fetch data from Hive, the following error message appears; I think it is 
related to the size of the data.

 

2022-05-09 19:28:17,156 INFO org.apache.hadoop.mapred.FileInputFormat: [HiveServer2-Handler-Pool: Thread-773525]: Total input paths to process : 47751
2022-05-09 19:30:19,729 WARN org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Handler-Pool: Thread-773521]: HiveConf of name hive.server2.idle.session.timeout_check_operation does not exist
2022-05-09 19:30:19,729 WARN org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Handler-Pool: Thread-773521]: HiveConf of name hive.sentry.conf.url does not exist
2022-05-09 19:30:19,729 WARN org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Handler-Pool: Thread-773521]: HiveConf of name hive.entity.capture.input.URI does not exist
2022-05-09 19:30:19,733 INFO org.apache.hadoop.hive.ql.exec.ListSinkOperator: [HiveServer2-Handler-Pool: Thread-773521]: 749375 finished. closing...
2022-05-09 19:30:19,733 INFO org.apache.hadoop.hive.ql.exec.ListSinkOperator: [HiveServer2-Handler-Pool: Thread-773521]: 749375 Close done
2022-05-09 19:30:19,733 INFO org.apache.hadoop.hive.ql.exec.ListSinkOperator: [HiveServer2-Handler-Pool: Thread-773521]: Initializing Self OP[749375]
2022-05-09 19:30:19,733 INFO org.apache.hadoop.hive.ql.exec.ListSinkOperator: [HiveServer2-Handler-Pool: Thread-773521]: Operator 749375 OP initialized
2022-05-09 19:30:19,733 INFO org.apache.hadoop.hive.ql.exec.ListSinkOperator: [HiveServer2-Handler-Pool: Thread-773521]: Initialization Done 749375 OP
2022-05-09 19:30:19,741 WARN org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-773525]: Error fetching results:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:463)
at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:294)
at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:769)
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:462)
at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:694)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:706)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2071)
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:458)
... 13 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextPath(FetchOperator.java:255)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:350)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:295)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:446)
... 17 more
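The innermost frames point at FetchOperator.getNextPath. As a hedged illustration only, a simplified, hypothetical path iterator (not the actual Hive code) shows the defensive pattern that avoids this kind of NPE:
{code:java}
import java.util.Iterator;

// Hypothetical sketch: return null instead of dereferencing a missing or exhausted iterator.
class PathFetcher {
  private final Iterator<String> paths;

  PathFetcher(Iterator<String> paths) {
    this.paths = paths;
  }

  /** Returns the next input path, or null when there is no more input. */
  String getNextPath() {
    if (paths == null || !paths.hasNext()) {
      return null; // caller must treat null as "no more input", not as an error
    }
    return paths.next();
  }
}
{code}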

> hive fetch data timeout
> ---
>
> Key: HIVE-26212
> URL: https://issues.apache.org/jira/browse/HIVE-26212
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: royal
>Priority: Major
>
> When I fetch data from Hive, the following error message appears; I think 
> it is related to the size of the data
>  
> 2022-05-09 19:28:17,156 INFO org.apache.hadoop.mapred.FileInputFormat: 
> [HiveServer2-Handler-Pool: Thread-773525]: Total input paths to process : 
> 47751
> 2022-05-09 19:30:19,729 WARN org.apache.hadoop.hive.conf.HiveConf: 
> [HiveServer2-Handler-Pool: Thread-773521]: HiveConf of name 
> hive.server2.idle.session.timeout_check_operation does not exist
> 2022-05-09 19:30:19,729 WARN 


[jira] [Assigned] (HIVE-26211) "hive.server2.webui.max.historic.queries" should be avoided to be set too large, otherwise it will cause blocking

2022-05-09 Thread Jingxuan Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingxuan Fu reassigned HIVE-26211:
--


> "hive.server2.webui.max.historic.queries" should be avoided to be set too 
> large, otherwise it will cause blocking
> -
>
> Key: HIVE-26211
> URL: https://issues.apache.org/jira/browse/HIVE-26211
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
> Environment: Hive 3.1.2
> os.name=Linux
> os.arch=amd64
> os.version=5.4.0-72-generic
> java.version=1.8.0_162
> java.vendor=Oracle Corporation
>Reporter: Jingxuan Fu
>Assignee: Jingxuan Fu
>Priority: Major
>
> In hive-default.xml.template
> <property>
>     <name>hive.server2.webui.max.historic.queries</name>
>     <value>25</value>
>     <description>The maximum number of past queries to show in HiverSever2 
> WebUI.</description>
>   </property>
> Set hive.server2.webui.max.historic.queries to a relatively large value (take 
> 2000 as an example) and start hiveserver2; the server starts normally and 
> logs no exceptions.
> liky@ljq1:/usr/local/hive/conf$ hiveserver2 
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> 2022-05-09 20:03:41: Starting HiveServer2
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Hive Session ID = 0b419706-4026-4a8b-80fe-b79fecbccd4f
> Hive Session ID = 0f9e28d7-0081-4b2f-a743-4093c38c152d
> Next, use beeline as a client to connect to Hive and issue a database-related 
> operation, for example querying all the databases: after successfully 
> executing "show databases", beeline blocks and no further operations can be 
> performed.
> liky@ljq1:/opt/hive$ beeline
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Beeline version 3.1.2 by Apache Hive
> beeline> !connect jdbc:hive2://192.168.1.194:1/default
> Connecting to jdbc:hive2://192.168.1.194:1/default
> Enter username for jdbc:hive2://192.168.1.194:1/default: hive
> Enter password for jdbc:hive2://192.168.1.194:1/default: *
> Connected to: Apache Hive (version 3.1.2)
> Driver: Hive JDBC (version 3.1.2)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://192.168.1.194:1/default> show databases
> . . . . . . . . . . . . . . . . . . . . . .> ;
> INFO  : Compiling 
> command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): 
> show databases
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, 
> comment:from deserializer)], properties:null)
> INFO  : Completed compiling 
> command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); 
> Time taken: 0.393 seconds
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Executing 
> command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): 
> show databases
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); 
> Time taken: 0.109 seconds
> INFO  : OK
> INFO  : Concurrency mode is disabled, not creating a lock manager
> +----------------+
> | database_name  |
> +----------------+
> | default        |
> +----------------+
> 1 row selected (1.374 seconds)
> Also, on the hiveserver side, a runtime null pointer exception is thrown, and 
> the 

[jira] [Work logged] (HIVE-26205) Remove the incorrect org.slf4j dependency in kafka-handler

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26205?focusedWorklogId=767877=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767877
 ]

ASF GitHub Bot logged work on HIVE-26205:
-

Author: ASF GitHub Bot
Created on: 09/May/22 11:55
Start Date: 09/May/22 11:55
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on PR #3272:
URL: https://github.com/apache/hive/pull/3272#issuecomment-1121002974

   @pvary @deniskuzZ : Could you please review this PR?
   
   The project failed to compile on my new machine with `maven 3.8.5`. I think 
this dependency declaration is redundant, since it can be inherited from the 
parent pom.
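   For context, the files that fail to compile use the standard slf4j pattern sketched below (the class name here is hypothetical, not one of the real handler classes); it only needs org.slf4j on the compile classpath, which the parent pom can already provide:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical example class; the real kafka-handler classes follow the same pattern.
class KafkaHandlerLoggingExample {
  private static final Logger LOG = LoggerFactory.getLogger(KafkaHandlerLoggingExample.class);

  void run() {
    // Compiles as long as org.slf4j is on the classpath, e.g. inherited from the parent pom.
    LOG.info("kafka-handler picks up slf4j from the parent pom");
  }
}
{code}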




Issue Time Tracking
---

Worklog Id: (was: 767877)
Time Spent: 20m  (was: 10m)

> Remove the incorrect org.slf4j dependency in kafka-handler
> --
>
> Key: HIVE-26205
> URL: https://issues.apache.org/jira/browse/HIVE-26205
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I get a compile error while executing:
> {code:bash}
> mvn clean install -DskipTests
> {code}
> The error message is:
> {code:bash}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile 
> (default-compile) on project kafka-handler: Compilation failure: Compilation 
> failure: 
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[53,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[54,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[73,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaStorageHandler
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[37,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[47,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class 
> org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaJsonSerDe.java:[63,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[35,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[50,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.SimpleKafkaWriter
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[34,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[43,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaOutputFormat
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[24,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[34,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.RetryUtils
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[51,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[65,24]
>  cannot find symbol
> [ERROR]   symbol:   class Logger
> [ERROR]   location: class org.apache.hadoop.hive.kafka.KafkaScanTrimmer
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[45,17]
>  package org.slf4j does not exist
> [ERROR] 
> /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[65,24]
>  

[jira] [Work logged] (HIVE-26203) Implement alter iceberg table metadata location

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26203?focusedWorklogId=767868=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767868
 ]

ASF GitHub Bot logged work on HIVE-26203:
-

Author: ASF GitHub Bot
Created on: 09/May/22 11:18
Start Date: 09/May/22 11:18
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3270:
URL: https://github.com/apache/hive/pull/3270#discussion_r867902283


##
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java:
##
@@ -1456,6 +1456,63 @@ public void testCreateTableWithMetadataLocation() throws IOException {
 
     HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS.stream()).collect(Collectors.toList()), records, 0);
   }
 
+  @Test
+  public void testAlterTableWithMetadataLocation() throws IOException {
+    Assume.assumeTrue("Alter table with metadata location is only supported for Hive Catalog tables",

Review Comment:
   What do the users see when they try to do this for a non-HiveCatalog table?
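   For reference, Assume.assumeTrue marks a test as skipped rather than failed when the condition does not hold. A minimal, hypothetical JUnit 4 sketch (the catalog selection below is an assumption for illustration, not the PR's code):
{code:java}
import org.junit.Assume;
import org.junit.Test;

public class AlterMetadataLocationAssumeExample {
  // Hypothetical way of selecting the catalog under test.
  private final String catalogType = System.getProperty("test.catalog", "hive");

  @Test
  public void testAlterTableWithMetadataLocation() {
    // Skips (does not fail) the test for catalogs that do not support this ALTER.
    Assume.assumeTrue("Alter table with metadata location is only supported for Hive Catalog tables",
        "hive".equals(catalogType));
    // ... the actual ALTER TABLE ... SET TBLPROPERTIES('metadata_location'='...') assertions would go here
  }
}
{code}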





Issue Time Tracking
---

Worklog Id: (was: 767868)
Time Spent: 50m  (was: 40m)

> Implement alter iceberg table metadata location
> ---
>
> Key: HIVE-26203
> URL: https://issues.apache.org/jira/browse/HIVE-26203
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: iceberg, pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26203) Implement alter iceberg table metadata location

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26203?focusedWorklogId=767867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767867
 ]

ASF GitHub Bot logged work on HIVE-26203:
-

Author: ASF GitHub Bot
Created on: 09/May/22 11:17
Start Date: 09/May/22 11:17
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3270:
URL: https://github.com/apache/hive/pull/3270#discussion_r867902283


##
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java:
##
@@ -1456,6 +1456,63 @@ public void testCreateTableWithMetadataLocation() throws IOException {
 
     HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS.stream()).collect(Collectors.toList()), records, 0);
   }
 
+  @Test
+  public void testAlterTableWithMetadataLocation() throws IOException {
+    Assume.assumeTrue("Alter table with metadata location is only supported for Hive Catalog tables",

Review Comment:
   What happens in the code where we try to do this for a non-HiveCatalog table?





Issue Time Tracking
---

Worklog Id: (was: 767867)
Time Spent: 40m  (was: 0.5h)

> Implement alter iceberg table metadata location
> ---
>
> Key: HIVE-26203
> URL: https://issues.apache.org/jira/browse/HIVE-26203
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: iceberg, pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26203) Implement alter iceberg table metadata location

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26203?focusedWorklogId=767866=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767866
 ]

ASF GitHub Bot logged work on HIVE-26203:
-

Author: ASF GitHub Bot
Created on: 09/May/22 11:16
Start Date: 09/May/22 11:16
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3270:
URL: https://github.com/apache/hive/pull/3270#discussion_r867901386


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:
##
@@ -336,6 +336,30 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   // that users can change data types or reorder columns too with this alter op type, so its name is misleading..)
   assertNotMigratedTable(hmsTable.getParameters(), "CHANGE COLUMN");
   handleChangeColumn(hmsTable);
+    } else if (AlterTableType.ADDPROPS.equals(currentAlterTableOp)) {
+      assertNotThirdPartyMetadataLocationChange(hmsTable.getParameters());
+    }
+  }
+
+  /**
+   * Perform a check on the current iceberg table whether a metadata change can be performed. A table is eligible if
+   * the current metadata uuid and the new metadata uuid matches.
+   * @param tblParams hms table properties, must be non-null
+   */
+  private void assertNotThirdPartyMetadataLocationChange(Map<String, String> tblParams) {
+    if (tblParams.containsKey(BaseMetastoreTableOperations.METADATA_LOCATION_PROP)) {
+      Preconditions.checkArgument(icebergTable != null,

Review Comment:
   How could this happen?
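   The javadoc in the excerpt describes the eligibility rule: the new metadata location is accepted only when its table UUID matches the current one. A hedged sketch of that comparison with hypothetical names (not the PR's implementation):
{code:java}
import java.util.Objects;

// Hypothetical sketch of the uuid-match rule described in the javadoc above.
class MetadataLocationGuard {
  static void assertSameTable(String currentUuid, String newUuid) {
    if (!Objects.equals(currentUuid, newUuid)) {
      throw new IllegalArgumentException(
          "Cannot point metadata_location at a different table's metadata (uuid mismatch)");
    }
  }
}
{code}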





Issue Time Tracking
---

Worklog Id: (was: 767866)
Time Spent: 0.5h  (was: 20m)

> Implement alter iceberg table metadata location
> ---
>
> Key: HIVE-26203
> URL: https://issues.apache.org/jira/browse/HIVE-26203
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: iceberg, pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26203) Implement alter iceberg table metadata location

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26203?focusedWorklogId=767865=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767865
 ]

ASF GitHub Bot logged work on HIVE-26203:
-

Author: ASF GitHub Bot
Created on: 09/May/22 11:14
Start Date: 09/May/22 11:14
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3270:
URL: https://github.com/apache/hive/pull/3270#discussion_r867899800


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:
##
@@ -336,6 +336,30 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
   // that users can change data types or reorder columns too with this alter op type, so its name is misleading..)
   assertNotMigratedTable(hmsTable.getParameters(), "CHANGE COLUMN");
   handleChangeColumn(hmsTable);
+    } else if (AlterTableType.ADDPROPS.equals(currentAlterTableOp)) {
+      assertNotThirdPartyMetadataLocationChange(hmsTable.getParameters());

Review Comment:
   nit: `crossTableMetadataLocationChange`?





Issue Time Tracking
---

Worklog Id: (was: 767865)
Time Spent: 20m  (was: 10m)

> Implement alter iceberg table metadata location
> ---
>
> Key: HIVE-26203
> URL: https://issues.apache.org/jira/browse/HIVE-26203
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: iceberg, pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26177) Create a new connection pool for compaction (DataNucleus)

2022-05-09 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits reassigned HIVE-26177:
--

Assignee: Antal Sinkovits

> Create a new connection pool for compaction (DataNucleus)
> -
>
> Key: HIVE-26177
> URL: https://issues.apache.org/jira/browse/HIVE-26177
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-26177) Create a new connection pool for compaction (DataNucleus)

2022-05-09 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits resolved HIVE-26177.

Resolution: Fixed

Pushed to master. Thanks for the review [~dkuzmenko]

> Create a new connection pool for compaction (DataNucleus)
> -
>
> Key: HIVE-26177
> URL: https://issues.apache.org/jira/browse/HIVE-26177
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26177) Create a new connection pool for compaction (DataNucleus)

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26177?focusedWorklogId=767819=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767819
 ]

ASF GitHub Bot logged work on HIVE-26177:
-

Author: ASF GitHub Bot
Created on: 09/May/22 07:49
Start Date: 09/May/22 07:49
Worklog Time Spent: 10m 
  Work Description: asinkovits merged PR #3265:
URL: https://github.com/apache/hive/pull/3265




Issue Time Tracking
---

Worklog Id: (was: 767819)
Time Spent: 20m  (was: 10m)

> Create a new connection pool for compaction (DataNucleus)
> -
>
> Key: HIVE-26177
> URL: https://issues.apache.org/jira/browse/HIVE-26177
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26210) Fix tests for Cleaner failed attempt threshold

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26210:
--
Labels: pull-request-available  (was: )

> Fix tests for Cleaner failed attempt threshold
> --
>
> Key: HIVE-26210
> URL: https://issues.apache.org/jira/browse/HIVE-26210
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26210) Fix tests for Cleaner failed attempt threshold

2022-05-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26210?focusedWorklogId=767813=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-767813
 ]

ASF GitHub Bot logged work on HIVE-26210:
-

Author: ASF GitHub Bot
Created on: 09/May/22 07:25
Start Date: 09/May/22 07:25
Worklog Time Spent: 10m 
  Work Description: veghlaci05 opened a new pull request, #3274:
URL: https://github.com/apache/hive/pull/3274

   ### What changes were proposed in this pull request?
   This PR fixes the flaky tests created for HIVE-25943.
   
   
   ### Why are the changes needed?
   The tests introduced in HIVE-25943 were flaky.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Tested manually.




Issue Time Tracking
---

Worklog Id: (was: 767813)
Remaining Estimate: 0h
Time Spent: 10m

> Fix tests for Cleaner failed attempt threshold
> --
>
> Key: HIVE-26210
> URL: https://issues.apache.org/jira/browse/HIVE-26210
> Project: Hive
>  Issue Type: Bug
>Reporter: László Végh
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-12336) Sort Merge Partition Map Join

2022-05-09 Thread Prasad Gawande (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Gawande updated HIVE-12336:
--
Description: 
Logically and functionally, bucketing and partitioning are quite similar - both 
provide a mechanism to segregate and separate the table's data based on its 
content. Thanks to that, significant further optimisations like [partition] 
PRUNING or [bucket] MAP JOIN are possible.
The difference seems to be imposed by design: PARTITIONing is open/explicit 
while BUCKETing is discrete/implicit.
Partitioning seems to be very common, if not a standard feature, in all current 
RDBMS, while BUCKETING seems to be HIVE specific only.
In a way, BUCKETING could also be called "hashing" or simply "IMPLICIT 
PARTITIONING".

Regardless of the fact that these two are recognised as two separate features 
available in Hive, there should be nothing to prevent leveraging the same 
existing query/join optimisations across the two.

PARTITION SORT MERGE MAPJOIN
Use the same type of optimization as in SORT MERGE BUCKETED MAP JOIN for 
partitioned tables.
The sort-merge join optimization could be performed when PARTITIONED tables 
being joined are sorted and partitioned on the join columns.

The corresponding partitions are joined with each other at the mapper. If both 
A and B have partitions set on their columns KEY, the following join
SELECT /*+ MAPJOIN(b) */ a.key, a.value
FROM A a JOIN B b ON a.key = b.key
can be done on the mapper only. The mapper for the partition key='201512' for A 
will traverse the corresponding partition for B. Traversing is possible if the 
corresponding partitions are sorted on the same columns. This is dependent on 
(taken care by HIVE-11525)
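
As an illustration of the mechanic this relies on, here is a minimal sketch of a single-pass merge join over two inputs sorted on the join key (assuming unique keys per side for brevity; this is illustrative, not Hive's operator code):
{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative single-pass merge join of two key-sorted lists (unique keys per side).
class SortMergeJoinSketch {
  static List<String> join(List<Integer> left, List<Integer> right) {
    List<String> out = new ArrayList<>();
    int i = 0, j = 0;
    while (i < left.size() && j < right.size()) {
      int cmp = left.get(i).compareTo(right.get(j));
      if (cmp == 0) {        // keys match: emit a joined row, advance both cursors
        out.add("key=" + left.get(i));
        i++;
        j++;
      } else if (cmp < 0) {
        i++;                 // left key is smaller: advance the left cursor
      } else {
        j++;                 // right key is smaller: advance the right cursor
      }
    }
    return out;
  }
}
{code}
Hive's actual operator additionally handles duplicate keys and relies on the sorted-partition guarantee above; the sketch only shows why sorted inputs permit a single pass with no shuffle and no hash table.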

  was:
Logically and functionally bucketing and partitioning are quite similar - both 
provide mechanism to segregate and separate the table's data based on its 
content. Thanks to that significant further optimisations like [partition] 
PRUNING or [bucket] MAP JOIN are possible.
The difference seems to be imposed by design where the PARTITIONing is 
open/explicit while BUCKETing is discrete/implicit.
Partitioning seems to be very common if not a standard feature in all current 
RDBMS while BUCKETING seems to be HIVE specific only.
In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
PARTITIONING".

Regardless of the fact that these two are recognised as two separate features 
available in Hive there should be nothing to prevent leveraging same existing 
query/join optimisations across the two.

PARTITION SORT MERGE MAPJOIN
Use the same type of optimization as in SORT MERGE BUCKETED MAP JOIN for 
partitioned tables.
The sort-merge join optimization could be performed when PARTITIONED tables 
being joined are sorted and partitioned on the join columns.

The corresponding partitions are joined with each other at the mapper. If both 
A and B have partitions set on their columns KEY, the following join
SELECT /*+ MAPJOIN(b) */ a.key, a.value
FROM A a JOIN B b ON a.key = b.key
can be done on the mapper only. The mapper for the partition key='201512' for A 
will traverse the corresponding partition for B. Traversing is possible if the 
corresponding partitions are sorted on the same columns. This is dependent on 
(taken care by [HIVE-11525|https://issues.apache.org/jira/browse/HIVE-12337])


> Sort Merge Partition Map Join
> -
>
> Key: HIVE-12336
> URL: https://issues.apache.org/jira/browse/HIVE-12336
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer, Physical Optimizer, SQL
>Affects Versions: 0.13.0, 0.13.1, 0.14.0, 1.0.0, 1.1.0
>Reporter: Maciek Kocon
>Priority: Major
>  Labels: gsoc2015
>
> Logically and functionally bucketing and partitioning are quite similar - 
> both provide mechanism to segregate and separate the table's data based on 
> its content. Thanks to that significant further optimisations like 
> [partition] PRUNING or [bucket] MAP JOIN are possible.
> The difference seems to be imposed by design where the PARTITIONing is 
> open/explicit while BUCKETing is discrete/implicit.
> Partitioning seems to be very common if not a standard feature in all current 
> RDBMS while BUCKETING seems to be HIVE specific only.
> In a way BUCKETING could be also called by "hashing" or simply "IMPLICIT 
> PARTITIONING".
> Regardless of the fact that these two are recognised as two separate features 
> available in Hive there should be nothing to prevent leveraging same existing 
> query/join optimisations across the two.
> PARTITION SORT MERGE MAPJOIN
> Use the same type of optimization as in SORT MERGE BUCKETED MAP JOIN for 
> partitioned tables.
> The sort-merge join optimization could be performed when PARTITIONED tables 
> being joined are sorted and partitioned on the join