[jira] [Commented] (HIVE-25931) With YARN's Fair Scheduler and multiple queues configured, saturated queues can make YARN return -1 as the total resource, causing Hive on Tez split generation to fail and leaving Hive jobs unable to run

2022-02-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487911#comment-17487911
 ] 

László Bodor commented on HIVE-25931:
-

english please

> With YARN's Fair Scheduler and multiple queues configured, saturated queues 
> can make YARN return -1 as the total resource, causing Hive on Tez split 
> generation to fail and leaving Hive jobs unable to run
> -
>
> Key: HIVE-25931
> URL: https://issues.apache.org/jira/browse/HIVE-25931
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1
>Reporter: lkl
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.1.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With YARN's Fair Scheduler and multiple queues configured, saturated queues 
> can make YARN return -1 as the total resource, causing Hive on Tez split 
> generation to fail and leaving Hive jobs unable to run
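The issue above describes YARN reporting -1 as the total resource when Fair Scheduler queues are saturated, which then breaks split generation in Hive on Tez. A minimal sketch of a defensive guard, using a hypothetical `estimateSplits` helper (illustrative only, not Hive's actual split-grouping code): clamp a non-positive reported resource instead of letting it propagate into the division.

```java
public class SplitEstimate {
    // Hypothetical guard: if the scheduler reports a non-positive total
    // resource (as described above for a saturated Fair Scheduler queue),
    // fall back to a single split instead of failing the query.
    static int estimateSplits(long totalClusterMemMb, long perSplitMemMb) {
        if (totalClusterMemMb <= 0) {
            return 1; // degrade gracefully rather than throwing
        }
        return (int) Math.max(1, totalClusterMemMb / perSplitMemMb);
    }

    public static void main(String[] args) {
        System.out.println(estimateSplits(-1, 1024));   // the -1 case no longer fails
        System.out.println(estimateSplits(8192, 1024)); // normal case
    }
}
```

The exact fallback (one split, one wave, or a configured minimum) is a design choice; the point is that a sentinel value from YARN should be handled before arithmetic, not after.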



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25814) Add entry in replication_metrics table for skipped replication iterations.

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25814?focusedWorklogId=721753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721753
 ]

ASF GitHub Bot logged work on HIVE-25814:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 06:50
Start Date: 07/Feb/22 06:50
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on pull request #2907:
URL: https://github.com/apache/hive/pull/2907#issuecomment-1031129619


   > This doesn't seem to cover the skipped load iterations. A load iteration 
can also be skipped if there is no dump, if the previous load is in progress, 
or if the load failed with some error.
   
   I've covered that case in this change itself. Can you please go through 
TestReplicationScenariosAcidTables#testReplicationMetricForSkippedIteration?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 721753)
Time Spent: 0.5h  (was: 20m)

> Add entry in replication_metrics table for skipped replication iterations.
> --
>
> Key: HIVE-25814
> URL: https://issues.apache.org/jira/browse/HIVE-25814
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25814) Add entry in replication_metrics table for skipped replication iterations.

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25814?focusedWorklogId=721751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721751
 ]

ASF GitHub Bot logged work on HIVE-25814:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 06:46
Start Date: 07/Feb/22 06:46
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2907:
URL: https://github.com/apache/hive/pull/2907#discussion_r800354281



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/util/ReplUtils.java
##
@@ -474,4 +475,17 @@ public static void addLoggerTask(ReplLogger replLogger, List<Task<?>> tasks, Hiv
   DAGTraversal.traverse(tasks, new AddDependencyToLeaves(Collections.singletonList(task)));
 }
   }
+
+  public static void reportStatusInReplicationMetrics(String stageName, Status status, String errorLogPath,
+  HiveConf conf)
+  throws SemanticException {
+GenericMetricCollector metricCollector = new GenericMetricCollector(0, conf);

Review comment:
   Done






Issue Time Tracking
---

Worklog Id: (was: 721751)
Time Spent: 20m  (was: 10m)

> Add entry in replication_metrics table for skipped replication iterations.
> --
>
> Key: HIVE-25814
> URL: https://issues.apache.org/jira/browse/HIVE-25814
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25872) Skip tracking of alterDatabase events for replication specific property (repl.last.id).

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25872:
--
Labels: pull-request-available  (was: )

> Skip tracking of alterDatabase events for replication specific property 
> (repl.last.id).
> ---
>
> Key: HIVE-25872
> URL: https://issues.apache.org/jira/browse/HIVE-25872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25872) Skip tracking of alterDatabase events for replication specific property (repl.last.id).

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25872?focusedWorklogId=721738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721738
 ]

ASF GitHub Bot logged work on HIVE-25872:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 05:13
Start Date: 07/Feb/22 05:13
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on a change in pull request #2950:
URL: https://github.com/apache/hive/pull/2950#discussion_r800320118



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -1613,6 +1618,30 @@ public void alter_database(final String dbName, final Database newDB) throws TEx
 }
   }
 
+  private boolean isAlterReplSpecific(Database oldDb, Database newDb) {

Review comment:
   Requires a rebase. Seems there is some similar logic in 
``isReplicationEventIdUpdate`` see if you can refactor and reuse some part of it






Issue Time Tracking
---

Worklog Id: (was: 721738)
Remaining Estimate: 0h
Time Spent: 10m

> Skip tracking of alterDatabase events for replication specific property 
> (repl.last.id).
> ---
>
> Key: HIVE-25872
> URL: https://issues.apache.org/jira/browse/HIVE-25872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25907) IOW Directory queries fail to write data to the final path when the query result cache is enabled

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25907?focusedWorklogId=721735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721735
 ]

ASF GitHub Bot logged work on HIVE-25907:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 04:47
Start Date: 07/Feb/22 04:47
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on pull request #2978:
URL: https://github.com/apache/hive/pull/2978#issuecomment-1031070688


   @kgyrtkirk - Could you please review the changes?
   Thanks




Issue Time Tracking
---

Worklog Id: (was: 721735)
Time Spent: 0.5h  (was: 20m)

> IOW Directory queries fail to write data to the final path when the query 
> result cache is enabled
> --
>
> Key: HIVE-25907
> URL: https://issues.apache.org/jira/browse/HIVE-25907
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> INSERT OVERWRITE DIRECTORY queries fail to write data to the specified 
> directory location when the query result cache is enabled.
> *Steps to reproduce*
> {code:java}
> 1. create a data file with the following data
> 1 abc 10.5
> 2 def 11.5
> 2. create table pointing to that data
> create external table iowd(strct struct)
> row format delimited
> fields terminated by '\t'
> collection items terminated by ' '
> location '';
> 3. run the following query
> set hive.query.results.cache.enabled=true;
> INSERT OVERWRITE DIRECTORY "" SELECT * FROM iowd;
> {code}
> After executing the above query, the destination directory is expected to 
> contain the data from table iowd, but due to HIVE-21386 this no longer 
> happens.





[jira] [Updated] (HIVE-25931) With YARN's Fair Scheduler and multiple queues configured, saturated queues can make YARN return -1 as the total resource, causing Hive on Tez split generation to fail and leaving Hive jobs unable to run

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25931:
---
Description: With YARN's Fair Scheduler and multiple queues configured, 
saturated queues can make YARN return -1 as the total resource, causing Hive 
on Tez split generation to fail and leaving Hive jobs unable to run

> With YARN's Fair Scheduler and multiple queues configured, saturated queues 
> can make YARN return -1 as the total resource, causing Hive on Tez split 
> generation to fail and leaving Hive jobs unable to run
> -
>
> Key: HIVE-25931
> URL: https://issues.apache.org/jira/browse/HIVE-25931
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1
>Reporter: lkl
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.1.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With YARN's Fair Scheduler and multiple queues configured, saturated queues 
> can make YARN return -1 as the total resource, causing Hive on Tez split 
> generation to fail and leaving Hive jobs unable to run





[jira] [Work logged] (HIVE-25931) With YARN's Fair Scheduler and multiple queues configured, saturated queues can make YARN return -1 as the total resource, causing Hive on Tez split generation to fail and leaving Hive jobs unable to run

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25931?focusedWorklogId=721734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721734
 ]

ASF GitHub Bot logged work on HIVE-25931:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 04:21
Start Date: 07/Feb/22 04:21
Worklog Time Spent: 10m 
  Work Description: lklong opened a new pull request #3003:
URL: https://github.com/apache/hive/pull/3003


   … split generation in Hive on Tez fails, leaving Hive jobs unable to run
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 721734)
Remaining Estimate: 0h
Time Spent: 10m

> With YARN's Fair Scheduler and multiple queues configured, saturated queues 
> can make YARN return -1 as the total resource, causing Hive on Tez split 
> generation to fail and leaving Hive jobs unable to run
> -
>
> Key: HIVE-25931
> URL: https://issues.apache.org/jira/browse/HIVE-25931
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1
>Reporter: lkl
>Priority: Major
> Fix For: 2.1.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25931) With YARN's Fair Scheduler and multiple queues configured, saturated queues can make YARN return -1 as the total resource, causing Hive on Tez split generation to fail and leaving Hive jobs unable to run

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25931:
--
Labels: pull-request-available  (was: )

> With YARN's Fair Scheduler and multiple queues configured, saturated queues 
> can make YARN return -1 as the total resource, causing Hive on Tez split 
> generation to fail and leaving Hive jobs unable to run
> -
>
> Key: HIVE-25931
> URL: https://issues.apache.org/jira/browse/HIVE-25931
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1
>Reporter: lkl
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.1.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread Fachuan Bai (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487861#comment-17487861
 ] 

Fachuan Bai commented on HIVE-25912:


I fixed the bug: !hive-bugs-001.png!!hive-bugs002.png!

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png, hive-bugs-001.png, hive-bugs002.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>
> I created the external Hive table using this command:
>  
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 'hdfs://emr-master-1:8020/';
> {code}
>  
> The table was created successfully, but when I drop the table it throws an 
> NPE:
>  
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.NullPointerException) 
> (state=08S01,code=1){code}
>  
> The same bug can be reproduced on other object storage file systems, such as 
> S3 or TOS:
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 's3a://bucketname/'; // 'tos://bucketname/'{code}
>  
> Looking at the source code, I found this in 
> common/src/java/org/apache/hadoop/hive/common/FileUtils.java:
> {code:java}
> // check if sticky bit is set on the parent dir
> FileStatus parStatus = fs.getFileStatus(path.getParent());
> if (!shims.hasStickyBit(parStatus.getPermission())) {
>   // no sticky bit, so write permission on parent dir is sufficient
>   // no further checks needed
>   return;
> }{code}
>  
> Because I set the table location to the HDFS root path 
> (hdfs://emr-master-1:8020/), path.getParent() returns null, which causes the 
> NPE.
> I see four possible ways to fix the bug:
>  # Modify table creation: fail the CREATE TABLE if the location is the root 
> dir.
>  # Modify FileUtils.checkDeletePermission: check path.getParent(); if it is 
> null, return so the drop succeeds.
>  # Modify the RangerHiveAuthorizer.checkPrivileges function of the Hive 
> Ranger plugin (in the Ranger repo): fail the CREATE TABLE if the location is 
> the root dir.
>  # Modify the HDFS Path object so that path.getParent() does not return null 
> when the URI is the root dir.
> I recommend the first or second approach. Any suggestions? Thanks.
>  
>  
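The null parent at a root path is easy to confirm outside Hadoop, since `java.nio.file.Path` behaves the same way at the filesystem root. A minimal sketch of fix option 2 (the names `ParentGuard`/`deleteAllowed` are illustrative, not the actual FileUtils code):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ParentGuard {
    // Sketch of fix option 2: when the path is the root of the filesystem
    // (or bucket), getParent() is null, so skip the sticky-bit check on the
    // parent instead of dereferencing null.
    static boolean deleteAllowed(Path path) {
        Path parent = path.getParent();
        if (parent == null) {
            // No parent dir exists; there is no sticky bit to consult,
            // so write permission on the path itself is sufficient.
            return true;
        }
        // ... the real code would fetch the parent's FileStatus and
        // inspect its sticky bit here ...
        return true;
    }

    public static void main(String[] args) {
        System.out.println(Paths.get("/").getParent()); // null: the NPE trigger
        System.out.println(deleteAllowed(Paths.get("/"))); // guarded, no NPE
    }
}
```
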





[jira] [Updated] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread Fachuan Bai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fachuan Bai updated HIVE-25912:
---
Attachment: (was: hive-bugs002.png)

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>





[jira] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread Fachuan Bai (Jira)


[ https://issues.apache.org/jira/browse/HIVE-25912 ]


Fachuan Bai deleted comment on HIVE-25912:


was (Author: JIRAUSER284437):
I fixed the bug: !hive-bugs-001.png!!hive-bugs002.png!

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>





[jira] [Updated] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread Fachuan Bai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fachuan Bai updated HIVE-25912:
---
Attachment: (was: hive-bugs-001.png)

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>





[jira] [Updated] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread Fachuan Bai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fachuan Bai updated HIVE-25912:
---
Attachment: hive-bugs-001.png
hive-bugs002.png

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png, hive-bugs-001.png, hive-bugs002.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>





[jira] [Work logged] (HIVE-25583) Support parallel load for HashTables - Interfaces

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25583?focusedWorklogId=721732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721732
 ]

ASF GitHub Bot logged work on HIVE-25583:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 04:13
Start Date: 07/Feb/22 04:13
Worklog Time Spent: 10m 
  Work Description: ramesh0201 commented on pull request #2999:
URL: https://github.com/apache/hive/pull/2999#issuecomment-1031054518


   @pgaref Looks good to me. +1




Issue Time Tracking
---

Worklog Id: (was: 721732)
Time Spent: 40m  (was: 0.5h)

> Support parallel load for HashTables - Interfaces
> -
>
> Key: HIVE-25583
> URL: https://issues.apache.org/jira/browse/HIVE-25583
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Support parallel load for HashTables - Interfaces
> * Introducing VectorMapJoinFastHashTableContainerBase class that implements 
> VectorMapJoinHashTable
> * Each VectorMapJoinFastStringHashMapContainer is a singleton that contains 
> an array of HashTables (1 or more)
> * VectorMapJoinFastTableContainer now initializes 
> VectorMapJoinFastHashTableContainers instead of HTs directly
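As a rough illustration of the container structure described above, here is a sketch with generic Java collections (not Hive's actual VectorMapJoinFast* classes): one container fronts an array of hash tables partitioned by key hash, so that several loader threads can each fill their own partition without contending on a single table.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative container: N inner hash tables, keys routed by hash.
// Each partition can be populated by a dedicated thread during load.
public class HashTableContainer<K, V> {
    private final List<java.util.HashMap<K, V>> parts = new ArrayList<>();

    public HashTableContainer(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new java.util.HashMap<>());
        }
    }

    // Route each key to exactly one partition (floorMod handles
    // negative hash codes), so lookups and loads agree on placement.
    private java.util.HashMap<K, V> partFor(K key) {
        return parts.get(Math.floorMod(key.hashCode(), parts.size()));
    }

    public void put(K key, V value) { partFor(key).put(key, value); }

    public V get(K key) { return partFor(key).get(key); }
}
```

With one partition this degrades to a plain hash table, which matches the "array of HashTables (1 or more)" framing in the description.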





[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp being off by 8 hours due to time zone handling

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Description: current_timestamp defaults to the UTC time zone, which does not 
match the system time zone  (was: unix_timestamp defaults to the UTC time 
zone, which does not match the system time zone)

>  hive-3.1.1: fix for current_timestamp being off by 8 hours due to time zone handling
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
> Fix For: 3.1.2
>
> Attachments: 1644206926(1).png
>
>
> current_timestamp defaults to the UTC time zone, which does not match the 
> system time zone
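The issue reports that current_timestamp in Hive 3.1.x is evaluated in UTC while the cluster runs in a UTC+8 system zone, so values appear 8 hours behind local time. A minimal java.time illustration of that gap, independent of Hive:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TsZones {
    public static void main(String[] args) {
        // The same instant rendered in UTC vs. a UTC+8 system zone
        // (Asia/Shanghai) yields wall-clock values 8 hours apart; this is
        // the discrepancy a UTC-evaluated current_timestamp exposes.
        Instant now = Instant.ofEpochSecond(1_644_000_000L);
        System.out.println(now.atZone(ZoneOffset.UTC).toLocalDateTime());
        System.out.println(now.atZone(ZoneId.of("Asia/Shanghai")).toLocalDateTime());
    }
}
```
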





[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp being off by 8 hours due to time zone handling

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Priority: Major  (was: Critical)

>  hive-3.1.1: fix for current_timestamp being off by 8 hours due to time zone handling
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Major
> Fix For: 3.1.2
>
> Attachments: 1644206926(1).png
>
>
> current_timestamp defaults to the UTC time zone, which does not match the 
> system time zone





[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp being off by 8 hours due to time zone handling

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Fix Version/s: 3.1.2

>  hive-3.1.1: fix for current_timestamp being off by 8 hours due to time zone handling
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
> Fix For: 3.1.2
>
> Attachments: 1644206926(1).png
>
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp time zone being 8 hours off

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
 External issue ID: HIVE-21039
External issue URL: https://issues.apache.org/jira/browse/HIVE-21039

>  hive-3.1.1: fix for current_timestamp time zone being 8 hours off
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
> Attachments: 1644206926(1).png
>
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp time zone being 8 hours off

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Attachment: 1644206926(1).png

>  hive-3.1.1: fix for current_timestamp time zone being 8 hours off
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
> Attachments: 1644206926(1).png
>
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Work started] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread Fachuan Bai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25912 started by Fachuan Bai.
--
> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>
> I created the external Hive table using this command:
>  
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 'hdfs://emr-master-1:8020/';
> {code}
>  
> The table was created successfully, but when I drop the table it throws an NPE:
>  
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.NullPointerException) 
> (state=08S01,code=1){code}
>  
> The same bug can be reproduced on other object storage file systems, such as 
> S3 or TOS:
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 's3a://bucketname/'; // 'tos://bucketname/'{code}
>  
> Looking at the source code, I found this in
>  common/src/java/org/apache/hadoop/hive/common/FileUtils.java
> {code:java}
> // check if sticky bit is set on the parent dir
> FileStatus parStatus = fs.getFileStatus(path.getParent());
> if (!shims.hasStickyBit(parStatus.getPermission())) {
>   // no sticky bit, so write permission on parent dir is sufficient
>   // no further checks needed
>   return;
> }{code}
>  
> Because I set the table location to the HDFS root path 
> (hdfs://emr-master-1:8020/), path.getParent() returns null, causing the NPE.
> I can think of four solutions to fix the bug:
>  # modify the create-table function: if the location is the root dir, fail 
> the create-table call.
>  # modify the FileUtils.checkDeletePermission function: check 
> path.getParent(); if it is null, return, so the drop succeeds.
>  # modify the RangerHiveAuthorizer.checkPrivileges function of the Hive 
> Ranger plugin (in the Ranger repo): if the location is the root dir, fail 
> the create-table call.
>  # modify the HDFS Path object so that if the URI is the root dir, 
> path.getParent() does not return null.
> I recommend the first or second method; any suggestions for me? Thanks.
>  
>  
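The proposed option 2 amounts to treating a location with a null parent as the filesystem root. A minimal sketch of such a check (an assumed helper using java.net.URI, not the actual Hive patch, which operates on Hadoop's Path class):

```java
import java.net.URI;

public class RootLocationCheck {
    // Returns true when the location has no parent directory, i.e. it is
    // the root of its filesystem (hdfs://host:8020/, s3a://bucket/, ...).
    // A root path is exactly the case where Path.getParent() yields null.
    public static boolean isRoot(String location) {
        String path = URI.create(location).getPath();
        return path == null || path.isEmpty() || "/".equals(path);
    }

    public static void main(String[] args) {
        System.out.println(isRoot("hdfs://emr-master-1:8020/"));          // true
        System.out.println(isRoot("s3a://bucketname/"));                  // true
        System.out.println(isRoot("hdfs://emr-master-1:8020/warehouse")); // false
    }
}
```

A guard like this could either fail table creation early (option 1) or short-circuit the sticky-bit check on drop (option 2).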





[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp time zone being 8 hours off

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Summary:  hive-3.1.1: fix for current_timestamp time zone being 8 hours off  
(was: unix_timestamp defaults to the UTC time zone, which is inconsistent with 
the system time zone)

>  hive-3.1.1: fix for current_timestamp time zone being 8 hours off
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Updated] (HIVE-25930) unix_timestamp defaults to the UTC time zone, which is inconsistent with the system time zone

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Summary: unix_timestamp defaults to the UTC time zone, which is inconsistent 
with the system time zone  (was: With the YARN fair scheduler and multiple 
queues configured, when queues are saturated with tasks YARN returns -1 for 
total resources, causing Hive on Tez split generation to fail and Hive jobs to 
be unable to run)

> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone
> 
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Updated] (HIVE-25930) With the YARN fair scheduler and multiple queues configured, when queues are saturated with tasks YARN returns -1 for total resources, causing Hive on Tez split generation to fail and Hive jobs to be unable to run

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Summary: With the YARN fair scheduler and multiple queues configured, when 
queues are saturated with tasks YARN returns -1 for total resources, causing 
Hive on Tez split generation to fail and Hive jobs to be unable to run  (was: 
unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
system time zone)

> With the YARN fair scheduler and multiple queues configured, when queues are 
> saturated with tasks YARN returns -1 for total resources, causing Hive on Tez 
> split generation to fail and Hive jobs to be unable to run
> -
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Work logged] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?focusedWorklogId=721731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721731
 ]

ASF GitHub Bot logged work on HIVE-25912:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 04:02
Start Date: 07/Feb/22 04:02
Worklog Time Spent: 10m 
  Work Description: baifachuan edited a comment on pull request #2987:
URL: https://github.com/apache/hive/pull/2987#issuecomment-1031046556


   I fixed the bug by checking the Hive storage path: if it is a ROOT path, an 
exception is thrown, which changes the create-table behavior.
   
   ```
   DEBUG : Shutting down query CREATE EXTERNAL TABLE `fcbai1`(
   `inv_item_sk` int,
   `inv_warehouse_sk` int,
   `inv_quantity_on_hand` int)
   PARTITIONED BY (
   `inv_date_sk` int) STORED AS ORC
   LOCATION
   'hdfs://127.0.0.1:8020/'
   Error: Error while processing statement: FAILED: Execution Error, return 
code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
InvalidObjectException(message:fcbai1 location must not be root path) 
(state=08S01,code=1)
   ```
   
   Ensure that creating tables and dropping tables behave the same.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 721731)
Remaining Estimate: 95h 10m  (was: 95h 20m)
Time Spent: 50m  (was: 40m)

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 50m
>  Remaining Estimate: 95h 10m
>
> I created the external Hive table using this command:
>  
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 'hdfs://emr-master-1:8020/';
> {code}
>  
> The table was created successfully, but when I drop the table it throws an NPE:
>  
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.NullPointerException) 
> (state=08S01,code=1){code}
>  
> The same bug can be reproduced on other object storage file systems, such as 
> S3 or TOS:
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 's3a://bucketname/'; // 'tos://bucketname/'{code}
>  
> Looking at the source code, I found this in
>  common/src/java/org/apache/hadoop/hive/common/FileUtils.java
> {code:java}
> // check if sticky bit is set on the parent dir
> FileStatus parStatus = fs.getFileStatus(path.getParent());
> if (!shims.hasStickyBit(parStatus.getPermission())) {
>   // no sticky bit, so write permission on parent dir is sufficient
>   // no further checks needed
>   return;
> }{code}
>  
> Because I set the table location to the HDFS root path 
> (hdfs://emr-master-1:8020/), path.getParent() returns null, causing the NPE.
> I can think of four solutions to fix the bug:
>  # modify the create-table function: if the location is the root dir, fail 
> the create-table call.
>  # modify the FileUtils.checkDeletePermission function: check 
> path.getParent(); if it is null, return, so the drop succeeds.
>  # modify the RangerHiveAuthorizer.checkPrivileges function of the Hive 
> Ranger plugin (in the Ranger repo): if the location is the root dir, fail 
> the create-table call.
>  # modify the HDFS Path object so that if the URI is the root dir, 
> path.getParent() does not return null.
> I recommend the first or second method; any suggestions for me? Thanks.
>  
>  





[jira] [Updated] (HIVE-25930) unix_timestamp defaults to the UTC time zone, which is inconsistent with the system time zone

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Component/s: Query Planning

> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone
> 
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Updated] (HIVE-25930) unix_timestamp defaults to the UTC time zone, which is inconsistent with the system time zone

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Affects Version/s: 3.1.1

> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone
> 
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Critical
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Work logged] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?focusedWorklogId=721730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721730
 ]

ASF GitHub Bot logged work on HIVE-25912:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 03:57
Start Date: 07/Feb/22 03:57
Worklog Time Spent: 10m 
  Work Description: baifachuan commented on pull request #2987:
URL: https://github.com/apache/hive/pull/2987#issuecomment-1031046556


   I fixed the bug by checking the Hive storage path: if it is a ROOT path, an 
exception is thrown, which changes the create-table behavior.
   
   ```
   DEBUG : Shutting down query CREATE EXTERNAL TABLE `fcbai1`(
   `inv_item_sk` int,
   `inv_warehouse_sk` int,
   `inv_quantity_on_hand` int)
   PARTITIONED BY (
   `inv_date_sk` int) STORED AS ORC
   LOCATION
   'hdfs://127.0.0.1:8020/'
   Error: Error while processing statement: FAILED: Execution Error, return 
code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
InvalidObjectException(message:fcbai1 location must not be root path) 
(state=08S01,code=1)
   ```
   
   Ensure that creating tables and dropping tables behave the same.




Issue Time Tracking
---

Worklog Id: (was: 721730)
Remaining Estimate: 95h 20m  (was: 95.5h)
Time Spent: 40m  (was: 0.5h)

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 40m
>  Remaining Estimate: 95h 20m
>
> I created the external Hive table using this command:
>  
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 'hdfs://emr-master-1:8020/';
> {code}
>  
> The table was created successfully, but when I drop the table it throws an NPE:
>  
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.NullPointerException) 
> (state=08S01,code=1){code}
>  
> The same bug can be reproduced on other object storage file systems, such as 
> S3 or TOS:
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 's3a://bucketname/'; // 'tos://bucketname/'{code}
>  
> Looking at the source code, I found this in
>  common/src/java/org/apache/hadoop/hive/common/FileUtils.java
> {code:java}
> // check if sticky bit is set on the parent dir
> FileStatus parStatus = fs.getFileStatus(path.getParent());
> if (!shims.hasStickyBit(parStatus.getPermission())) {
>   // no sticky bit, so write permission on parent dir is sufficient
>   // no further checks needed
>   return;
> }{code}
>  
> Because I set the table location to the HDFS root path 
> (hdfs://emr-master-1:8020/), path.getParent() returns null, causing the NPE.
> I can think of four solutions to fix the bug:
>  # modify the create-table function: if the location is the root dir, fail 
> the create-table call.
>  # modify the FileUtils.checkDeletePermission function: check 
> path.getParent(); if it is null, return, so the drop succeeds.
>  # modify the RangerHiveAuthorizer.checkPrivileges function of the Hive 
> Ranger plugin (in the Ranger repo): if the location is the root dir, fail 
> the create-table call.
>  # modify the HDFS Path object so that if the URI is the root dir, 
> path.getParent() does not return null.
> I recommend the first or second method; any suggestions for me? Thanks.
>  
>  





[jira] [Updated] (HIVE-25930) unix_timestamp defaults to the UTC time zone, which is inconsistent with the system time zone

2022-02-06 Thread lkl (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lkl updated HIVE-25930:
---
Description: unix_timestamp defaults to the UTC time zone, which is 
inconsistent with the system time zone

> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone
> 
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>Reporter: lkl
>Priority: Critical
>
> unix_timestamp defaults to the UTC time zone, which is inconsistent with the 
> system time zone





[jira] [Work logged] (HIVE-25912) Drop external table at root of s3 bucket throws NPE

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25912?focusedWorklogId=721727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721727
 ]

ASF GitHub Bot logged work on HIVE-25912:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 02:11
Start Date: 07/Feb/22 02:11
Worklog Time Spent: 10m 
  Work Description: baifachuan commented on pull request #2987:
URL: https://github.com/apache/hive/pull/2987#issuecomment-1031004175


   > 
   
   Yes, I agree, but I think we should check the path and throw an exception if 
the path is illegal; creating the table successfully but not being able to 
drop it is not a good experience.




Issue Time Tracking
---

Worklog Id: (was: 721727)
Remaining Estimate: 95.5h  (was: 95h 40m)
Time Spent: 0.5h  (was: 20m)

> Drop external table at root of s3 bucket throws NPE
> ---
>
> Key: HIVE-25912
> URL: https://issues.apache.org/jira/browse/HIVE-25912
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: Hive version: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: metastore, pull-request-available
> Attachments: hive bugs.png
>
>   Original Estimate: 96h
>  Time Spent: 0.5h
>  Remaining Estimate: 95.5h
>
> I created the external Hive table using this command:
>  
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 'hdfs://emr-master-1:8020/';
> {code}
>  
> The table was created successfully, but when I drop the table it throws an NPE:
>  
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:java.lang.NullPointerException) 
> (state=08S01,code=1){code}
>  
> The same bug can be reproduced on other object storage file systems, such as 
> S3 or TOS:
> {code:java}
> CREATE EXTERNAL TABLE `fcbai`(
> `inv_item_sk` int,
> `inv_warehouse_sk` int,
> `inv_quantity_on_hand` int)
> PARTITIONED BY (
> `inv_date_sk` int) STORED AS ORC
> LOCATION
> 's3a://bucketname/'; // 'tos://bucketname/'{code}
>  
> Looking at the source code, I found this in
>  common/src/java/org/apache/hadoop/hive/common/FileUtils.java
> {code:java}
> // check if sticky bit is set on the parent dir
> FileStatus parStatus = fs.getFileStatus(path.getParent());
> if (!shims.hasStickyBit(parStatus.getPermission())) {
>   // no sticky bit, so write permission on parent dir is sufficient
>   // no further checks needed
>   return;
> }{code}
>  
> Because I set the table location to the HDFS root path 
> (hdfs://emr-master-1:8020/), path.getParent() returns null, causing the NPE.
> I can think of four solutions to fix the bug:
>  # modify the create-table function: if the location is the root dir, fail 
> the create-table call.
>  # modify the FileUtils.checkDeletePermission function: check 
> path.getParent(); if it is null, return, so the drop succeeds.
>  # modify the RangerHiveAuthorizer.checkPrivileges function of the Hive 
> Ranger plugin (in the Ranger repo): if the location is the root dir, fail 
> the create-table call.
>  # modify the HDFS Path object so that if the URI is the root dir, 
> path.getParent() does not return null.
> I recommend the first or second method; any suggestions for me? Thanks.
>  
>  





[jira] [Work logged] (HIVE-25927) Fix DataWritableReadSupport

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25927?focusedWorklogId=721718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721718
 ]

ASF GitHub Bot logged work on HIVE-25927:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 00:43
Start Date: 07/Feb/22 00:43
Worklog Time Spent: 10m 
  Work Description: rbalamohan merged pull request #2998:
URL: https://github.com/apache/hive/pull/2998


   




Issue Time Tracking
---

Worklog Id: (was: 721718)
Time Spent: 0.5h  (was: 20m)

> Fix DataWritableReadSupport 
> 
>
> Key: HIVE-25927
> URL: https://issues.apache.org/jira/browse/HIVE-25927
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: Screenshot 2022-02-04 at 4.57.22 AM.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> !Screenshot 2022-02-04 at 4.57.22 AM.png|width=530,height=406!
> Takes n^2 ops to match columns.





[jira] [Resolved] (HIVE-25927) Fix DataWritableReadSupport

2022-02-06 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan resolved HIVE-25927.
-
Fix Version/s: 4.0.0
 Assignee: Rajesh Balamohan
   Resolution: Fixed

Thanks [~pgaref] , [~maheshk114] for the review.

> Fix DataWritableReadSupport 
> 
>
> Key: HIVE-25927
> URL: https://issues.apache.org/jira/browse/HIVE-25927
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot 2022-02-04 at 4.57.22 AM.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> !Screenshot 2022-02-04 at 4.57.22 AM.png|width=530,height=406!
> Takes n^2 ops to match columns.





[jira] [Work logged] (HIVE-25335) Unreasonable reducer count when joining a big-size table (with a small row count) and a small-size table

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25335?focusedWorklogId=721717&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721717
 ]

ASF GitHub Bot logged work on HIVE-25335:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 00:12
Start Date: 07/Feb/22 00:12
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2490:
URL: https://github.com/apache/hive/pull/2490#issuecomment-1030948557


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 721717)
Time Spent: 1h 40m  (was: 1.5h)

> Unreasonable reducer count when joining a big-size table (with a small row 
> count) and a small-size table
> --
>
> Key: HIVE-25335
> URL: https://issues.apache.org/jira/browse/HIVE-25335
> Project: Hive
>  Issue Type: Improvement
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25335.001.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> I found an application that is slow in our cluster because one reducer 
> processes a huge number of bytes, yet there are only two reducers. 
> When I debugged it, I found the reason: in this SQL, one big table (about 
> 30G) has a small row count (about 3.5M), while another small table (about 
> 100M) has a larger row count (about 3.6M). So JoinStatsRule.process uses only 
> the 100M to estimate the number of reducers, but we actually need to process 
> 30G of bytes.
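The estimation gap can be sketched as follows (assumed numbers and an assumed 256 MB bytes-per-reducer setting; this is not Hive's actual JoinStatsRule code): sizing parallelism from the smaller input alone yields far fewer reducers than the bytes actually processed warrant, while taking the max over both join inputs does not.

```java
public class ReducerEstimate {
    // Ceiling division of the largest input size by bytesPerReducer,
    // with a floor of one reducer.
    public static int reducers(long bytesPerReducer, long... inputBytes) {
        long max = 0;
        for (long b : inputBytes) {
            max = Math.max(max, b);
        }
        return (int) Math.max(1, (max + bytesPerReducer - 1) / bytesPerReducer);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024, gb = 1024L * mb;
        // Estimating from the 100 MB side alone: a single reducer.
        System.out.println(reducers(256 * mb, 100 * mb));          // 1
        // Including the 30 GB side: 120 reducers at 256 MB each.
        System.out.println(reducers(256 * mb, 100 * mb, 30 * gb)); // 120
    }
}
```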





[jira] [Work logged] (HIVE-25789) Replication metrics and logs show the wrong repl id when the number of events replicated is 0

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25789?focusedWorklogId=721716&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721716
 ]

ASF GitHub Bot logged work on HIVE-25789:
-

Author: ASF GitHub Bot
Created on: 07/Feb/22 00:12
Start Date: 07/Feb/22 00:12
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2854:
URL: https://github.com/apache/hive/pull/2854#issuecomment-1030948526


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 721716)
Remaining Estimate: 0h
Time Spent: 10m

> Replication metrics and logs show the wrong repl id when the number of events 
> replicated is 0
> -
>
> Key: HIVE-25789
> URL: https://issues.apache.org/jira/browse/HIVE-25789
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the number of events replicated in an incremental cycle is 0, logs and 
> metrics show the wrong value of lastReplId. The REPL STATUS command still 
> gives the right value. Logs show a value of 'null'.





[jira] [Updated] (HIVE-25789) Replication metrics and logs show the wrong repl id when the number of events replicated is 0

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25789:
--
Labels: pull-request-available  (was: )

> Replication metrics and logs show the wrong repl id when the number of events 
> replicated is 0
> -
>
> Key: HIVE-25789
> URL: https://issues.apache.org/jira/browse/HIVE-25789
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When the number of events replicated in an incremental cycle is 0, logs and 
> metrics show the wrong value of lastReplId. The REPL STATUS command still 
> gives the right value. Logs show a value of 'null'.





[jira] [Work logged] (HIVE-22224) Support Parquet-Avro Timestamp Type

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22224?focusedWorklogId=721676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721676
 ]

ASF GitHub Bot logged work on HIVE-22224:
-

Author: ASF GitHub Bot
Created on: 06/Feb/22 18:03
Start Date: 06/Feb/22 18:03
Worklog Time Spent: 10m 
  Work Description: ferozed opened a new pull request #3002:
URL: https://github.com/apache/hive/pull/3002


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Previously an external table created with logical-type of 
timestamp-millis would cause an exception. Now it should work.
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 721676)
Time Spent: 40m  (was: 0.5h)

> Support Parquet-Avro Timestamp Type
> ---
>
> Key: HIVE-22224
> URL: https://issues.apache.org/jira/browse/HIVE-22224
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 2.3.5, 2.3.6
>Reporter: cdmikechen
>Assignee: cdmikechen
>Priority: Major
>  Labels: parquet, pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a user creates an external table and imports parquet-avro data written 
> with version 1.8.2 (which supports logical_type) in Hive 2.3 or earlier, Hive 
> cannot read timestamp-type column data correctly.
> Hive reads it as a LongWritable, since the value is actually stored as a 
> long (logical_type=timestamp-millis). So we may add some code in 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableTimestampObjectInspector.java
>  to let Hive cast the long type to the timestamp type.
> Some code like below:
>  
> public Timestamp getPrimitiveJavaObject(Object o) {
>   if (o instanceof LongWritable) {
>     return new Timestamp(((LongWritable) o).get());
>   }
>   return o == null ? null : ((TimestampWritable) o).getTimestamp();
> }
>  
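A standalone illustration of the cast sketched above (assumed values, not the Hive patch itself): parquet-avro's timestamp-millis logical type stores epoch milliseconds as a long, which maps directly onto java.sql.Timestamp's millisecond constructor.

```java
import java.sql.Timestamp;

public class MillisToTimestamp {
    // Converts an epoch-milliseconds long, as stored by the
    // timestamp-millis logical type, into a java.sql.Timestamp.
    public static Timestamp fromMillis(long epochMillis) {
        return new Timestamp(epochMillis);
    }

    public static void main(String[] args) {
        Timestamp ts = fromMillis(0L);
        System.out.println(ts.getTime()); // 0
    }
}
```

The round trip through getTime() preserves the original long, which is what the proposed LongWritable branch relies on.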





[jira] [Updated] (HIVE-25929) Let secret config properties be propagated to Tez

2022-02-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25929:

Description: 
History in chronological order:
HIVE-10508: removed some passwords from config that's propagated to execution 
engines
HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
hardcoded list in HIVE-10508

the problem with HIVE-9013 is it's about to introduce a common method for 
removing sensitive data from Configuration, which absolutely makes sense in 
most of the cases (set command showing sensitive data), but can cause issues 
e.g. while using non-secure cloud connectors on a cluster, where instead of the 
hadoop credential provider API (which is considered the secure and proper way), 
passwords/secrets appear in the Configuration object (like: 
"fs.azure.account.oauth2.client.secret")

2 possible solutions:
1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> which 
defaults to "hive.conf.hidden.list" (configurable, but maybe just more 
confusing to users, having a new config property which should be understood and 
maintained on a cluster)
2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced by 
HIVE-10508 (convenient, less confusing for users, but cannot be configured)


  was:
History in chronological order:
HIVE-10508: removed some passwords from config that's propagated to execution 
engines
HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
hardcoded list in HIVE-10508

the problem with HIVE-9013 is it's about to introduce a common method for 
removing sensitive data from Configuration, which absolutely makes sense in 
most of the cases (set command showing sensitive data), but can cause issues 
e.g. while using non-secure cloud connectors on a cluster, where instead of the 
hadoop credential provider API (which is considered the secure and proper way), 
passwords/secrets appear in the Configuration object (like: 
"fs.azure.account.oauth2.client.secret")

2 possible solutions:
1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> which 
defaults to "hive.conf.hidden.list" (configurable, but maybe just confusing, 
having a new config property which should be understood and maintained on a 
cluster)
2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced by 
HIVE-10508 (convenient, less confusing for users, but cannot be configured)
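Solution 1 above could look like the following hive-site.xml fragment. This is an illustrative sketch only: the property name follows the proposal in this ticket and does not exist in Hive today, and the example value is an assumption:

```xml
<!-- Hypothetical sketch of solution 1: a separate hidden list consulted only
     when propagating the config to execution engines (Tez). The property name
     is the one proposed in this ticket; it is not an existing Hive setting. -->
<property>
  <name>hive.conf.hidden.list.exec.engines</name>
  <!-- would default to the value of hive.conf.hidden.list; a cluster that
       needs a secret visible to Tez (e.g. a cloud connector secret) could
       narrow it to exclude that key -->
  <value>javax.jdo.option.ConnectionPassword,hive.server2.keystore.password</value>
</property>
```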



> Let secret config properties to be propagated to Tez
> 
>
> Key: HIVE-25929
> URL: https://issues.apache.org/jira/browse/HIVE-25929
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> History in chronological order:
> HIVE-10508: removed some passwords from config that's propagated to execution 
> engines
> HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
> hardcoded list in HIVE-10508
> the problem with HIVE-9013 is it's about to introduce a common method for 
> removing sensitive data from Configuration, which absolutely makes sense in 
> most of the cases (set command showing sensitive data), but can cause issues 
> e.g. while using non-secure cloud connectors on a cluster, where instead of 
> the hadoop credential provider API (which is considered the secure and proper 
> way), passwords/secrets appear in the Configuration object (like: 
> "fs.azure.account.oauth2.client.secret")
> 2 possible solutions:
> 1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> 
> which defaults to "hive.conf.hidden.list" (configurable, but maybe just more 
> confusing to users, having a new config property which should be understood 
> and maintained on a cluster)
> 2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced 
> by HIVE-10508 (convenient, less confusing for users, but cannot be configured)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25929) Let secret config properties to be propagated to Tez

2022-02-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17487688#comment-17487688
 ] 

László Bodor commented on HIVE-25929:
-

either solution is easy; whichever way we choose, I'm open to opinions
cc: [~anishek], [~thejas]

> Let secret config properties to be propagated to Tez
> 
>
> Key: HIVE-25929
> URL: https://issues.apache.org/jira/browse/HIVE-25929
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> History in chronological order:
> HIVE-10508: removed some passwords from config that's propagated to execution 
> engines
> HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
> hardcoded list in HIVE-10508
> the problem with HIVE-9013 is it's about to introduce a common method for 
> removing sensitive data from Configuration, which absolutely makes sense in 
> most of the cases (set command showing sensitive data), but can cause issues 
> e.g. while using non-secure cloud connectors on a cluster, where instead of 
> the hadoop credential provider API (which is considered the secure and proper 
> way), passwords/secrets appear in the Configuration object (like: 
> "fs.azure.account.oauth2.client.secret")
> 2 possible solutions:
> 1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> 
> which defaults to "hive.conf.hidden.list" (configurable, but maybe just 
> confusing, having a new config property which should be understood and 
> maintained on a cluster)
> 2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced 
> by HIVE-10508 (convenient, less confusing for users, but cannot be configured)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25929) Let secret config properties to be propagated to Tez

2022-02-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25929:

Description: 
History in chronological order:
HIVE-10508: removed some passwords from config that's propagated to execution 
engines
HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
hardcoded list in HIVE-10508

the problem with HIVE-9013 is it's about to introduce a common method for 
removing sensitive data from Configuration, which absolutely makes sense in 
most of the cases (set command showing sensitive data), but can cause issues 
e.g. while using non-secure cloud connectors on a cluster, where instead of the 
hadoop credential provider API (which is considered the secure and proper way), 
passwords/secrets appear in the Configuration object (like: 
"fs.azure.account.oauth2.client.secret")

2 possible solutions:
1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> which 
defaults to "hive.conf.hidden.list" (configurable, but maybe just confusing, 
having a new config property which should be understood and maintained on a 
cluster)
2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced by 
HIVE-10508 (convenient, less confusing for users, but cannot be configured)


  was:
History in chronological order:
HIVE-10508: removed some passwords from config that's propagated to execution 
engines
HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
hardcoded list in HIVE-10508

the problem with HIVE-9013 is it's about to introduce a common method for 
removing sensitive data from Configuration, which absolutely makes sense in 
most of the cases (set command showing sensitive data), but can cause issues 
e.g. while using non-secure cloud connectors on a cluster, where instead of the 
hadoop credential provider API (which is considered the secure and proper way), 
passwords/secrets appear in the Configuration object

2 possible solutions:
1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> which 
defaults to "hive.conf.hidden.list" (configurable, but maybe just confusing, 
having a new config property which should be understood and maintained on a 
cluster)
2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced by 
HIVE-10508 (convenient, less confusing for users, but cannot be configured)



> Let secret config properties to be propagated to Tez
> 
>
> Key: HIVE-25929
> URL: https://issues.apache.org/jira/browse/HIVE-25929
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> History in chronological order:
> HIVE-10508: removed some passwords from config that's propagated to execution 
> engines
> HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
> hardcoded list in HIVE-10508
> the problem with HIVE-9013 is it's about to introduce a common method for 
> removing sensitive data from Configuration, which absolutely makes sense in 
> most of the cases (set command showing sensitive data), but can cause issues 
> e.g. while using non-secure cloud connectors on a cluster, where instead of 
> the hadoop credential provider API (which is considered the secure and proper 
> way), passwords/secrets appear in the Configuration object (like: 
> "fs.azure.account.oauth2.client.secret")
> 2 possible solutions:
> 1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> 
> which defaults to "hive.conf.hidden.list" (configurable, but maybe just 
> confusing, having a new config property which should be understood and 
> maintained on a cluster)
> 2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced 
> by HIVE-10508 (convenient, less confusing for users, but cannot be configured)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25929) Let secret config properties to be propagated to Tez

2022-02-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25929:

Description: 
History in chronological order:
HIVE-10508: removed some passwords from config that's propagated to execution 
engines
HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
hardcoded list in HIVE-10508

the problem with HIVE-9013 is it's about to introduce a common method for 
removing sensitive data from Configuration, which absolutely makes sense in 
most of the cases (set command showing sensitive data), but can cause issues 
e.g. while using non-secure cloud connectors on a cluster, where instead of the 
hadoop credential provider API (which is considered the secure and proper way), 
passwords/secrets appear in the Configuration object

2 possible solutions:
1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> which 
defaults to "hive.conf.hidden.list" (configurable, but maybe just confusing, 
having a new config property which should be understood and maintained on a 
cluster)
2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced by 
HIVE-10508 (convenient, less confusing for users, but cannot be configured)


> Let secret config properties to be propagated to Tez
> 
>
> Key: HIVE-25929
> URL: https://issues.apache.org/jira/browse/HIVE-25929
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> History in chronological order:
> HIVE-10508: removed some passwords from config that's propagated to execution 
> engines
> HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
> hardcoded list in HIVE-10508
> the problem with HIVE-9013 is it's about to introduce a common method for 
> removing sensitive data from Configuration, which absolutely makes sense in 
> most of the cases (set command showing sensitive data), but can cause issues 
> e.g. while using non-secure cloud connectors on a cluster, where instead of 
> the hadoop credential provider API (which is considered the secure and proper 
> way), passwords/secrets appear in the Configuration object
> 2 possible solutions:
> 1. introduce a new property like: "hive.conf.hidden.list.exec.engines" -> 
> which defaults to "hive.conf.hidden.list" (configurable, but maybe just 
> confusing, having a new config property which should be understood and 
> maintained on a cluster)
> 2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced 
> by HIVE-10508 (convenient, less confusing for users, but cannot be configured)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25929) Let secret config properties to be propagated to Tez

2022-02-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-25929:
---

Assignee: László Bodor

> Let secret config properties to be propagated to Tez
> 
>
> Key: HIVE-25929
> URL: https://issues.apache.org/jira/browse/HIVE-25929
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-24545) jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE

2022-02-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24545?focusedWorklogId=721601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721601
 ]

ASF GitHub Bot logged work on HIVE-24545:
-

Author: ASF GitHub Bot
Created on: 06/Feb/22 08:48
Start Date: 06/Feb/22 08:48
Worklog Time Spent: 10m 
  Work Description: abstractdog opened a new pull request #1789:
URL: https://github.com/apache/hive/pull/1789


   ### What changes were proposed in this pull request?
   We should use java.sql.getLargeUpdateCount() where it's possible. 
User-facing case is beeline output.
   
   ### Why are the changes needed?
   Because this can be confusing for the user on beeline output:
   ```
   20/12/16 01:37:36 [main]: WARN jdbc.HiveStatement: Number of rows is greater 
than Integer.MAX_VALUE
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, beeline is supposed to return row numbers > Integer.MAX_VALUE properly.
   
   ### How was this patch tested?
   Not yet tested.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 721601)
Time Spent: 1h 20m  (was: 1h 10m)

> jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE
> 
>
> Key: HIVE-24545
> URL: https://issues.apache.org/jira/browse/HIVE-24545
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I found this while IOW on TPCDS 10TB:
> {code}
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........      llap     SUCCEEDED   4210       4210        0        0       0     362
> Reducer 2 ......      llap     SUCCEEDED    101        101        0        0       0       2
> Reducer 3 ......      llap     SUCCEEDED   1009       1009        0        0       0       1
> ----------------------------------------------------------------------------------------------
> VERTICES: 03/03  [==========>>] 100%  ELAPSED TIME: 12613.62 s
> ----------------------------------------------------------------------------------------------
> 20/12/16 01:37:36 [main]: WARN jdbc.HiveStatement: Number of rows is greater than Integer.MAX_VALUE
> {code}
> my scenario was:
> {code}
> set hive.exec.max.dynamic.partitions=2000;
> drop table if exists test_sales_2;
> create table test_sales_2 like 
> tpcds_bin_partitioned_acid_orc_1.store_sales;
> insert overwrite table test_sales_2 select * from 
> tpcds_bin_partitioned_acid_orc_1.store_sales where ss_sold_date_sk > 
> 2451868;
> {code}
> regarding affected row numbers:
> {code}
> select count(*) from tpcds_bin_partitioned_acid_orc_1.store_sales where 
> ss_sold_date_sk > 2451868;
> +--+
> | _c0  |
> +--+
> | 12287871907  |
> +--+
> {code}
> I guess we should switch to long
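The switch to long can be illustrated directly: the row count above (12,287,871,907) exceeds Integer.MAX_VALUE, so any int-based row count (the legacy `getUpdateCount()` path) truncates it, which is why `getLargeUpdateCount()` is needed. A minimal sketch of the truncation, under no assumptions beyond the row count quoted in this issue:

```java
// Sketch of why an int row count fails for the 12,287,871,907-row example above:
// casting the long to int keeps only the low 32 bits.
public class RowCountOverflow {
    public static void main(String[] args) {
        long rows = 12_287_871_907L;                  // count(*) from the issue
        int truncated = (int) rows;                   // what an int-based API would carry
        System.out.println(rows > Integer.MAX_VALUE); // prints true -> triggers the warning
        System.out.println(truncated);                // prints a negative, meaningless value
    }
}
```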



--
This message was sent by Atlassian Jira
(v8.20.1#820001)