[jira] [Work logged] (HIVE-24188) CTLT from MM to External fails because table txn properties are not skipped

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24188?focusedWorklogId=487942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487942
 ]

ASF GitHub Bot logged work on HIVE-24188:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 05:01
Start Date: 22/Sep/20 05:01
Worklog Time Spent: 10m 
  Work Description: nareshpr opened a new pull request #1516:
URL: https://github.com/apache/hive/pull/1516


   What changes were proposed in this pull request?
   Included a check to skip TXN tblproperties for an external table created from an MM table
   
   Why are the changes needed?
   CTLT from an MM table to an external table is failing
   
   Does this PR introduce any user-facing change?
   no
   
   How was this patch tested?
   Used the repro SQL and included it as part of a test case.
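
   For illustration, a minimal sketch of the kind of check described above (hypothetical names; the actual change in this PR may differ):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: when the CTLT target is an external table, drop the
// transactional properties inherited from the MM source, since an external
// table cannot be declared transactional.
public final class CtltPropsSketch {
  public static Map<String, String> skipTxnProps(Map<String, String> sourceProps,
      boolean targetIsExternal) {
    Map<String, String> props = new HashMap<>(sourceProps);
    if (targetIsExternal) {
      props.remove("transactional");            // e.g. 'transactional'='true'
      props.remove("transactional_properties"); // e.g. 'default' (insert-only/MM)
    }
    return props;
  }
}
{code}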





Issue Time Tracking
---

Worklog Id: (was: 487942)
Remaining Estimate: 0h
Time Spent: 10m

> CTLT from MM to External fails because table txn properties are not skipped
> ---
>
> Key: HIVE-24188
> URL: https://issues.apache.org/jira/browse/HIVE-24188
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro steps
>  
> {code:java}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create table test_mm(age int, name string) partitioned by(dept string) stored 
> as orc tblproperties('transactional'='true', 
> 'transactional_properties'='default');
> create external table test_external like test_mm LOCATION 
> '${system:test.tmp.dir}/create_like_mm_to_external';
> {code}
> Fails with the exception below
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:default.test_external cannot be declared transactional 
> because it's an external table) (state=08S01,code=1){code}





[jira] [Updated] (HIVE-24188) CTLT from MM to External fails because table txn properties are not skipped

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24188:
--
Labels: pull-request-available  (was: )

> CTLT from MM to External fails because table txn properties are not skipped
> ---
>
> Key: HIVE-24188
> URL: https://issues.apache.org/jira/browse/HIVE-24188
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Repro steps
>  
> {code:java}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create table test_mm(age int, name string) partitioned by(dept string) stored 
> as orc tblproperties('transactional'='true', 
> 'transactional_properties'='default');
> create external table test_external like test_mm LOCATION 
> '${system:test.tmp.dir}/create_like_mm_to_external';
> {code}
> Fails with the exception below
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:default.test_external cannot be declared transactional 
> because it's an external table) (state=08S01,code=1){code}





[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487930&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487930
 ]

ASF GitHub Bot logged work on HIVE-24187:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 04:25
Start Date: 22/Sep/20 04:25
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #1515:
URL: https://github.com/apache/hive/pull/1515#discussion_r492467429



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
##
@@ -1963,4 +2062,12 @@ private void setupUDFJarOnHDFS(Path 
identityUdfLocalPath, Path identityUdfHdfsPa
 FileSystem fs = primary.miniDFSCluster.getFileSystem();
 fs.copyFromLocalFile(identityUdfLocalPath, identityUdfHdfsPath);
   }
+
+  private List getHdfsNamespaceClause() {

Review comment:
   replace with nameservice







Issue Time Tracking
---

Worklog Id: (was: 487930)
Time Spent: 40m  (was: 0.5h)

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24187.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, HA is supported only for different nameservice names on source and 
> destination. We need to add support for the same nameservice name on source 
> and destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> it.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, say for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data locations) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, replacement of the nameservice can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
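
To illustrate the nameservice replacement described above, a minimal sketch (illustrative only, not the actual patch) of rewriting the authority of an HDFS URI:

{code:java}
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative sketch: rewrite the nameservice (URI authority), e.g.
// hdfs://ns1/whLoc/dbName.db/table1 -> hdfs://nsRemote/whLoc/dbName.db/table1.
public final class NameserviceRewriteSketch {
  public static String replaceNameservice(String hdfsUri, String remoteNs)
      throws URISyntaxException {
    URI uri = new URI(hdfsUri);
    return new URI(uri.getScheme(), remoteNs, uri.getPath(), uri.getQuery(),
        uri.getFragment()).toString();
  }
}
{code}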





[jira] [Assigned] (HIVE-24188) CTLT from MM to External fails because table txn properties are not skipped

2020-09-21 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R reassigned HIVE-24188:
-


> CTLT from MM to External fails because table txn properties are not skipped
> ---
>
> Key: HIVE-24188
> URL: https://issues.apache.org/jira/browse/HIVE-24188
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>
> Repro steps
>  
> {code:java}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create table test_mm(age int, name string) partitioned by(dept string) stored 
> as orc tblproperties('transactional'='true', 
> 'transactional_properties'='default');
> create external table test_external like test_mm LOCATION 
> '${system:test.tmp.dir}/create_like_mm_to_external';
> {code}
> Fails with the exception below
> {code:java}
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:default.test_external cannot be declared transactional 
> because it's an external table) (state=08S01,code=1){code}





[jira] [Work logged] (HIVE-23637) Fix FindBug issues in hive-cli

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23637?focusedWorklogId=487667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487667
 ]

ASF GitHub Bot logged work on HIVE-23637:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:25
Start Date: 22/Sep/20 03:25
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1162:
URL: https://github.com/apache/hive/pull/1162


   





Issue Time Tracking
---

Worklog Id: (was: 487667)
Time Spent: 40m  (was: 0.5h)

> Fix FindBug issues in hive-cli
> --
>
> Key: HIVE-23637
> URL: https://issues.apache.org/jira/browse/HIVE-23637
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24163?focusedWorklogId=487658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487658
 ]

ASF GitHub Bot logged work on HIVE-24163:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:25
Start Date: 22/Sep/20 03:25
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on pull request #1507:
URL: https://github.com/apache/hive/pull/1507#issuecomment-696209500


   The file listing in the Utilities.getFullDPSpecs method was not correct for 
MM tables, or for ACID tables when direct insert was on: it returned all 
partitions of these tables, not just the ones affected by the current query. 
Because of this, the lineage information for dynamic-partition inserts into 
such tables was not correct. I compared the lineage information with that of 
inserts into external tables, where only the partitions affected by the query 
are present. This is because for external tables the data is first written 
into the staging dir, and when the partitions are listed, this directory is 
checked and contains only the newly inserted data. But for MM tables and ACID 
direct insert there is no staging dir, so the table directory is checked and 
everything in it is listed.
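
A minimal sketch of the directory-listing pattern at issue (hypothetical names, not Hive's actual code): pointing the listing at the staging dir yields only the newly written partitions, while pointing it at the table dir yields every existing one.

{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: collect dynamic-partition names by listing a directory. Against the
// staging dir this returns only the query's partitions; against the table dir
// (the MM/ACID direct-insert case) it returns all of them.
final class DpListingSketch {
  static List<String> listPartitionDirs(FileSystem fs, Path root) throws Exception {
    List<String> partitions = new ArrayList<>();
    for (FileStatus status : fs.listStatus(root)) {
      if (status.isDirectory()) {
        partitions.add(status.getPath().getName()); // e.g. "dept=sales"
      }
    }
    return partitions;
  }
}
{code}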





Issue Time Tracking
---

Worklog Id: (was: 487658)
Time Spent: 0.5h  (was: 20m)

> Dynamic Partitioning Insert fail for MM table fail during MoveTask
> --
>
> Key: HIVE-24163
> URL: https://issues.apache.org/jira/browse/HIVE-24163
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajkumar Singh
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.2
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> -- DDLs and Query
> {code:java}
> create table `class` (name varchar(8), sex varchar(1), age double precision, 
> height double precision, weight double precision);
> insert into table class values ('RAJ','MALE',28,12,12);
> CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` 
> VARCHAR(1)) PARTITIONED BY(Weight string, Age
> string, Height string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' 
> LINES TERMINATED BY '\012' STORED AS TEXTFILE;
> INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
> `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
> {code}
> It fails during MoveTask execution:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
> hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
>  is not a directory!
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
>  ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 

[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487718
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:29
Start Date: 22/Sep/20 03:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r491922128



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
 
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   Do we create a txnManager instance here just to get the value of the 
useNewLocksFormat flag?

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
 
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   TxnManager is initialized before query compilation and has session scope. 
The correct way to access it is via SessionState. It looks like 
ShowDbLocksAnalyzer was always creating a new instance of TxnManager.
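
A sketch of the session-scoped access pattern suggested above (assuming an active SessionState for the current thread; not the actual patch):

{code:java}
import org.apache.hadoop.hive.ql.lockmgr.HiveTxnManager;
import org.apache.hadoop.hive.ql.session.SessionState;

// Sketch: reuse the session's TxnManager instead of instantiating a new one
// while compiling SHOW LOCKS.
final class SessionTxnManagerSketch {
  static boolean useNewLocksFormat() throws Exception {
    HiveTxnManager txnManager = SessionState.get().getTxnMgr();
    return txnManager.useNewShowLocksFormat();
  }
}
{code}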







Issue Time Tracking
---

Worklog Id: (was: 487718)
Time Spent: 1.5h  (was: 1h 20m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The problem can be reproduced by executing repeatedly a SHOW LOCK statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with OutOfMemory error such as 
> the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at 

[jira] [Work logged] (HIVE-23900) Replace Base64 in exec Package

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23900?focusedWorklogId=487908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487908
 ]

ASF GitHub Bot logged work on HIVE-23900:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:45
Start Date: 22/Sep/20 03:45
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1297:
URL: https://github.com/apache/hive/pull/1297#issuecomment-696455652


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 487908)
Time Spent: 0.5h  (was: 20m)

> Replace Base64 in exec Package
> --
>
> Key: HIVE-23900
> URL: https://issues.apache.org/jira/browse/HIVE-23900
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487928
 ]

ASF GitHub Bot logged work on HIVE-24187:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 04:00
Start Date: 22/Sep/20 04:00
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #1515:
URL: https://github.com/apache/hive/pull/1515#discussion_r492461348



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java
##
@@ -424,6 +424,20 @@ public String encodeFileUri(String fileUriStr, String fileChecksum, String encod
     return encodedUri;
   }
 
+  public static String encodeFileUri(String fileUriStr, String fileChecksum, String cmroot, String encodedSubDir) {
+    String encodedUri = fileUriStr;
+    if ((fileChecksum != null) && (cmroot != null)) {
+      encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + fileChecksum + URI_FRAGMENT_SEPARATOR + cmroot;
+    } else {
+      encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + URI_FRAGMENT_SEPARATOR;

Review comment:
   why do we have 2 URI_FRAGMENT_SEPARATOR
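
Presumably the two separators keep the fragment positions fixed when checksum and cmroot are absent; a comment-only illustration (the separator character is an assumption here, taken to be "#"):

{code:java}
// Assumed separator "#" (ReplChangeManager.URI_FRAGMENT_SEPARATOR):
// with checksum and cmroot:
//   hdfs://ns1/wh/db.db/t1/f1#md5-123#hdfs://ns1/cmroot#subDir
// without them, the empty fragments keep field positions stable:
//   hdfs://ns1/wh/db.db/t1/f1###subDir
{code}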

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java
##
@@ -424,6 +424,20 @@ public String encodeFileUri(String fileUriStr, String fileChecksum, String encod
     return encodedUri;
   }
 
+  public static String encodeFileUri(String fileUriStr, String fileChecksum, String cmroot, String encodedSubDir) {
+    String encodedUri = fileUriStr;
+    if ((fileChecksum != null) && (cmroot != null)) {
+      encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + fileChecksum + URI_FRAGMENT_SEPARATOR + cmroot;
+    } else {
+      encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + URI_FRAGMENT_SEPARATOR;
+    }
+    encodedUri = encodedUri + URI_FRAGMENT_SEPARATOR + ((encodedSubDir != null) ? encodedSubDir : "");
+    if (LOG.isDebugEnabled()) {

Review comment:
   Do we need this check?

##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -522,6 +522,14 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
     REPLCMINTERVAL("hive.repl.cm.interval","3600s",
         new TimeValidator(TimeUnit.SECONDS),
         "Inteval for cmroot cleanup thread."),
+    REPL_HA_DATAPATH_REPLACE_REMOTE_NAMESERVICE("hive.repl.ha.datapath.replace.remote.nameservice", false,
+        "When HDFS is HA enabled and both source and target clusters are configured with same nameservice names," +
+        "enable this flag and provide a "),

Review comment:
   sentence is incomplete

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java
##
@@ -424,6 +424,20 @@ public String encodeFileUri(String fileUriStr, String fileChecksum, String encod
     return encodedUri;
   }
 
+  public static String encodeFileUri(String fileUriStr, String fileChecksum, String cmroot, String encodedSubDir) {
+    String encodedUri = fileUriStr;
+    if ((fileChecksum != null) && (cmroot != null)) {

Review comment:
   empty check not needed?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/Utils.java
##
@@ -72,6 +76,40 @@ public static void writeOutput(List> listValues, Path outputFile, H
     writeOutput(listValues, outputFile, hiveConf, false);
   }
 
+  /**
+   * Given a ReplChangeManger's encoded uri, replaces the namespace and returns the modified encoded uri.
+   */
+  public static String replaceNameSpaceInEncodedURI(String cmEncodedURI, HiveConf hiveConf) throws SemanticException {

Review comment:
   replace name service?







Issue Time Tracking
---

Worklog Id: (was: 487928)
Time Spent: 0.5h  (was: 20m)

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24187.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Current HA is supported only 

[jira] [Work logged] (HIVE-23793) Review of QueryInfo Class

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23793?focusedWorklogId=487648&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487648
 ]

ASF GitHub Bot logged work on HIVE-23793:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:24
Start Date: 22/Sep/20 03:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1197:
URL: https://github.com/apache/hive/pull/1197


   





Issue Time Tracking
---

Worklog Id: (was: 487648)
Time Spent: 2h 50m  (was: 2h 40m)

> Review of QueryInfo Class
> -
>
> Key: HIVE-23793
> URL: https://issues.apache.org/jira/browse/HIVE-23793
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-16490) Hive should not use private HDFS APIs for encryption

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16490?focusedWorklogId=487725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487725
 ]

ASF GitHub Bot logged work on HIVE-16490:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:30
Start Date: 22/Sep/20 03:30
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1279:
URL: https://github.com/apache/hive/pull/1279#issuecomment-695860130


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 487725)
Time Spent: 1h 40m  (was: 1.5h)

> Hive should not use private HDFS APIs for encryption
> 
>
> Key: HIVE-16490
> URL: https://issues.apache.org/jira/browse/HIVE-16490
> Project: Hive
>  Issue Type: Improvement
>  Components: Encryption
>Affects Versions: 2.2.0
>Reporter: Andrew Wang
>Assignee: Naveen Gangam
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When compiling against bleeding edge versions of Hive and Hadoop, we 
> discovered that HIVE-16047 references a private HDFS API, DFSClient, to get 
> at various encryption related information. The private API was recently 
> changed by HADOOP-14104, which broke Hive compilation.
> It'd be better to instead use publicly supported APIs. HDFS-11687 has been 
> filed to add whatever encryption APIs are needed by Hive. This JIRA is to 
> move Hive over to these new APIs.





[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487707
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:29
Start Date: 22/Sep/20 03:29
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r492225545



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
 
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   That's correct. Normally, we should (but I am not 100% sure) have a 
TxnManager at this point so there is no need to create a new one just to obtain 
the flag. I pushed commit 
https://github.com/apache/hive/pull/1509/commits/297882ee80d52689a9cc1c68da9f7580918439bb
 to try this out.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
 
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   Did you check the last commit? Do you have something else in mind?







Issue Time Tracking
---

Worklog Id: (was: 487707)
Time Spent: 1h 20m  (was: 1h 10m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The problem can be reproduced by executing repeatedly a SHOW LOCK statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with OutOfMemory error such as 
> the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> 

[jira] [Work logged] (HIVE-23640) Fix FindBug issues in hive-druid-handler

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23640?focusedWorklogId=487527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487527
 ]

ASF GitHub Bot logged work on HIVE-23640:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:13
Start Date: 22/Sep/20 03:13
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1164:
URL: https://github.com/apache/hive/pull/1164


   





Issue Time Tracking
---

Worklog Id: (was: 487527)
Time Spent: 40m  (was: 0.5h)

> Fix FindBug issues in hive-druid-handler
> 
>
> Key: HIVE-23640
> URL: https://issues.apache.org/jira/browse/HIVE-23640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23882) Compiler should skip MJ keyExpr for probe optimization

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23882?focusedWorklogId=487755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487755
 ]

ASF GitHub Bot logged work on HIVE-23882:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:33
Start Date: 22/Sep/20 03:33
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1286:
URL: https://github.com/apache/hive/pull/1286#issuecomment-695860126


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 487755)
Time Spent: 40m  (was: 0.5h)

> Compiler should skip MJ keyExpr for probe optimization
> --
>
> Key: HIVE-23882
> URL: https://issues.apache.org/jira/browse/HIVE-23882
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In probe we cannot currently support key expressions (on the big table 
> side), as ORC CVs probe the small-table HT directly (there is no expression 
> evaluation at that level).
> TezCompiler should take this into account when picking MJs to push probe 
> details to.





[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487618&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487618
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:21
Start Date: 22/Sep/20 03:21
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492224561



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
##
@@ -63,19 +63,19 @@
 
   private VectorizationContext taskVectorizationContext;
 
-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
   I actually tried not keeping these fields but I was running into all 
sorts of issues like unable to serialize/de-serialize or plan generating 
without metadata etc. 
   I am not sure if we need to keep all of these fields or we can selectively 
choose, I went by almost all in interest of time. If Gopal or Rajesh thinks 
that this may cause performance issue I can open a follow-up to investigate and 
choose fields selectively.

##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;
 
+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment:
   Original operator tree shape is changed when going through physical 
transformations and task generation (don't know why though), as a result this 
operator tree can not be used later to regenerate tasks or re-running physical 
transformations. Therefore we make a copy and cache it after operator tree is 
generated.
   I will leave a comment.

##
File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
   Output:["_col0"]
   TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]

Review comment:
   Yeah I think this is likely side effect of some changes in w.r.t 
serialization/de-serialization. Although this is positive side effect now that 
we have more information in explain plan.

##
File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
   Output:["_col0"]
   TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]
   <-Map 6 [CONTAINS] vectorized, llap
 Reduce Output Operator [RS_45]
   Limit [LIM_44] (rows=1 width=2)
 Number of rows:1
 Select Operator [SEL_43] (rows=1 width=0)
   Output:["_col0"]
   TableScan [TS_29] (rows=1 width=0)
+default@tb2,tb2,Tbl:PARTIAL,Col:COMPLETE

Review comment:
   I confirmed that this is expected. I compared this plan against master 
(with explain.user set to false) and there is no difference in the plan.







Issue Time Tracking
---

Worklog Id: (was: 487618)
Time Spent: 1h 20m  (was: 1h 10m)

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Current partition pruning (compile time) isn't kicked in for EXECUTE 
> statements.





[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487584&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487584
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:18
Start Date: 22/Sep/20 03:18
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r491751273



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
##
@@ -253,14 +191,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String queryName = getQueryName(root);
     if (ss.getPreparePlans().containsKey(queryName)) {
       // retrieve cached plan from session state
-      BaseSemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);
+      SemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);
 
       // make copy of the plan
-      createTaskCopy(cachedPlan);
+      //createTaskCopy(cachedPlan);

Review comment:
   Can remove line commented out.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementAnalyzer.java
##
@@ -54,6 +58,21 @@ private void savePlan(String queryName) throws SemanticException{
     ss.getPreparePlans().put(queryName, this);
   }
 
+  private <T> T makeCopy(final Object task, Class<T> objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();

Review comment:
   Can we leave a comment on this method to understand what it is trying to 
do?
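
A self-contained sketch of the serialization round-trip this method appears to perform (the real code may use Hive's plan serialization utilities rather than plain java.io serialization):

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Sketch: deep-copy an object by serializing and deserializing it, so the
// cached prepared plan is never mutated by a later EXECUTE.
final class DeepCopySketch {
  static <T> T makeCopy(Object obj, Class<T> objClass) throws Exception {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
      oos.writeObject(obj); // serialize the cached object graph
    }
    try (ObjectInputStream ois =
        new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
      return objClass.cast(ois.readObject()); // fresh, independent copy
    }
  }
}
{code}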

##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
##
@@ -63,19 +63,19 @@
 
   private VectorizationContext taskVectorizationContext;
 
-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
   Do we need to keep all these fields for the plan cache in the operator, 
table, etc.? I am wondering about the implications of keeping them when the 
operator plan is serialized (i.e., whether that could have a performance 
impact). @t3rmin4t0r , @rbalamohan , could you comment on this?

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
##
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);
+      if (!ctx.getExplainLogical()) {
+        TaskCompiler compiler = TaskCompilerFactory.getCompiler(conf, pctxt);
+        compiler.init(queryState, console, db);
+        compiler.compile(pctxt, rootTasks, inputs, outputs);
+        fetchTask = pctxt.getFetchTask();
+        //fetchTask = makeCopy(cachedPlan.getFetchTask(), cachedPlan.getFetchTask().getClass());

Review comment:
   This comment too.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java
##
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);

Review comment:
   Same, can be removed?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;
 
+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment:
   Why do you need to keep a copy instead of using the original operators? 
Could you leave a comment on that?

##
File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
   Output:["_col0"]
   

[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487796
 ]

ASF GitHub Bot logged work on HIVE-24187:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:36
Start Date: 22/Sep/20 03:36
Worklog Time Spent: 10m 
  Work Description: pkumarsinha opened a new pull request #1515:
URL: https://github.com/apache/hive/pull/1515


   …e name on source and destination
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 487796)
Time Spent: 20m  (was: 10m)

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24187.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, HA is supported only for different nameservice names on source and 
> destination. We need to add support for the same nameservice name on source 
> and destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> it.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, say for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data locations) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, replacement of the nameservice can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|





[jira] [Work logged] (HIVE-24185) Upgrade snappy-java to 1.1.7.5

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24185?focusedWorklogId=487781&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487781
 ]

ASF GitHub Bot logged work on HIVE-24185:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:35
Start Date: 22/Sep/20 03:35
Worklog Time Spent: 10m 
  Work Description: pgaref opened a new pull request #1513:
URL: https://github.com/apache/hive/pull/1513


   Change-Id: I6d314e48f96006f549974d1907a0d6de563d7250
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 487781)
Time Spent: 20m  (was: 10m)

> Upgrade snappy-java to 1.1.7.5
> --
>
> Key: HIVE-24185
> URL: https://issues.apache.org/jira/browse/HIVE-24185
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Bump version to take advantage of perf improvements, glibc compatibility etc.
> https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30





[jira] [Work logged] (HIVE-23639) Fix FindBug issues in hive-contrib

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23639?focusedWorklogId=487872&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487872
 ]

ASF GitHub Bot logged work on HIVE-23639:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:42
Start Date: 22/Sep/20 03:42
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1163:
URL: https://github.com/apache/hive/pull/1163


   





Issue Time Tracking
---

Worklog Id: (was: 487872)
Time Spent: 40m  (was: 0.5h)

> Fix FindBug issues in hive-contrib
> --
>
> Key: HIVE-23639
> URL: https://issues.apache.org/jira/browse/HIVE-23639
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23870) Optimise multiple text conversions in WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23870?focusedWorklogId=487852&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487852
 ]

ASF GitHub Bot logged work on HIVE-23870:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:40
Start Date: 22/Sep/20 03:40
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1282:
URL: https://github.com/apache/hive/pull/1282#issuecomment-696455663


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 487852)
Time Spent: 1h  (was: 50m)

> Optimise multiple text conversions in 
> WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable
> ---
>
> Key: HIVE-23870
> URL: https://issues.apache.org/jira/browse/HIVE-23870
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: image-2020-07-17-11-31-38-241.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Observed this when creating materialized view.
> [https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableHiveCharObjectInspector.java#L85]
> Same content is converted to Text multiple times.
> !image-2020-07-17-11-31-38-241.png|width=1048,height=936!





[jira] [Work logged] (HIVE-24184) Re-order methods in Driver

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24184?focusedWorklogId=487543&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487543
 ]

ASF GitHub Bot logged work on HIVE-24184:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:14
Start Date: 22/Sep/20 03:14
Worklog Time Spent: 10m 
  Work Description: miklosgergely opened a new pull request #1512:
URL: https://github.com/apache/hive/pull/1512


   ### What changes were proposed in this pull request?
   Driver is still a huge class with a lot of methods. Their order does not 
reflect the process performed by the Driver (compilation, execution, result 
providing, closing). Also, the constructors are not at the beginning of the 
class. All of this makes the class harder to read; re-ordering the methods 
would make it easier.
   
   ### Why are the changes needed?
   To make the Driver class cleaner, thus easier to read/understand.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   All the tests are still passing.





Issue Time Tracking
---

Worklog Id: (was: 487543)
Time Spent: 20m  (was: 10m)

> Re-order methods in Driver
> --
>
> Key: HIVE-24184
> URL: https://issues.apache.org/jira/browse/HIVE-24184
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Driver is still a huge class with a lot of methods. Their order does not 
> reflect the process performed by the Driver (compilation, execution, result 
> providing, closing). Also, the constructors are not at the beginning of the 
> class. All of this makes the class harder to read; re-ordering the methods 
> would make it easier.





[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=487458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487458
 ]

ASF GitHub Bot logged work on HIVE-24159:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m 
  Work Description: abstractdog merged pull request #1495:
URL: https://github.com/apache/hive/pull/1495


   





Issue Time Tracking
---

Worklog Id: (was: 487458)
Time Spent: 50m  (was: 40m)

> Kafka storage handler broken in secure environment pt2: short-circuit on 
> non-secure environment
> ---
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized 
> upstream that the kafka qtest fails. Instead of setting up a kerberized 
> environment in qtest (which doesn't seem to be a usual use case, e.g. I 
> haven't seen hive.server2.authentication.kerberos.principal used in *.q 
> files), I managed to make the test pass with a simple 
> UserGroupInformation.isSecurityEnabled() check, which can also be useful in 
> every non-secure environment.
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] 
> tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>   at 
> org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451)
>  ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> 
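
A minimal sketch of the short-circuit described above (the wrapper class and 
method name are illustrative, not the actual DagUtils code):

{code:java}
import org.apache.hadoop.security.UserGroupInformation;

final class KafkaTokenGuardSketch {
  /**
   * Decide up front whether a Kafka delegation token is needed. On a
   * non-kerberized cluster there is no token to fetch, and attempting
   * the fetch is what produced the KafkaException above.
   */
  static boolean needsKafkaDelegationToken() {
    // Static, cheap check; safe to call on every DAG submission.
    return UserGroupInformation.isSecurityEnabled();
  }
}
{code}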

[jira] [Work logged] (HIVE-23728) Run metastore verification tests during precommit

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23728?focusedWorklogId=487463=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487463
 ]

ASF GitHub Bot logged work on HIVE-23728:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1154:
URL: https://github.com/apache/hive/pull/1154#issuecomment-696455704


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487463)
Time Spent: 0.5h  (was: 20m)

> Run metastore verification tests during precommit
> -
>
> Key: HIVE-23728
> URL: https://issues.apache.org/jira/browse/HIVE-23728
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23838) KafkaRecordIteratorTest is flaky

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23838?focusedWorklogId=487460=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487460
 ]

ASF GitHub Bot logged work on HIVE-23838:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:03
Start Date: 22/Sep/20 03:03
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1245:
URL: https://github.com/apache/hive/pull/1245


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487460)
Time Spent: 2h  (was: 1h 50m)

> KafkaRecordIteratorTest is flaky
> 
>
> Key: HIVE-23838
> URL: https://issues.apache.org/jira/browse/HIVE-23838
> Project: Hive
>  Issue Type: Bug
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Failed on [4th run of flaky test 
> checker|http://ci.hive.apache.org/job/hive-flaky-check/69/] with
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after 
> 1milliseconds while awaiting InitProducerId



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23877) Hive on Spark incorrect partition pruning ANALYZE TABLE

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23877?focusedWorklogId=487455=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487455
 ]

ASF GitHub Bot logged work on HIVE-23877:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:02
Start Date: 22/Sep/20 03:02
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1278:
URL: https://github.com/apache/hive/pull/1278#issuecomment-696455677


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487455)
Time Spent: 50m  (was: 40m)

> Hive on Spark incorrect partition pruning ANALYZE TABLE
> ---
>
> Key: HIVE-23877
> URL: https://issues.apache.org/jira/browse/HIVE-23877
> Project: Hive
>  Issue Type: Bug
>Reporter: Han
>Assignee: Han
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Partitions are pruned based on the partition specification in the ANALYZE 
> TABLE command and cached in TableSpec.
> When compiling, it is unnecessary to use PartitionPruner.prune() to get the 
> partitions again. Moreover, PartitionPruner cannot prune partitions for the 
> ANALYZE TABLE command, so it returns all partitions.
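
A sketch of the intended behavior, with hypothetical method and parameter 
names (the real TableSpec/PartitionPruner APIs are shaped differently):

{code:java}
import java.util.List;

class AnalyzePartitionSelectionSketch {
  /**
   * Prefer the partitions already resolved from the ANALYZE TABLE
   * partition spec over re-running the pruner, which cannot see the
   * spec and would enumerate every partition of the table.
   */
  static List<String> partitionsToScan(List<String> cachedFromTableSpec,
      List<String> allPartitions) {
    if (cachedFromTableSpec != null && !cachedFromTableSpec.isEmpty()) {
      return cachedFromTableSpec; // already pruned when the spec was parsed
    }
    return allPartitions; // no spec given: analyze every partition
  }
}
{code}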



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24172) Fix TestMmCompactorOnMr

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24172?focusedWorklogId=487421=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487421
 ]

ASF GitHub Bot logged work on HIVE-24172:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 02:59
Start Date: 22/Sep/20 02:59
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1514:
URL: https://github.com/apache/hive/pull/1514


   Setting the execution engine to MR in the driver field 
(driver.getConf().setBoolVar(...)) only affects the queries run in setup and 
teardown.
   Compaction runs using the conf field, so the execution engine needed to be 
set to MR in conf for compaction to pick it up.
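
   A rough illustration of the distinction, as a hypothetical test helper 
(field and method names are made up for the sketch):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class MmCompactorConfSketch {
  // Shared configuration that the compactor threads read.
  private final HiveConf conf = new HiveConf();

  void forceMrForCompaction() {
    // Setting the engine on a per-driver copy of the configuration only
    // affects the queries that driver runs (setup/teardown). Compaction
    // spawns its jobs from `conf`, so the override must land here.
    conf.setVar(HiveConf.ConfVars.HIVE_EXECUTION_ENGINE, "mr");
  }
}
{code}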



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487421)
Time Spent: 20m  (was: 10m)

> Fix TestMmCompactorOnMr
> ---
>
> Key: HIVE-24172
> URL: https://issues.apache.org/jira/browse/HIVE-24172
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> test is unstable;
> http://ci.hive.apache.org/job/hive-flaky-check/112/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22098) Data loss occurs when multiple tables are join with different bucket_version

2020-09-21 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-22098:

Attachment: join_test.sql

> Data loss occurs when multiple tables are join with different bucket_version
> 
>
> Key: HIVE-22098
> URL: https://issues.apache.org/jira/browse/HIVE-22098
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Assignee: yongtaoliao
>Priority: Blocker
>  Labels: data-loss, wrongresults
> Attachments: HIVE-22098.1.patch, image-2019-08-12-18-45-15-771.png, 
> join_test.sql, table_a_data.orc, table_b_data.orc, table_c_data.orc
>
>
> When tables with different bucketing versions are joined and the number of 
> reducers is greater than 2, the result is incorrect (*data loss*).
>  *Scenario 1*: Three tables are joined. The intermediate result of joining 
> table_a and table_b is recorded as tmp_a_b. When tmp_a_b is joined with the 
> third table (tables created after Hive 3.0.0 default to bucket_version=2), 
> tmp_a_b is initialized with bucketVersion=-1, and the ReduceSinkOperator 
> joins with bucketVersion=-1. In the init method, the hash algorithm for the 
> join column is selected according to bucketVersion: if bucketVersion = 2 and 
> the operation is not an ACID operation, the new hash algorithm is used; 
> otherwise, the old hash algorithm is used. Because the hash algorithms are 
> inconsistent, rows are allocated to different partitions, and at the Reducer 
> stage rows with the same key cannot be paired, resulting in data loss.
> *Scenario 2*: Create two test tables: create table 
> table_bucketversion_1(col_1 string, col_2 string) TBLPROPERTIES 
> ('bucketing_version'='1'); create table table_bucketversion_2(col_1 string, 
> col_2 string) TBLPROPERTIES ('bucketing_version'='2');
>  when table_bucketversion_1 is joined with table_bucketversion_2, part of 
> the result data is lost because the bucketing versions differ.
>  
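
To see why mixed hash algorithms lose rows, here is a self-contained sketch 
(plain Java stand-ins, not Hive's actual bucketing hash functions) of two join 
sides routing the same key to different reducers:

{code:java}
public class BucketVersionMismatchDemo {
  // Stand-in for the "old" (bucketing_version=1) hash.
  static int oldHash(String key) {
    return key.hashCode();
  }

  // Stand-in for the "new" (bucketing_version=2) hash; any different
  // mixing function reproduces the effect (constants borrowed from the
  // Murmur3 finalizer).
  static int newHash(String key) {
    int h = key.hashCode();
    h ^= h >>> 16;
    h *= 0x85ebca6b;
    h ^= h >>> 13;
    return h;
  }

  public static void main(String[] args) {
    int reducers = 3;
    for (int i = 0; i < 10; i++) {
      String key = "join-key-" + i;
      int left = Math.floorMod(oldHash(key), reducers);
      int right = Math.floorMod(newHash(key), reducers);
      if (left != right) {
        // The two sides of the join land on different reducers, so rows
        // with this key never meet and silently drop from the result.
        System.out.printf("%s: left -> reducer %d, right -> reducer %d%n",
            key, left, right);
      }
    }
  }
}
{code}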



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22098) Data loss occurs when multiple tables are join with different bucket_version

2020-09-21 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-22098:

Attachment: (was: join_test.sql)

> Data loss occurs when multiple tables are join with different bucket_version
> 
>
> Key: HIVE-22098
> URL: https://issues.apache.org/jira/browse/HIVE-22098
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Assignee: yongtaoliao
>Priority: Blocker
>  Labels: data-loss, wrongresults
> Attachments: HIVE-22098.1.patch, image-2019-08-12-18-45-15-771.png, 
> join_test.sql, table_a_data.orc, table_b_data.orc, table_c_data.orc
>
>
> When tables with different bucketing versions are joined and the number of 
> reducers is greater than 2, the result is incorrect (*data loss*).
>  *Scenario 1*: Three tables are joined. The intermediate result of joining 
> table_a and table_b is recorded as tmp_a_b. When tmp_a_b is joined with the 
> third table (tables created after Hive 3.0.0 default to bucket_version=2), 
> tmp_a_b is initialized with bucketVersion=-1, and the ReduceSinkOperator 
> joins with bucketVersion=-1. In the init method, the hash algorithm for the 
> join column is selected according to bucketVersion: if bucketVersion = 2 and 
> the operation is not an ACID operation, the new hash algorithm is used; 
> otherwise, the old hash algorithm is used. Because the hash algorithms are 
> inconsistent, rows are allocated to different partitions, and at the Reducer 
> stage rows with the same key cannot be paired, resulting in data loss.
> *Scenario 2*: Create two test tables: create table 
> table_bucketversion_1(col_1 string, col_2 string) TBLPROPERTIES 
> ('bucketing_version'='1'); create table table_bucketversion_2(col_1 string, 
> col_2 string) TBLPROPERTIES ('bucketing_version'='2');
>  when table_bucketversion_1 is joined with table_bucketversion_2, part of 
> the result data is lost because the bucketing versions differ.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23900) Replace Base64 in exec Package

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23900?focusedWorklogId=487366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487366
 ]

ASF GitHub Bot logged work on HIVE-23900:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1297:
URL: https://github.com/apache/hive/pull/1297#issuecomment-696455652


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487366)
Time Spent: 20m  (was: 10m)

> Replace Base64 in exec Package
> --
>
> Key: HIVE-23900
> URL: https://issues.apache.org/jira/browse/HIVE-23900
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23728) Run metastore verification tests during precommit

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23728?focusedWorklogId=487367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487367
 ]

ASF GitHub Bot logged work on HIVE-23728:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1154:
URL: https://github.com/apache/hive/pull/1154#issuecomment-696455704


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487367)
Time Spent: 20m  (was: 10m)

> Run metastore verification tests during precommit
> -
>
> Key: HIVE-23728
> URL: https://issues.apache.org/jira/browse/HIVE-23728
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23877) Hive on Spark incorrect partition pruning ANALYZE TABLE

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23877?focusedWorklogId=487365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487365
 ]

ASF GitHub Bot logged work on HIVE-23877:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1278:
URL: https://github.com/apache/hive/pull/1278#issuecomment-696455677


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487365)
Time Spent: 40m  (was: 0.5h)

> Hive on Spark incorrect partition pruning ANALYZE TABLE
> ---
>
> Key: HIVE-23877
> URL: https://issues.apache.org/jira/browse/HIVE-23877
> Project: Hive
>  Issue Type: Bug
>Reporter: Han
>Assignee: Han
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Partitions are pruned based on the partition specification in the ANALYZE 
> TABLE command and cached in TableSpec.
> When compiling, it is unnecessary to use PartitionPruner.prune() to get the 
> partitions again. Moreover, PartitionPruner cannot prune partitions for the 
> ANALYZE TABLE command, so it returns all partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23870) Optimise multiple text conversions in WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23870?focusedWorklogId=487364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487364
 ]

ASF GitHub Bot logged work on HIVE-23870:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1282:
URL: https://github.com/apache/hive/pull/1282#issuecomment-696455663


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487364)
Time Spent: 50m  (was: 40m)

> Optimise multiple text conversions in 
> WritableHiveCharObjectInspector.getPrimitiveJavaObject / HiveCharWritable
> ---
>
> Key: HIVE-23870
> URL: https://issues.apache.org/jira/browse/HIVE-23870
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: image-2020-07-17-11-31-38-241.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Observed this when creating materialized view.
> [https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableHiveCharObjectInspector.java#L85]
> Same content is converted to Text multiple times.
> !image-2020-07-17-11-31-38-241.png|width=1048,height=936!
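
A minimal sketch of the kind of fix this implies: materialize the Text form 
once and reuse it, rather than re-encoding on every call (a hypothetical 
wrapper, not the actual object-inspector code):

{code:java}
import org.apache.hadoop.io.Text;

class CachedCharValueSketch {
  private final String value;
  private Text text; // materialized lazily, at most once

  CachedCharValueSketch(String value) {
    this.value = value;
  }

  /** Returns the same Text instance on every call after the first. */
  synchronized Text asText() {
    if (text == null) {
      text = new Text(value); // UTF-8 encoding happens only here
    }
    return text;
  }
}
{code}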



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23640) Fix FindBug issues in hive-druid-handler

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23640?focusedWorklogId=487370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487370
 ]

ASF GitHub Bot logged work on HIVE-23640:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1164:
URL: https://github.com/apache/hive/pull/1164


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487370)
Time Spent: 0.5h  (was: 20m)

> Fix FindBug issues in hive-druid-handler
> 
>
> Key: HIVE-23640
> URL: https://issues.apache.org/jira/browse/HIVE-23640
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23637) Fix FindBug issues in hive-cli

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23637?focusedWorklogId=487368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487368
 ]

ASF GitHub Bot logged work on HIVE-23637:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1162:
URL: https://github.com/apache/hive/pull/1162


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487368)
Time Spent: 0.5h  (was: 20m)

> Fix FindBug issues in hive-cli
> --
>
> Key: HIVE-23637
> URL: https://issues.apache.org/jira/browse/HIVE-23637
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23639) Fix FindBug issues in hive-contrib

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23639?focusedWorklogId=487369=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487369
 ]

ASF GitHub Bot logged work on HIVE-23639:
-

Author: ASF GitHub Bot
Created on: 22/Sep/20 00:48
Start Date: 22/Sep/20 00:48
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1163:
URL: https://github.com/apache/hive/pull/1163


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487369)
Time Spent: 0.5h  (was: 20m)

> Fix FindBug issues in hive-contrib
> --
>
> Key: HIVE-23639
> URL: https://issues.apache.org/jira/browse/HIVE-23639
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: spotbugsXml.xml
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?focusedWorklogId=487337=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487337
 ]

ASF GitHub Bot logged work on HIVE-24187:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 23:42
Start Date: 21/Sep/20 23:42
Worklog Time Spent: 10m 
  Work Description: pkumarsinha opened a new pull request #1515:
URL: https://github.com/apache/hive/pull/1515


   …e name on source and destination
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487337)
Remaining Estimate: 0h
Time Spent: 10m

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
> Attachments: HIVE-24187.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, HA is supported only when Source and Destination use different 
> nameservice names. We need to add support for the same nameservice name on 
> Source and Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> the same.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, the nameservice replacement can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
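
A small sketch of the nameservice replacement described above, as a standalone 
helper over URIs (the real implementation operates on Hadoop Path objects and 
is driven by the new configs listed above):

{code:java}
import java.net.URI;
import java.net.URISyntaxException;

final class NameserviceRewriterSketch {
  /**
   * Rewrite hdfs://ns1/... to hdfs://nsRemote/... so that entries in
   * _files refer to the remote alias instead of the nameservice name
   * that both clusters share.
   */
  static String replaceAuthority(String location, String from, String to)
      throws URISyntaxException {
    URI uri = new URI(location);
    if (!"hdfs".equals(uri.getScheme()) || !from.equals(uri.getAuthority())) {
      return location; // leave non-matching locations untouched
    }
    return new URI(uri.getScheme(), to, uri.getPath(), uri.getQuery(),
        uri.getFragment()).toString();
  }

  public static void main(String[] args) throws URISyntaxException {
    System.out.println(replaceAuthority(
        "hdfs://ns1/whLoc/dbName.db/table1", "ns1", "nsRemote"));
    // prints: hdfs://nsRemote/whLoc/dbName.db/table1
  }
}
{code}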



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24187:
--
Labels: pull-request-available  (was: )

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24187.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, HA is supported only when Source and Destination use different 
> nameservice names. We need to add support for the same nameservice name on 
> Source and Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> the same.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, the nameservice replacement can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24187:

Status: Patch Available  (was: Open)

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
> Attachments: HIVE-24187.01.patch
>
>
> Currently, HA is supported only when Source and Destination use different 
> nameservice names. We need to add support for the same nameservice name on 
> Source and Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> the same.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, the nameservice replacement can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24187:

Attachment: HIVE-24187.01.patch

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
> Attachments: HIVE-24187.01.patch
>
>
> Currently, HA is supported only when Source and Destination use different 
> nameservice names. We need to add support for the same nameservice name on 
> Source and Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> the same.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, the nameservice replacement can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-21 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha updated HIVE-24187:

Summary: Handle _files creation for HA config with same nameservice name on 
source and destination  (was: Handle _files creation for HA config with same 
nameservice on source and destination)

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>
> Currently, HA is supported only when Source and Destination use different 
> nameservice names. We need to add support for the same nameservice name on 
> Source and Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> the same.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, the nameservice replacement can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24187) Handle _files creation for HA config with same nameservice on source and destination

2020-09-21 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha reassigned HIVE-24187:
---


> Handle _files creation for HA config with same nameservice on source and 
> destination
> 
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>
> Currently, HA is supported only when Source and Destination use different 
> nameservice names. We need to add support for the same nameservice name on 
> Source and Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be a random name, with corresponding configs for 
> the same.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote the remote nameservice with some random name, for example 
> nsRemote. This is how the command will see the ns w.r.t. source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with 
> nsRemote instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be 
> modified using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs, the nameservice replacement can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName 
> with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName 
> with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487298=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487298
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 21:47
Start Date: 21/Sep/20 21:47
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r492364790



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws 
SemanticException {
 String dbName = stripQuotes(root.getChild(0).getText());
 boolean isExtended = (root.getChildCount() > 1);
 
-HiveTxnManager txnManager = null;
+boolean useNewLocksFormat;
 try {
-  txnManager = 
TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+  HiveTxnManager txnManager = 
TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   Did you check the last commit? Do you have something else in mind?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487298)
Time Spent: 1h 10m  (was: 1h)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with an OutOfMemory error such 
> as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
> at 
> 

[jira] [Resolved] (HIVE-23271) Can't start Hive Interactive Server in HDP 3.1.4 Cluster

2020-09-21 Thread Ankur Tagra (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Tagra resolved HIVE-23271.

Resolution: Fixed

> Can't start Hive Interactive Server in HDP 3.1.4 Cluster
> 
>
> Key: HIVE-23271
> URL: https://issues.apache.org/jira/browse/HIVE-23271
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 3.1.0
> Environment: All nodes have CentOS 7
> Cluster HDP 3.1.4
>Reporter: Gerardo Adrián Aguirre Vivar
>Assignee: Gerardo Adrián Aguirre Vivar
>Priority: Major
>
> Hive Interactive Server is not working. The installation guide has been 
> followed 
> ([https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/performance-tuning/content/hive_prepare_to_tune_performance.html])
>  but when the server tries to start, an error appears. 
>  
> LOGS:
>  
> 2020-04-22T16:43:48,271 INFO [main]: impl.YarnClientImpl 
> (YarnClientImpl.java:submitApplication(306)) - Submitted application 
> application_1587555843754_0015
> 2020-04-22T16:43:48,275 INFO [main]: client.TezClient 
> (TezClient.java:start(404)) - The url to track the Tez Session: 
> http://:8088/proxy/application_1587555843754_0015/
> 2020-04-22T16:43:53,435 INFO [main]: client.TezClient 
> (TezClient.java:getAppMasterStatus(881)) - *{color:#0747a6}Failed to retrieve 
> AM Status via proxy{color}*
> com.google.protobuf.ServiceException: java.io.EOFException: End of File 
> Exception between local host is: 
> "/10.22.39.12"; destination host is: 
> "":33889; : java.io.EOFException; For more details see: 
> http://wiki.apache.org/hadoop/EOFException
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:242)
>  ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>  ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
>  at com.sun.proxy.$Proxy75.getAMStatus(Unknown Source) ~[?:?]
>  at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:874) 
> [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315]
>  at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:1011) 
> [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315]
>  at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:982) 
> [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:536)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:451)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:373)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:236)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startInitialSession(TezSessionPool.java:354)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startUnderInitLock(TezSessionPool.java:166)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.start(TezSessionPool.java:123)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:112)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:855)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:828)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:752) 
> [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1078)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.access$1700(HiveServer2.java:136) 
> [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1346)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1190) 
> [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 

[jira] [Reopened] (HIVE-23271) Can't start Hive Interactive Server in HDP 3.1.4 Cluster

2020-09-21 Thread Ankur Tagra (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Tagra reopened HIVE-23271:


> Can't start Hive Interactive Server in HDP 3.1.4 Cluster
> 
>
> Key: HIVE-23271
> URL: https://issues.apache.org/jira/browse/HIVE-23271
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 3.1.0
> Environment: All nodes have CentOS 7
> Cluster HDP 3.1.4
>Reporter: Gerardo Adrián Aguirre Vivar
>Assignee: Gerardo Adrián Aguirre Vivar
>Priority: Major
>
> Hive Interactive Server is not working. The installation guide has been 
> followed 
> ([https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/performance-tuning/content/hive_prepare_to_tune_performance.html])
>  but when the server tries to start, an error appears. 
>  
> LOGS:
>  
> 2020-04-22T16:43:48,271 INFO [main]: impl.YarnClientImpl 
> (YarnClientImpl.java:submitApplication(306)) - Submitted application 
> application_1587555843754_0015
> 2020-04-22T16:43:48,275 INFO [main]: client.TezClient 
> (TezClient.java:start(404)) - The url to track the Tez Session: 
> http://:8088/proxy/application_1587555843754_0015/
> 2020-04-22T16:43:53,435 INFO [main]: client.TezClient 
> (TezClient.java:getAppMasterStatus(881)) - *{color:#0747a6}Failed to retrieve 
> AM Status via proxy{color}*
> com.google.protobuf.ServiceException: java.io.EOFException: End of File 
> Exception between local host is: 
> "/10.22.39.12"; destination host is: 
> "":33889; : java.io.EOFException; For more details see: 
> http://wiki.apache.org/hadoop/EOFException
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:242)
>  ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>  ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?]
>  at com.sun.proxy.$Proxy75.getAMStatus(Unknown Source) ~[?:?]
>  at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:874) 
> [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315]
>  at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:1011) 
> [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315]
>  at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:982) 
> [tez-api-0.9.1.3.1.4.0-315.jar:0.9.1.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:536)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:451)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:373)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:236)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startInitialSession(TezSessionPool.java:354)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startUnderInitLock(TezSessionPool.java:166)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.start(TezSessionPool.java:123)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:112)
>  [hive-exec-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:855)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:828)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:752) 
> [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1078)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2.access$1700(HiveServer2.java:136) 
> [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1346)
>  [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1190) 
> [hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315]
>  at 

[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487172=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487172
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 18:29
Start Date: 21/Sep/20 18:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r492262911



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws 
SemanticException {
 String dbName = stripQuotes(root.getChild(0).getText());
 boolean isExtended = (root.getChildCount() > 1);
 
-HiveTxnManager txnManager = null;
+boolean useNewLocksFormat;
 try {
-  txnManager = 
TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+  HiveTxnManager txnManager = 
TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   The TxnManager is initialized before query compilation and has session 
scope. The correct way to access it is via SessionState. It looks like 
ShowDbLocksAnalyzer was always creating a new instance of the TxnManager.
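
   A minimal sketch of that access pattern (an illustration only, assuming an 
active session; SessionState.initTxnMgr returns the session's existing manager 
unless the configured manager class has changed):

   {code:java}
   // Reuse the session-scoped TxnManager instead of building a new one on
   // every compilation, then read the flag the analyzer actually needs.
   HiveTxnManager txnManager = SessionState.get().initTxnMgr(conf);
   boolean useNewLocksFormat = txnManager.useNewShowLocksFormat();
   {code}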





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487172)
Time Spent: 1h  (was: 50m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with an OutOfMemory error such 
> as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
> at 
> 

[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487170
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 18:28
Start Date: 21/Sep/20 18:28
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r492262911



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
 
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   The TxnManager is initialized before query compilation and has session 
scope. The correct way to access it is via SessionState. It looks like 
ShowDbLocksAnalyzer was always creating a new instance of the TxnManager.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487170)
Time Spent: 50m  (was: 40m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with an OutOfMemory error such 
> as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
> at 
> 

[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487125
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:48
Start Date: 21/Sep/20 17:48
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492240115



##
File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
   Output:["_col0"]
   TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]
   <-Map 6 [CONTAINS] vectorized, llap
 Reduce Output Operator [RS_45]
   Limit [LIM_44] (rows=1 width=2)
 Number of rows:1
 Select Operator [SEL_43] (rows=1 width=0)
   Output:["_col0"]
   TableScan [TS_29] (rows=1 width=0)
+default@tb2,tb2,Tbl:PARTIAL,Col:COMPLETE

Review comment:
   I confirmed that this is expected. I compared this plan against master 
(with explain.user set to false) and there is no difference in the plan.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487125)
Time Spent: 1h  (was: 50m)

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, partition pruning (compile time) doesn't kick in for EXECUTE 
> statements.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487109&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487109
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:30
Start Date: 21/Sep/20 17:30
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492229758



##
File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out
##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
   Output:["_col0"]
   TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]

Review comment:
   Yeah, I think this is likely a side effect of some changes w.r.t. 
serialization/deserialization, although it is a positive side effect now that 
we have more information in the explain plan.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487109)
Time Spent: 50m  (was: 40m)

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, partition pruning (compile time) doesn't kick in for EXECUTE 
> statements.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199545#comment-17199545
 ] 

Stamatis Zampetakis commented on HIVE-24122:


Great, one less problem to deal with :)

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Assignee: GuangMing Lu
>Priority: Major
> Fix For: 4.0.0
>
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487107&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487107
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:29
Start Date: 21/Sep/20 17:29
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492229067



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;
 
+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map<String, TableScanOperator> topOpsCopy = null;

Review comment:
   The original operator tree's shape is changed when going through physical 
transformations and task generation (I don't know why, though); as a result, 
that tree cannot be used later to regenerate tasks or to re-run physical 
transformations. Therefore we make a copy and cache it after the operator tree 
is generated.
   I will leave a comment.
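
   A rough sketch of the copy-and-cache step (illustrative only; apart from 
SerializationUtilities.cloneOperatorTree, the surrounding names are 
assumptions rather than the committed code):

   {code:java}
   // Clone the TableScanOperator roots right after the operator tree is
   // generated, so a later EXECUTE can re-run physical transformations
   // against an unmodified tree.
   List<Operator<?>> roots = new ArrayList<>(topOps.values());
   List<Operator<?>> cloned = SerializationUtilities.cloneOperatorTree(roots);
   Map<String, TableScanOperator> copy = new LinkedHashMap<>();
   Iterator<String> aliases = topOps.keySet().iterator();
   for (Operator<?> op : cloned) {
     copy.put(aliases.next(), (TableScanOperator) op);
   }
   topOpsCopy = copy;
   {code}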





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487107)
Time Spent: 40m  (was: 0.5h)

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, partition pruning (compile time) doesn't kick in for EXECUTE 
> statements.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=487103&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487103
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:23
Start Date: 21/Sep/20 17:23
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r492225545



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##
@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);
 
-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   That's correct. Normally we should already have a TxnManager at this point 
(though I am not 100% sure), so there is no need to create a new one just to 
obtain the flag. I pushed commit 
https://github.com/apache/hive/pull/1509/commits/297882ee80d52689a9cc1c68da9f7580918439bb
 to try this out.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487103)
Time Spent: 40m  (was: 0.5h)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with an OutOfMemory error such 
> as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at 
> 

[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487101
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:22
Start Date: 21/Sep/20 17:22
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492224561



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
##
@@ -63,19 +63,19 @@
 
   private VectorizationContext taskVectorizationContext;
 
-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
   I actually tried not keeping these fields, but I ran into all sorts of 
issues, such as being unable to serialize/deserialize, or the plan being 
generated without metadata.
   I am not sure whether we need to keep all of these fields or whether we can 
choose selectively; I kept almost all of them in the interest of time. If 
Gopal or Rajesh thinks this may cause a performance issue, I can open a 
follow-up to investigate and choose fields selectively.
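
   A self-contained illustration of the trade-off (plain Java serialization, 
not the Hive classes themselves):

   {code:java}
   import java.io.*;

   // Fields marked "transient" are skipped by Java serialization, so a plan
   // cached in serialized form comes back with them unset; dropping the
   // modifier keeps them in the stream at the cost of a larger payload.
   class ScanState implements Serializable {
     String jobConfSnapshot = "mapred.job.name=q1"; // survives the round trip
     transient boolean inputFileChanged = true;     // reset to false on read

     public static void main(String[] args) throws Exception {
       ByteArrayOutputStream buf = new ByteArrayOutputStream();
       try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
         out.writeObject(new ScanState());
       }
       ScanState back = (ScanState) new ObjectInputStream(
           new ByteArrayInputStream(buf.toByteArray())).readObject();
       // prints: mapred.job.name=q1 / false
       System.out.println(back.jobConfSnapshot + " / " + back.inputFileChanged);
     }
   }
   {code}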





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487101)
Time Spent: 0.5h  (was: 20m)

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, partition pruning (compile time) doesn't kick in for EXECUTE 
> statements.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask

2020-09-21 Thread Marta Kuczora (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199477#comment-17199477
 ] 

Marta Kuczora commented on HIVE-24163:
--

There was a typo in the direct insert path, but it turned out there is more 
than one issue.
The file listing in the Utilities.getFullDPSpecs method was not correct for MM 
tables, nor for ACID tables when direct insert was on: the method returned all 
partitions of these tables, not just the ones affected by the current query. 
Because of this, the lineage information for dynamic-partition inserts into 
such tables was not correct. I compared it against the lineage information for 
inserts into external tables, where only the partitions affected by the query 
are present. The difference is that for an external table the data is first 
written into the staging dir, and when the partitions are listed, that 
directory is checked and contains only the newly inserted data. But for MM 
tables and ACID direct insert the staging dir is skipped, so the table 
directory is checked and everything in it is listed.
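
A rough sketch of the direction this suggests (illustrative only: fs, tableDir, 
writeId and stmtId are assumed to be in scope, and the actual fix in the PR may 
differ): keep only the partition directories that contain a delta written by 
the current query, instead of listing everything under the table directory.

{code:java}
// For MM/ACID direct insert there is no staging dir to consult, so identify
// the partitions touched by this query through its write id: a partition is
// affected only if it holds the matching delta directory.
List<Path> affectedPartitions = new ArrayList<>();
String deltaName = AcidUtils.deltaSubdir(writeId, writeId, stmtId);
for (FileStatus part : fs.listStatus(tableDir)) {
  if (part.isDirectory() && fs.exists(new Path(part.getPath(), deltaName))) {
    affectedPartitions.add(part.getPath());
  }
}
{code}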

> Dynamic Partitioning Insert fail for MM table fail during MoveTask
> --
>
> Key: HIVE-24163
> URL: https://issues.apache.org/jira/browse/HIVE-24163
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajkumar Singh
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> -- DDLs and Query
> {code:java}
> create table `class` (name varchar(8), sex varchar(1), age double precision, 
> height double precision, weight double precision);
> insert into table class values ('RAJ','MALE',28,12,12);
> CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` 
> VARCHAR(1)) PARTITIONED BY(Weight string, Age
> string, Height string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' 
> LINES TERMINATED BY '\012' STORED AS TEXTFILE;
> INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
> `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
> {code}
> it fails during the MoveTask execution:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
> hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
>  is not a directory!
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
>  ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
>  ~[hive-service-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> {code}
> The reason is that the Task writes the fsstat during the FileSinkOperator 
> closing, and HS2 ran the MoveTask to move data into the 

[jira] [Work logged] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24163?focusedWorklogId=487020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487020
 ]

ASF GitHub Bot logged work on HIVE-24163:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 15:58
Start Date: 21/Sep/20 15:58
Worklog Time Spent: 10m 
  Work Description: kuczoram commented on pull request #1507:
URL: https://github.com/apache/hive/pull/1507#issuecomment-696209500


   The file listing in the Utilities.getFullDPSpecs method was not correct for 
MM tables, nor for ACID tables when direct insert was on: the method returned 
all partitions of these tables, not just the ones affected by the current 
query. Because of this, the lineage information for dynamic-partition inserts 
into such tables was not correct. I compared it against the lineage 
information for inserts into external tables, where only the partitions 
affected by the query are present. The difference is that for an external 
table the data is first written into the staging dir, and when the partitions 
are listed, that directory is checked and contains only the newly inserted 
data. But for MM tables and ACID direct insert the staging dir is skipped, so 
the table directory is checked and everything in it is listed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 487020)
Time Spent: 20m  (was: 10m)

> Dynamic Partitioning Insert fail for MM table fail during MoveTask
> --
>
> Key: HIVE-24163
> URL: https://issues.apache.org/jira/browse/HIVE-24163
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajkumar Singh
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> -- DDLs and Query
> {code:java}
> create table `class` (name varchar(8), sex varchar(1), age double precision, 
> height double precision, weight double precision);
> insert into table class values ('RAJ','MALE',28,12,12);
> CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` 
> VARCHAR(1)) PARTITIONED BY(Weight string, Age
> string, Height string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' 
> LINES TERMINATED BY '\012' STORED AS TEXTFILE;
> INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
> `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
> {code}
> it fails during the MoveTask execution:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
> hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
>  is not a directory!
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
>  ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 

[jira] [Commented] (HIVE-24060) When the CBO is false, NPE is thrown by an EXCEPT or INTERSECT execution

2020-09-21 Thread GuangMing Lu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199373#comment-17199373
 ] 

GuangMing Lu commented on HIVE-24060:
-

Hey [~dengzh], such is the case, but hive-1.2.1 allows it, which leads to 
incompatibility problems for some users; do we need to consider that?

> When the CBO is false, NPE is thrown by an EXCEPT or INTERSECT execution
> 
>
> Key: HIVE-24060
> URL: https://issues.apache.org/jira/browse/HIVE-24060
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Hive
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Priority: Major
>
> {code:java}
> set hive.cbo.enable=false;
> create table testtable(idx string, namex string) stored as orc;
> insert into testtable values('123', 'aaa'), ('234', 'bbb');
> explain select a.idx from (select idx,namex from testtable intersect select 
> idx,namex from testtable) a
> {code}
>  The execution throws a NullPointerException:
> {code:java}
> 2020-08-24 15:12:24,261 | WARN  | HiveServer2-Handler-Pool: Thread-345 | 
> Error executing statement:  | 
> org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1155)
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: NullPointerException null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) 
> ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280)
>  ~[hive-service-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
>  ~[hive-service-rpc-3.1.0.jar:3.1.0]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
>  ~[hive-service-rpc-3.1.0.jar:3.1.0]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[libthrift-0.9.3.jar:0.9.3]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> ~[libthrift-0.9.3.jar:0.9.3]
> at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:648)
>  ~[hive-standalone-metastore-3.1.0.jar:3.1.0]
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  ~[libthrift-0.9.3.jar:0.9.3]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_201]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_201]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4367)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4346)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:10576)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10515)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11434)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11291)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11318)
>  ~[hive-exec-3.1.0.jar:3.1.0]
> at 
> 

[jira] [Updated] (HIVE-24186) The aggregate class operation fails when the CBO is false

2020-09-21 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-24186:

Affects Version/s: 3.1.2

> The aggregate class operation fails when the CBO is false
> -
>
> Key: HIVE-24186
> URL: https://issues.apache.org/jira/browse/HIVE-24186
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, SQL
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Priority: Major
>
> {code:java}
> create table table_1
> (
> idx string, 
> namex string
> ) stored as orc;
> create table table_2
> (
> sid string,
> sname string
> )stored as orc;
> set hive.cbo.enable=false;
> explain
> insert into table table_1(idx , namex)
> select t.sid idx, '123' namex 
> from table_2 t
> group by t.sid
> order by 1,2;
> {code}
> Executing the above SQL reports an error, as follows:
> {code:java}
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: SemanticException [Error 10004]: Line 4:7 Invalid table 
> alias or column reference 't': (possible column names are: _col0, _col1)
>     at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) 
> ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) 
> ~[?:?]
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_242]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_242]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_242]
>     at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_242]
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
>  ~[hadoop-common-3.1.1-hw-ei-302001-SNAPSHOT.jar:?]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at com.sun.proxy.$Proxy66.executeStatementAsync(Unknown Source) ~[?:?]
>     at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
>  
> ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
>  
> ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[hive-exec-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 

[jira] [Updated] (HIVE-24186) The aggregate class operation fails when the CBO is false

2020-09-21 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-24186:

Fix Version/s: (was: 3.1.2)
   (was: 3.1.0)

> The aggregate class operation fails when the CBO is false
> -
>
> Key: HIVE-24186
> URL: https://issues.apache.org/jira/browse/HIVE-24186
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, SQL
>Affects Versions: 3.1.0
>Reporter: GuangMing Lu
>Priority: Major
>
> {code:java}
> create table table_1
> (
> idx string, 
> namex string
> ) stored as orc;
> create table table_2
> (
> sid string,
> sname string
> )stored as orc;
> set hive.cbo.enable=false;
> explain
> insert into table table_1(idx , namex)
> select t.sid idx, '123' namex 
> from table_2 t
> group by t.sid
> order by 1,2;
> {code}
> Executing the above SQL reports an error, as follows:
> {code:java}
> org.apache.hive.service.cli.HiveSQLException: Error while compiling 
> statement: FAILED: SemanticException [Error 10004]: Line 4:7 Invalid table 
> alias or column reference 't': (possible column names are: _col0, _col1)
>     at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) 
> ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) 
> ~[?:?]
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_242]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_242]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_242]
>     at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_242]
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
>  ~[hadoop-common-3.1.1-hw-ei-302001-SNAPSHOT.jar:?]
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at com.sun.proxy.$Proxy66.executeStatementAsync(Unknown Source) ~[?:?]
>     at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280)
>  ~[hive-service-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
>  
> ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
>  
> ~[hive-service-rpc-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> ~[hive-exec-3.1.0-hw-ei-302001-SNAPSHOT.jar:3.1.0-hw-ei-302001-SNAPSHOT]
>     at 

[jira] [Reopened] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reopened HIVE-24122:
---

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Assignee: GuangMing Lu
>Priority: Major
> Fix For: 4.0.0
>
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-24122.
---
Resolution: Not A Problem

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Assignee: GuangMing Lu
>Priority: Major
> Fix For: 4.0.0
>
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu reassigned HIVE-24122:
---

Assignee: GuangMing Lu

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Assignee: GuangMing Lu
>Priority: Major
> Fix For: 4.0.0
>
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu resolved HIVE-24122.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Priority: Major
> Fix For: 4.0.0
>
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread GuangMing Lu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199362#comment-17199362
 ] 

GuangMing Lu edited comment on HIVE-24122 at 9/21/20, 12:44 PM:


Hey [~zabetak], thanks for reminding me. I tested this on master and it is OK; 
the reason is that master uses calcite-1.21. After analysis, the problem was 
fixed in Calcite 1.19 or above.


was (Author: luguangming):
Hey [~zabetak], thanks for reminding me. I tested this on master and it is OK; 
the reason is that master uses calcite-1.21.

After analysis, the problem was fixed in Calcite 1.19 or above.

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Priority: Major
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24122) When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong

2020-09-21 Thread GuangMing Lu (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199362#comment-17199362
 ] 

GuangMing Lu commented on HIVE-24122:
-

Hey [~zabetak], thanks for reminding me. I tested this on master and it is OK; 
the reason is that master uses calcite-1.21.

After analysis, the problem was fixed in Calcite 1.19 or above.

> When CBO is enable, CAST(STR as Bigint)IS NOT NULL result is wrong 
> ---
>
> Key: HIVE-24122
> URL: https://issues.apache.org/jira/browse/HIVE-24122
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0, 3.1.2
>Reporter: GuangMing Lu
>Priority: Major
>
> {code:java}
> create  database testdb;
> CREATE TABLE IF NOT EXISTS testdb.z_tab 
> ( 
>     SEARCHWORD    STRING, 
>     COUNT_NUM BIGINT, 
>     WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
> STORED AS TEXTFILE;
> insert into table testdb.z_tab 
> values('hivetest',111,'aaa'),('hivetest2',111,'bbb');
> set hive.cbo.enable=true;
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab;
> SELECT CAST(searchword as bigint) IS NULL FROM testdb.z_tab;
> {code}
> The SQL results for both queries are the same, as follows:
> {noformat}
> +---+
> |  _c0  |
> +---+
> | true  |
> | true  |
> +---+{noformat}
> SELECT CAST(searchword as bigint) IS NOT NULL FROM testdb.z_tab; returns the 
> wrong result
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-21 Thread László Bodor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199355#comment-17199355
 ] 

László Bodor commented on HIVE-24159:
-

PR merged, thanks [~ashutoshc] for the review!

> Kafka storage handler broken in secure environment pt2: short-circuit on 
> non-secure environment
> ---
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized 
> upstream that the kafka qtest fails. Instead of setting up a kerberized 
> environment in qtest (which doesn't seem to be a usual use case; e.g., I 
> haven't seen hive.server2.authentication.kerberos.principal used in *.q 
> files), I managed to make the test pass with a simple 
> UserGroupInformation.isSecurityEnabled() check, which can also be useful for 
> every non-secure environment.
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] 
> tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>   at 
> org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451)
>  ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:343) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1465)
>  [classes/:?]
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1438) 
> [classes/:?]
>   at 
> 

[jira] [Updated] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24159:

Fix Version/s: 4.0.0

> Kafka storage handler broken in secure environment pt2: short-circuit on 
> non-secure environment
> ---
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized 
> upstream that the kafka qtest fails. Instead of setting up a kerberized 
> environment in qtest (which doesn't seem to be a usual use case, e.g. I 
> haven't seen hive.server2.authentication.kerberos.principal used in *.q 
> files), I managed to make the test pass with a simple 
> UserGroupInformation.isSecurityEnabled() check, which can also be useful in 
> every non-secure environment.
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] 
> tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>   at 
> org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451)
>  ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:343) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1465)
>  [classes/:?]
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1438) 
> [classes/:?]
>   at 
> 

[jira] [Resolved] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved HIVE-24159.
-
Resolution: Fixed

> Kafka storage handler broken in secure environment pt2: short-circuit on 
> non-secure environment
> ---
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized 
> upstream that the kafka qtest fails. Instead of setting up a kerberized 
> environment in qtest (which doesn't seem to be a usual use case, e.g. I 
> haven't seen hive.server2.authentication.kerberos.principal used in *.q 
> files), I managed to make the test pass with a simple 
> UserGroupInformation.isSecurityEnabled() check, which can also be useful in 
> every non-secure environment.
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] 
> tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>   at 
> org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451)
>  ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:343) 
> [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1465)
>  [classes/:?]
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1438) 
> [classes/:?]
>   at 
> 

[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=486885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486885
 ]

ASF GitHub Bot logged work on HIVE-24159:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 12:16
Start Date: 21/Sep/20 12:16
Worklog Time Spent: 10m 
  Work Description: abstractdog merged pull request #1495:
URL: https://github.com/apache/hive/pull/1495


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 486885)
Time Spent: 40m  (was: 0.5h)

> Kafka storage handler broken in secure environment pt2: short-circuit on 
> non-secure environment
> ---
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I hadn't realized 
> upstream that the kafka qtest fails. Instead of setting up a kerberized 
> environment in qtest (which doesn't seem to be a usual use case, e.g. I 
> haven't seen hive.server2.authentication.kerberos.principal used in *.q 
> files), I managed to make the test pass with a simple 
> UserGroupInformation.isSecurityEnabled() check, which can also be useful in 
> every non-secure environment.
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] 
> tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>   at 
> org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451)
>  ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) 
> 

[jira] [Work logged] (HIVE-24172) Fix TestMmCompactorOnMr

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24172?focusedWorklogId=486882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486882
 ]

ASF GitHub Bot logged work on HIVE-24172:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 12:14
Start Date: 21/Sep/20 12:14
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1514:
URL: https://github.com/apache/hive/pull/1514


   Setting the execution engine to MR via the driver field 
(driver.getConf().setBoolVar(...)) only affects queries run in setup and teardown.
   Compaction runs using the conf field, so the execution engine needs to be 
set to MR in conf for compaction to pick it up.
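
A hedged sketch of the difference (the driver/conf field names follow the PR 
description; the exact test class is not shown here):

{code:java}
// Sketch of the fix's idea; 'driver' and 'conf' are the test's fields as
// described above (assumed types: Driver and HiveConf in Hive's test code).
void forceMrEngine(org.apache.hadoop.hive.ql.Driver driver,
    org.apache.hadoop.hive.conf.HiveConf conf) {
  // Only affects queries the test issues through the driver (setup/teardown):
  driver.getConf().set("hive.execution.engine", "mr");
  // Compaction reads the test's own conf field, so set it there too so the
  // compactor picks it up:
  conf.set("hive.execution.engine", "mr");
}
{code}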



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 486882)
Remaining Estimate: 0h
Time Spent: 10m

> Fix TestMmCompactorOnMr
> ---
>
> Key: HIVE-24172
> URL: https://issues.apache.org/jira/browse/HIVE-24172
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> test is unstable;
> http://ci.hive.apache.org/job/hive-flaky-check/112/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24172) Fix TestMmCompactorOnMr

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24172:
--
Labels: pull-request-available  (was: )

> Fix TestMmCompactorOnMr
> ---
>
> Key: HIVE-24172
> URL: https://issues.apache.org/jira/browse/HIVE-24172
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> test is unstable;
> http://ci.hive.apache.org/job/hive-flaky-check/112/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24185) Upgrade snappy-java to 1.1.7.5

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24185:
--
Labels: pull-request-available  (was: )

> Upgrade snappy-java to 1.1.7.5
> --
>
> Key: HIVE-24185
> URL: https://issues.apache.org/jira/browse/HIVE-24185
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Bump version to take advantage of perf improvements, glibc compatibility etc.
> https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24185) Upgrade snappy-java to 1.1.7.5

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24185?focusedWorklogId=486880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486880
 ]

ASF GitHub Bot logged work on HIVE-24185:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 11:58
Start Date: 21/Sep/20 11:58
Worklog Time Spent: 10m 
  Work Description: pgaref opened a new pull request #1513:
URL: https://github.com/apache/hive/pull/1513


   Change-Id: I6d314e48f96006f549974d1907a0d6de563d7250
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 486880)
Remaining Estimate: 0h
Time Spent: 10m

> Upgrade snappy-java to 1.1.7.5
> --
>
> Key: HIVE-24185
> URL: https://issues.apache.org/jira/browse/HIVE-24185
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Bump version to take advantage of perf improvements, glibc compatibility etc.
> https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24185) Upgrade snappy-java to 1.1.7.5

2020-09-21 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-24185:
-


> Upgrade snappy-java to 1.1.7.5
> --
>
> Key: HIVE-24185
> URL: https://issues.apache.org/jira/browse/HIVE-24185
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>
> Bump version to take advantage of perf improvements, glibc compatibility etc.
> https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-117-2017-11-30



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24172) Fix TestMmCompactorOnMr

2020-09-21 Thread Karen Coppage (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199335#comment-17199335
 ] 

Karen Coppage commented on HIVE-24172:
--

http://ci.hive.apache.org/job/hive-flaky-check/115/

> Fix TestMmCompactorOnMr
> ---
>
> Key: HIVE-24172
> URL: https://issues.apache.org/jira/browse/HIVE-24172
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Karen Coppage
>Priority: Major
>
> test is unstable;
> http://ci.hive.apache.org/job/hive-flaky-check/112/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=486867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486867
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 11:24
Start Date: 21/Sep/20 11:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r491922128



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##

@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);

-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   Do we create a txnManager instance here just to get the value of the 
useNewLocksFormat flag?
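
An illustrative alternative (an assumption, not the committed fix): the flag 
could be derived from the configured txn manager class without instantiating 
one during compilation.

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class LocksFormatFlagSketch {
  // Sketch only: "hive.txn.manager" (HiveConf.ConfVars.HIVE_TXN_MANAGER) is a
  // real knob, but deriving useNewLocksFormat from it is an assumption.
  static boolean useNewLocksFormat(HiveConf conf) {
    String txnMgrClass = conf.getVar(HiveConf.ConfVars.HIVE_TXN_MANAGER);
    return "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager".equals(txnMgrClass);
  }
}
{code}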





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 486867)
Time Spent: 0.5h  (was: 20m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with an OutOfMemory error such 
> as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
> at 
> 

[jira] [Comment Edited] (HIVE-21964) jdbc handler class cast exception

2020-09-21 Thread chenruotao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199318#comment-17199318
 ] 

chenruotao edited comment on HIVE-21964 at 9/21/20, 10:35 AM:
--

I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;

 


was (Author: chenruotao):
I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;

 

> jdbc handler class cast exception
> -
>
> Key: HIVE-21964
> URL: https://issues.apache.org/jira/browse/HIVE-21964
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 3.1.1
>Reporter: Aloys Zhang
>Priority: Major
>
> Using the hive jdbc handler to query an external mysql data source with a 
> decimal-type column, it throws a class cast exception:
>  
> {code:java}
> 2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at 
> org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at 

[jira] [Comment Edited] (HIVE-21964) jdbc handler class cast exception

2020-09-21 Thread chenruotao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199318#comment-17199318
 ] 

chenruotao edited comment on HIVE-21964 at 9/21/20, 10:35 AM:
--

I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;

 


was (Author: chenruotao):
I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;

 

> jdbc handler class cast exception
> -
>
> Key: HIVE-21964
> URL: https://issues.apache.org/jira/browse/HIVE-21964
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 3.1.1
>Reporter: Aloys Zhang
>Priority: Major
>
> Using the hive jdbc handler to query an external mysql data source with a 
> decimal-type column, it throws a class cast exception:
>  
> {code:java}
> 2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at 
> org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> ... 14 more
> Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast 
> to org.apache.hadoop.hive.common.type.HiveDecimal
> at 
> 

[jira] [Comment Edited] (HIVE-21964) jdbc handler class cast exception

2020-09-21 Thread chenruotao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199318#comment-17199318
 ] 

chenruotao edited comment on HIVE-21964 at 9/21/20, 10:35 AM:
--

I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;

 


was (Author: chenruotao):
I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;

 

> jdbc handler class cast exception
> -
>
> Key: HIVE-21964
> URL: https://issues.apache.org/jira/browse/HIVE-21964
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 3.1.1
>Reporter: Aloys Zhang
>Priority: Major
>
> Using the hive jdbc handler to query an external mysql data source with a 
> decimal-type column, it throws a class cast exception:
>  
> {code:java}
> 2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at 
> org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at 

[jira] [Commented] (HIVE-21964) jdbc handler class cast exception

2020-09-21 Thread chenruotao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199318#comment-17199318
 ] 

chenruotao commented on HIVE-21964:
---

I have the same problem, but not with Decimal.

The DATE and TIMESTAMP types throw the same exception when the job runs, so I 
stopped using the types provided by Hive (org.apache.hadoop.hive.common.type), 
and it worked. The code looks like this:

case DATE:
  // if (rowVal instanceof java.sql.Date) {
  //   LocalDate localDate = ((java.sql.Date) rowVal).toLocalDate();
  //   rowVal = Date.of(localDate.getYear(), localDate.getMonthValue(), localDate.getDayOfMonth());
  // } else {
  //   rowVal = Date.valueOf(rowVal.toString());
  // }
  rowVal = Date.valueOf(rowVal.toString());
  break;
case TIMESTAMP:
  // if (rowVal instanceof java.sql.Timestamp) {
  //   LocalDateTime localDateTime = ((java.sql.Timestamp) rowVal).toLocalDateTime();
  //   rowVal = Timestamp.ofEpochSecond(localDateTime.toEpochSecond(UTC), localDateTime.getNano());
  // } else {
  //   rowVal = Timestamp.valueOf(rowVal.toString());
  // }
  rowVal = Timestamp.valueOf(rowVal.toString());
  break;
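
For context, a self-contained sketch of the workaround above. The comment does 
not show its imports, so this assumes the java.sql types (the point being to 
avoid org.apache.hadoop.hive.common.type):

{code:java}
import java.sql.Date;
import java.sql.Timestamp;

public class JdbcTypeWorkaroundSketch {
  public static void main(String[] args) {
    Object rowVal = "2019-07-08";                  // value read from the JDBC source
    rowVal = Date.valueOf(rowVal.toString());      // java.sql.Date, not Hive's Date
    System.out.println(rowVal);

    rowVal = "2019-07-08 11:11:50";
    rowVal = Timestamp.valueOf(rowVal.toString()); // java.sql.Timestamp
    System.out.println(rowVal);
  }
}
{code}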

 

> jdbc handler class cast exception
> -
>
> Key: HIVE-21964
> URL: https://issues.apache.org/jira/browse/HIVE-21964
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 3.1.1
>Reporter: Aloys Zhang
>Priority: Major
>
> Using the hive jdbc handler to query an external mysql data source with a 
> decimal-type column, it throws a class cast exception:
>  
> {code:java}
> 2019-07-08T11:11:50,424 ERROR [7787918f-3111-4706-a3b3-0097fa1bc117 main] 
> CliDriver: Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2691)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: java.math.BigDecimal cannot be cast to 
> org.apache.hadoop.hive.common.type.HiveDecimal
> at 
> org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:98)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> ... 14 more
> Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast 
> to org.apache.hadoop.hive.common.type.HiveDecimal
> at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaHiveDecimalObjectInspector.getPrimitiveJavaObject(JavaHiveDecimalObjectInspector.java:55)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:329)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
> at 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
> at 
> 

[jira] [Work logged] (HIVE-24179) Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement

2020-09-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24179?focusedWorklogId=486849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486849
 ]

ASF GitHub Bot logged work on HIVE-24179:
-

Author: ASF GitHub Bot
Created on: 21/Sep/20 09:58
Start Date: 21/Sep/20 09:58
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1509:
URL: https://github.com/apache/hive/pull/1509#discussion_r491922128



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/lock/show/ShowDbLocksAnalyzer.java
##

@@ -47,14 +47,16 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String dbName = stripQuotes(root.getChild(0).getText());
     boolean isExtended = (root.getChildCount() > 1);

-    HiveTxnManager txnManager = null;
+    boolean useNewLocksFormat;
     try {
-      txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+      HiveTxnManager txnManager = TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);

Review comment:
   Do we create a txnManager instance here just to get the value of the 
useNewLocksFormat flag?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 486849)
Time Spent: 20m  (was: 10m)

> Memory leak in HS2 DbTxnManager when compiling SHOW LOCKS statement
> ---
>
> Key: HIVE-24179
> URL: https://issues.apache.org/jira/browse/HIVE-24179
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: summary.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The problem can be reproduced by repeatedly executing a SHOW LOCKS statement 
> and monitoring the heap memory of HS2. For a small heap (e.g., 2g) it only 
> takes a few minutes before the server crashes with an OutOfMemory error such 
> as the one shown below.
> {noformat}
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at java.util.Arrays.copyOf(Arrays.java:3332)
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.encodeMessage(ForkedChannelEncoder.j
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.setOutErr(ForkedChannelEncoder.java:
> at 
> org.apache.maven.surefire.booter.ForkedChannelEncoder.stdErr(ForkedChannelEncoder.java:166
> at 
> org.apache.maven.surefire.booter.ForkingRunListener.writeTestOutput(ForkingRunListener.jav
> at 
> org.apache.maven.surefire.report.ConsoleOutputCapture$ForwardingPrintStream.write(ConsoleO
> at 
> org.apache.logging.log4j.core.util.CloseShieldOutputStream.write(CloseShieldOutputStream.j
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.writeToDestination(OutputStream
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flushBuffer(OutputStreamManager
> at 
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(Abst
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutp
> at 
> org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputS
> at 
> org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:12
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(Appender
> at 
> org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:543)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:502)
> at 
> org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:485)
> at 
> 

[jira] [Resolved] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak

2020-09-21 Thread kongxianghe (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kongxianghe resolved HIVE-17462.

Fix Version/s: 1.2.1
   Resolution: Won't Fix

Won't fix.

> hive_1.2.1  hiveserver2  memory leak
> 
>
> Key: HIVE-17462
> URL: https://issues.apache.org/jira/browse/HIVE-17462
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
> Environment: hive  version  1.2.1
>Reporter: gehaijiang
>Assignee: kongxianghe
>Priority: Major
> Fix For: 1.2.1
>
>
> hiveserver2 memory leak
> hive uses third-party UDFs (vs-1.0.2-SNAPSHOT.jar, 
> alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar, and so on)
> lr-x-- 1 data data 64 Sep  5 18:37 964 -> 
> /tmp/9e38cc04-5693-474b-9c7d-bfdd978bcbb4_resources/vs-1.0.2-SNAPSHOT.jar 
> (deleted)
> lr-x-- 1 data data 64 Sep  6 10:41 965 -> 
> /tmp/188bbf2a-d8a5-48a7-81fc-b807f9ff201d_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar
>  (deleted)
> lr-x-- 1 data data 64 Sep  6 17:41 97 -> 
> /home/data/programs/hadoop-2.7.1/share/hadoop/hdfs/lib/jsr305-3.0.0.jar
> lrwx-- 1 data data 64 Sep  5 18:37 975 -> socket:[1318353317]
> lr-x-- 1 data data 64 Sep  6 02:38 977 -> 
> /tmp/64e309dc-352f-4ba4-b871-1aa78fe05945_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar
>  (deleted)
> lr-x-- 1 data data 64 Sep  6 17:41 98 -> 
> /home/data/programs/hadoop-2.7.1/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar
> lrwx-- 1 data data 64 Sep  6 08:40 983 -> socket:[1299459344]
> lr-x-- 1 data data 64 Sep  5 19:37 987 -> 
> /tmp/c3054987-c9c6-468a-8b5c-6e20b1972e0b_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar
>  (deleted)
> lr-x-- 1 data data 64 Sep  6 17:41 99 -> 
> /home/data/programs/hadoop-2.7.1/share/hadoop/hdfs/lib/guava-11.0.2.jar
> lr-x-- 1 data data 64 Sep  6 08:40 994 -> 
> /tmp/fc5c44b3-9bd8-4a32-a39a-66cd44032fee_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar
>  (deleted)
> lr-x-- 1 data data 64 Sep  6 06:39 996 -> 
> /tmp/3b3c2bd6-0a0e-4599-b757-4a048a968457_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar
>  (deleted)
> lr-x-- 1 data data 64 Sep  5 17:36 999 -> 
> /tmp/6ad76494-cdda-430b-b7d0-2213731655a8_resources/alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar
>  (deleted)
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> 20084 data  20   0 13.6g  11g 533m S 62.3  9.2   6619:16 java
> /home/data/programs/jdk/jdk-current/bin/java -Djava.net.preferIPv4Stack=true 
> -Dhadoop.log.dir=/home/data/hadoop/logs -Dhadoop.log.file=hadoop.log 
> -Dhadoop.home.dir=/home/data/programs/hadoop-2.7.1 -Dhadoop.id.str=data 
> -Dhadoop.root.logger=INFO,DRFA 
> -Djava.library.path=/home/data/programs/hadoop-2.7.1/lib/native 
> -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true 
> -XX:+UseConcMarkSweepGC -Xms8g -Xmx8g 
> -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar 
> /home/data/programs/hive-current/lib/hive-service-1.2.1.jar 
> org.apache.hive.service.server.HiveServer2 --hiveconf 
> hive.log.file=hiveserver2.log



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak

2020-09-21 Thread kongxianghe (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kongxianghe reassigned HIVE-17462:
--

Assignee: kongxianghe

> hive_1.2.1  hiveserver2  memory leak
> 
>
> Key: HIVE-17462
> URL: https://issues.apache.org/jira/browse/HIVE-17462
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
> Environment: hive  version  1.2.1
>Reporter: gehaijiang
>Assignee: kongxianghe
>Priority: Major
>
> hiveserver2 memory leak
> hive uses third-party UDFs (vs-1.0.2-SNAPSHOT.jar, 
> alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar, and so on)
> (file descriptor listing and top output elided; identical to the listing quoted earlier in this thread)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak

2020-09-21 Thread kongxianghe (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199204#comment-17199204
 ] 

kongxianghe edited comment on HIVE-17462 at 9/21/20, 7:13 AM:
--

This was fixed in https://issues.apache.org/jira/browse/HIVE-10453.
You may be running hdp-1.2.1000.2.6.1.0 or some other build into which that patch was never picked up. To check, decompile your hive-exec-1.2.1xxx.jar and inspect SessionState.close():

{code}
public void close() throws IOException {
  registry.clear();
  if (txnMgr != null) txnMgr.closeTxnManager();
  JavaUtils.closeClassLoadersTo(conf.getClassLoader(), parentLoader);
  File resourceDir = ...
  // ... (around line 1493 of SessionState.java) ...
    sparkSession = null;
  }

  // this call is the one that may be missing!
  registry.closeCUDFLoaders();
  dropSessionPaths(conf);
}
{code}

In hdp 2.6.1 hive-1.2.1 the registry.closeCUDFLoaders() call is missing, and that likely causes this problem: the per-session UDF classloaders, and the resource jars they hold open, are never released.
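
To make the failure mode concrete, here is a minimal stand-alone Java sketch (not Hive code; the file name is made up) showing that an unclosed URLClassLoader pins its jar's file descriptor, which is exactly why the deleted _resources jars in the listing above still appear under /proc/<pid>/fd:

{code}
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;

// Minimal stand-alone illustration (not Hive code): an unclosed
// URLClassLoader keeps its jar open, so lsof still reports the jar
// as "(deleted)" after the file itself has been removed.
public class ClassLoaderFdLeakDemo {
  public static void main(String[] args) throws Exception {
    Path jar = Files.createTempFile("fake_resources", ".jar");
    try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
      out.putNextEntry(new ZipEntry("marker.txt"));  // one entry keeps the jar valid
      out.write('x');
      out.closeEntry();
    }

    URLClassLoader loader = new URLClassLoader(new URL[] { jar.toUri().toURL() });
    loader.findResource("marker.txt");  // forces the loader to open the jar

    Files.delete(jar);
    // Here `ls -l /proc/<pid>/fd` lists the jar as "(deleted)".
    System.out.println("jar deleted, descriptor still held; pid=" + ProcessHandle.current().pid());

    loader.close();  // the fix: closing the loader releases the descriptor
    System.out.println("loader closed, descriptor released");
  }
}
{code}

SessionState.close() calling registry.closeCUDFLoaders() is the Hive-side counterpart of the loader.close() call above.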



was (Author: kongxianghe):
(previous revision elided; it differed from the current text above only in the wording of the final sentence)

> hive_1.2.1  hiveserver2  memory leak
> 
>
> Key: HIVE-17462
> URL: https://issues.apache.org/jira/browse/HIVE-17462
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
> Environment: hive  version  1.2.1
>Reporter: gehaijiang
>Priority: Major
>
> hiveserver2  memory leak
> hive uses third-party UDF jars (vs-1.0.2-SNAPSHOT.jar, alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar, and so on)
> (file descriptor listing and top output elided; identical to the listing quoted earlier in this thread)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-17462) hive_1.2.1 hiveserver2 memory leak

2020-09-21 Thread kongxianghe (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199204#comment-17199204
 ] 

kongxianghe commented on HIVE-17462:


This was fixed in https://issues.apache.org/jira/browse/HIVE-10453.
You may be running hdp-1.2.1000.2.6.1.0 or some other build into which that patch was never picked up. To check, decompile your hive-exec-1.2.1xxx.jar and inspect SessionState.close():

{code}
public void close() throws IOException {
  registry.clear();
  if (txnMgr != null) txnMgr.closeTxnManager();
  JavaUtils.closeClassLoadersTo(conf.getClassLoader(), parentLoader);
  File resourceDir = ...
  // ... (around line 1493 of SessionState.java) ...
    sparkSession = null;
  }

  // this call is the one that may be missing!
  registry.closeCUDFLoaders();
  dropSessionPaths(conf);
}
{code}

In hdp 2.6.1 hive-1.2.1 the registry.closeCUDFLoaders() call is missing, and that likely causes this problem: the per-session UDF classloaders, and the resource jars they hold open, are never released.
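
If you want to confirm this kind of leak from inside the JVM instead of with lsof, a small Linux-only sketch (again not Hive code) can walk /proc/self/fd and report descriptors whose targets were unlinked; the kernel appends " (deleted)" to such symlink targets:

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Linux-only stand-alone sketch (not Hive code): report open file
// descriptors whose targets have been deleted, like the leaked
// /tmp/<uuid>_resources jars in the lsof output quoted earlier.
public class DeletedFdScan {
  public static void main(String[] args) throws IOException {
    try (Stream<Path> fds = Files.list(Paths.get("/proc/self/fd"))) {
      fds.forEach(fd -> {
        try {
          String target = Files.readSymbolicLink(fd).toString();
          // The kernel suffixes unlinked targets with " (deleted)".
          if (target.endsWith(" (deleted)")) {
            System.out.println(fd.getFileName() + " -> " + target);
          }
        } catch (IOException e) {
          // the descriptor may have been closed while scanning; skip it
        }
      });
    }
  }
}
{code}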

> hive_1.2.1  hiveserver2  memory leak
> 
>
> Key: HIVE-17462
> URL: https://issues.apache.org/jira/browse/HIVE-17462
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
> Environment: hive  version  1.2.1
>Reporter: gehaijiang
>Priority: Major
>
> hiveserver2  memory leak
> hive uses third-party UDF jars (vs-1.0.2-SNAPSHOT.jar, alogdata-1.0.3-SNAPSHOT-jar-with-dependencies.jar, and so on)
> (file descriptor listing and top output elided; identical to the listing quoted earlier in this thread)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24170) Add UDF resources explicitly to the classpath while handling drop function event during load.

2020-09-21 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24170:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master. Thanks for the patch, [~pkumarsinha], and for the review, [~aasha].
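
The patch itself is not quoted in this thread, so the following is only a rough, hedged sketch of the idea using plain JDK classloading: before a DROP FUNCTION event is replayed during replication load, the function's resource jars are added to the classpath explicitly so the UDF class stays resolvable. The names handleDropFunctionEvent and replayDropFunction are hypothetical illustrations, not Hive's actual API:

{code}
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Hedged illustration of the idea behind this fix, JDK-only; the
// method names here are hypothetical, not Hive's actual API.
public class DropFunctionEventSketch {
  static void handleDropFunctionEvent(List<URL> udfResourceJars,
                                      Runnable replayDropFunction) throws Exception {
    ClassLoader parent = Thread.currentThread().getContextClassLoader();
    // Add the UDF's resource jars to the classpath explicitly, so the
    // function class can be resolved while the drop event is replayed.
    try (URLClassLoader withUdfJars =
             new URLClassLoader(udfResourceJars.toArray(new URL[0]), parent)) {
      Thread.currentThread().setContextClassLoader(withUdfJars);
      replayDropFunction.run();
    } finally {
      // Restore the original loader; try-with-resources closes the
      // temporary one so its jar descriptors are released (the
      // HIVE-17462 lesson earlier in this digest).
      Thread.currentThread().setContextClassLoader(parent);
    }
  }
}
{code}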

> Add UDF resources explicitly to the classpath while handling drop function 
> event during load.
> -
>
> Key: HIVE-24170
> URL: https://issues.apache.org/jira/browse/HIVE-24170
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24170.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)