[jira] [Commented] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211557#comment-17211557 ]

Pravin Sinha commented on HIVE-24254:
-------------------------------------

+1

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, HIVE-24254.03.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Status: In Progress  (was: Patch Available)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, HIVE-24254.03.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Attachment: HIVE-24254.03.patch
        Status: Patch Available  (was: In Progress)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, HIVE-24254.03.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Status: In Progress  (was: Patch Available)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aasha Medhi updated HIVE-24254:
-------------------------------
    Attachment: HIVE-24254.02.patch
        Status: Patch Available  (was: In Progress)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
[jira] [Work logged] (HIVE-24254) Remove setOwner call in ReplChangeManager
    [ https://issues.apache.org/jira/browse/HIVE-24254?focusedWorklogId=498829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498829 ]

ASF GitHub Bot logged work on HIVE-24254:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 02:41
            Start Date: 10/Oct/20 02:41
    Worklog Time Spent: 10m

Work Description: aasha commented on pull request #1567:
URL: https://github.com/apache/hive/pull/1567#issuecomment-706472758

   Please find the responses:
   a) Added a test. To enable a successful recycle we need either an ACL set up or the permission check for write operations disabled. I have done the latter, as the ACL set-up was not working with the available APIs. With dfs.permissions.enabled: regardless of whether permissions are on or off, chmod, chgrp, chown and setfacl always check permissions. So the setOwner call was failing without the patch, but the move was successful, as anyone can write to HDFS.
   b) Yes, added as part of the same test. Along with this, a CM Clearer test is also added.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id:     (was: 498829)
    Time Spent: 20m  (was: 10m)

> Remove setOwner call in ReplChangeManager
> -----------------------------------------
>
>                 Key: HIVE-24254
>                 URL: https://issues.apache.org/jira/browse/HIVE-24254
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24254.01.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
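For context, the write-permission toggle discussed in (a) is the stock HDFS setting dfs.permissions.enabled. A minimal hdfs-site.xml sketch for such a test cluster could look like the following (the property name is standard HDFS; the surrounding test wiring is not shown and is an assumption):

```
<configuration>
  <!-- Turn off permission enforcement for regular reads/writes in the test
       cluster, so the recycle's file move succeeds for any user. -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <!-- Note: even with enforcement off, chmod/chgrp/chown/setfacl still verify
       that the caller is the owner or a superuser, which is why the setOwner
       call failed before the patch while the move itself succeeded. -->
</configuration>
```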
[jira] [Work logged] (HIVE-24004) Improve performance for filter hook for superuser path
    [ https://issues.apache.org/jira/browse/HIVE-24004?focusedWorklogId=498816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498816 ]

ASF GitHub Bot logged work on HIVE-24004:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #1373:
URL: https://github.com/apache/hive/pull/1373#issuecomment-706458124

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
    Worklog Id:    (was: 498816)
    Time Spent: 1h  (was: 50m)

> Improve performance for filter hook for superuser path
> ------------------------------------------------------
>
>                 Key: HIVE-24004
>                 URL: https://issues.apache.org/jira/browse/HIVE-24004
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 4.0.0
>            Reporter: Sam An
>            Assignee: Sam An
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> In HiveMetaStoreAuthorizer, the sequence of creating the authorizer can be optimized so that, for a superuser for whom authorization can be skipped, we also skip creating the authorizer.
[jira] [Work logged] (HIVE-8950) Add support in ParquetHiveSerde to create table schema from a parquet file
    [ https://issues.apache.org/jira/browse/HIVE-8950?focusedWorklogId=498812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498812 ]

ASF GitHub Bot logged work on HIVE-8950:
----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] closed pull request #1353:
URL: https://github.com/apache/hive/pull/1353

Issue Time Tracking
-------------------
    Worklog Id:      (was: 498812)
    Time Spent: 0.5h  (was: 20m)

> Add support in ParquetHiveSerde to create table schema from a parquet file
> --------------------------------------------------------------------------
>
>                 Key: HIVE-8950
>                 URL: https://issues.apache.org/jira/browse/HIVE-8950
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashish Singh
>            Assignee: Ashish Singh
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-8950.1.patch, HIVE-8950.10.patch, HIVE-8950.11.patch, HIVE-8950.2.patch, HIVE-8950.3.patch, HIVE-8950.4.patch, HIVE-8950.5.patch, HIVE-8950.6.patch, HIVE-8950.7.patch, HIVE-8950.8.patch, HIVE-8950.9.patch, HIVE-8950.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> PARQUET-76 and PARQUET-47 ask for creating parquet backed tables without having to specify the column names and types. As parquet files store their schema in the footer, it is possible to generate the hive schema from a parquet file's metadata. This will improve the usability of parquet backed tables.
[jira] [Work logged] (HIVE-23977) Consolidate partition fetch to one place
    [ https://issues.apache.org/jira/browse/HIVE-23977?focusedWorklogId=498813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498813 ]

ASF GitHub Bot logged work on HIVE-23977:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] closed pull request #1354:
URL: https://github.com/apache/hive/pull/1354

Issue Time Tracking
-------------------
    Worklog Id:      (was: 498813)
    Time Spent: 0.5h  (was: 20m)

> Consolidate partition fetch to one place
> ----------------------------------------
>
>                 Key: HIVE-23977
>                 URL: https://issues.apache.org/jira/browse/HIVE-23977
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
[jira] [Work logged] (HIVE-23960) Partition with no column statistics leads to unbalanced calls to openTransaction/commitTransaction error during get_partitions_by_names
    [ https://issues.apache.org/jira/browse/HIVE-23960?focusedWorklogId=498815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498815 ]

ASF GitHub Bot logged work on HIVE-23960:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #1343:
URL: https://github.com/apache/hive/pull/1343#issuecomment-706458147

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
    Worklog Id:     (was: 498815)
    Time Spent: 50m  (was: 40m)

> Partition with no column statistics leads to unbalanced calls to openTransaction/commitTransaction error during get_partitions_by_names
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-23960
>                 URL: https://issues.apache.org/jira/browse/HIVE-23960
>             Project: Hive
>          Issue Type: Task
>            Reporter: Pravin Sinha
>            Assignee: Pravin Sinha
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23960.01.patch, HIVE-23960.02.patch, HIVE-23960.03.patch
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> {color:#172b4d}Creating a partition with data and then adding another partition leads to unbalanced calls to open/commit transaction during the get_partitions_by_names call.{color}
> {color:#172b4d}The issue was discovered during a REPL DUMP operation, which uses this HMS call to get the metadata of partitions. This error occurs when there is a partition with no column statistics.{color}
> {color:#172b4d}To reproduce:{color}
> {code:java}
> CREATE TABLE student_part_acid(name string, age int, gpa double) PARTITIONED BY (ds string) STORED AS orc;
> LOAD DATA INPATH '/user/hive/partDir/student_part_acid/ds=20110924' INTO TABLE student_part_acid partition(ds=20110924);
> ALTER TABLE student_part_acid ADD PARTITION (ds=20110925);
> {code}
> Now if we try to perform a REPL DUMP, it fails with the error "Unbalanced calls to open/commit transaction" on the HS2 side.
[jira] [Work logged] (HIVE-22934) Hive server interactive log counters to error stream
    [ https://issues.apache.org/jira/browse/HIVE-22934?focusedWorklogId=498814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498814 ]

ASF GitHub Bot logged work on HIVE-22934:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 10/Oct/20 00:53
            Start Date: 10/Oct/20 00:53
    Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #1200:
URL: https://github.com/apache/hive/pull/1200#issuecomment-706458154

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------
    Worklog Id:      (was: 498814)
    Time Spent: 0.5h  (was: 20m)

> Hive server interactive log counters to error stream
> -----------------------------------------------------
>
>                 Key: HIVE-22934
>                 URL: https://issues.apache.org/jira/browse/HIVE-22934
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Slim Bouguerra
>            Assignee: Ramesh Kumar Thangarajan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-22934.01.patch, HIVE-22934.02.patch, HIVE-22934.03.patch, HIVE-22934.04.patch, HIVE-22934.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive server is logging the console output to the system error stream. This needs to be fixed because, first, we do not roll the file and, second, writing to such a file is sequential and can lead to throttling/poor performance.
> {code}
> -rw-r--r-- 1 hive hadoop 9.5G Feb 26 17:22 hive-server2-interactive.err
> {code}
[jira] [Work logged] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?focusedWorklogId=498780&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498780 ]

ASF GitHub Bot logged work on HIVE-24255:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:58
            Start Date: 09/Oct/20 22:58
    Worklog Time Spent: 10m

Work Description: nareshpr opened a new pull request #1568:
URL: https://github.com/apache/hive/pull/1568

Issue Time Tracking
-------------------
            Worklog Id: (was: 498780)
    Remaining Estimate: 0h
            Time Spent: 10m

> StorageHandler with select-limit query is returning 0 rows
> ----------------------------------------------------------
>
>                 Key: HIVE-24255
>                 URL: https://issues.apache.org/jira/browse/HIVE-24255
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE EXTERNAL TABLE dbs(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
> {code}
> ==> Wrong Result <==
> {code:java}
> set hive.limit.optimize.enable=true;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
> ----------------------------------------------------------------------------------------------
> +------------+----------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+----------------------+-----------+-----------------+-----------------+
> +------------+----------------------+-----------+-----------------+-----------------+
> {code}
> ==> Correct Result <==
> {code:java}
> set hive.limit.optimize.enable=false;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
> ----------------------------------------------------------------------------------------------
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> {code}
[jira] [Updated] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-24255:
----------------------------------
    Labels: pull-request-available  (was: )

> StorageHandler with select-limit query is returning 0 rows
> ----------------------------------------------------------
>
>                 Key: HIVE-24255
>                 URL: https://issues.apache.org/jira/browse/HIVE-24255
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> CREATE EXTERNAL TABLE dbs(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
> {code}
> ==> Wrong Result <==
> {code:java}
> set hive.limit.optimize.enable=true;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
> ----------------------------------------------------------------------------------------------
> +------------+----------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+----------------------+-----------+-----------------+-----------------+
> +------------+----------------------+-----------+-----------------+-----------------+
> {code}
> ==> Correct Result <==
> {code:java}
> set hive.limit.optimize.enable=false;
> select * from dbs limit 1;
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
> ----------------------------------------------------------------------------------------------
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> | 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
> +------------+-----------------------------------------------------+-----------+-----------------+-----------------+
> {code}
[jira] [Work logged] (HIVE-24120) Plugin for external DatabaseProduct in standalone HMS
    [ https://issues.apache.org/jira/browse/HIVE-24120?focusedWorklogId=498767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498767 ]

ASF GitHub Bot logged work on HIVE-24120:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:26
            Start Date: 09/Oct/20 22:26
    Worklog Time Spent: 10m

Work Description: gatorblue commented on a change in pull request #1470:
URL: https://github.com/apache/hive/pull/1470#discussion_r502691897

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java
## @@ -20,71 +20,666 @@
 import java.sql.SQLException;
 import java.sql.SQLTransactionRollbackException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.EnumMap;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;

-/** Database product infered via JDBC. */
-public enum DatabaseProduct {
-  DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, OTHER;
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import com.google.common.base.Preconditions;
+
+/** Database product inferred via JDBC. Encapsulates all SQL logic associated with
+ * the database product.
+ * This class is a singleton, which is instantiated the first time
+ * method determineDatabaseProduct is invoked.
+ * Tests that need to create multiple instances can use the reset method
+ * */
+public class DatabaseProduct implements Configurable {
+  static final private Logger LOG = LoggerFactory.getLogger(DatabaseProduct.class.getName());
+
+  private static enum DbType {DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, CUSTOM, UNDEFINED};
+  public DbType dbType;
+
+  // Singleton instance
+  private static DatabaseProduct theDatabaseProduct;
+
+  Configuration myConf;
+  /**
+   * Protected constructor for singleton class
+   * @param id
+   */
+  protected DatabaseProduct() {}
+
+  public static final String DERBY_NAME = "derby";
+  public static final String SQL_SERVER_NAME = "microsoft sql server";
+  public static final String MYSQL_NAME = "mysql";
+  public static final String POSTGRESQL_NAME = "postgresql";
+  public static final String ORACLE_NAME = "oracle";
+  public static final String UNDEFINED_NAME = "other";
+
   /**
    * Determine the database product type
    * @param productName string to defer database connection
    * @return database product type
    */
-  public static DatabaseProduct determineDatabaseProduct(String productName) throws SQLException {
-    if (productName == null) {
-      return OTHER;
+  public static DatabaseProduct determineDatabaseProduct(String productName, Configuration c) {
+    DbType dbt;
+
+    if (theDatabaseProduct != null) {
+      Preconditions.checkState(theDatabaseProduct.dbType == getDbType(productName));
+      return theDatabaseProduct;
     }
+
+    // This method may be invoked by concurrent connections
+    synchronized (DatabaseProduct.class) {
+
+      if (productName == null) {
+        productName = UNDEFINED_NAME;
+      }
+
+      dbt = getDbType(productName);
+
+      // Check for null again in case of race condition
+      if (theDatabaseProduct == null) {
+        final Configuration conf = c != null ? c : MetastoreConf.newMetastoreConf();
+        // Check if we are using an external database product
+        boolean isExternal = MetastoreConf.getBoolVar(conf, ConfVars.USE_CUSTOM_RDBMS);
+
+        if (isExternal) {
+          // The DatabaseProduct will be created by instantiating an external class via
+          // reflection. The external class can override any method in the current class
+          String className = MetastoreConf.getVar(conf, ConfVars.CUSTOM_RDBMS_CLASSNAME);
+
+          if (className != null) {
+            try {
+              theDatabaseProduct = (DatabaseProduct)
+                  ReflectionUtils.newInstance(Class.forName(className), conf);
+
+              LOG.info(String.format("Using custom RDBMS %s. Overriding DbType: %s", className, dbt));
+              dbt = DbType.CUSTOM;
+            } catch (Exception e) {
+              LOG.warn("Caught exception instantiating custom database product. Reverting to " + dbt, e);
+            }
+          } else {
+            LOG.warn("Unexpected: metastore.use.custom.database.product was set, " +
+                "but metastore.custom.database.product.classname was not. Reverting to " + dbt);
+          }
+        }
+
+        if (theDatabaseProduct == null) {
+          theDatabaseProduct = new DatabaseProduct(); Re
[jira] [Work logged] (HIVE-24120) Plugin for external DatabaseProduct in standalone HMS
    [ https://issues.apache.org/jira/browse/HIVE-24120?focusedWorklogId=498761&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498761 ]

ASF GitHub Bot logged work on HIVE-24120:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 09/Oct/20 22:19
            Start Date: 09/Oct/20 22:19
    Worklog Time Spent: 10m

Work Description: gatorblue commented on a change in pull request #1470:
URL: https://github.com/apache/hive/pull/1470#discussion_r502689641

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java
## @@ -20,71 +20,666 @@
 import java.sql.SQLException;
 import java.sql.SQLTransactionRollbackException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.EnumMap;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;

-/** Database product infered via JDBC. */
-public enum DatabaseProduct {
-  DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, OTHER;
+import org.apache.hadoop.conf.Configurable;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf;
+import org.apache.hadoop.hive.metastore.conf.MetastoreConf.ConfVars;
+import org.apache.hadoop.util.ReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import com.google.common.base.Preconditions;
+
+/** Database product inferred via JDBC. Encapsulates all SQL logic associated with
+ * the database product.
+ * This class is a singleton, which is instantiated the first time
+ * method determineDatabaseProduct is invoked.
+ * Tests that need to create multiple instances can use the reset method
+ * */
+public class DatabaseProduct implements Configurable {
+  static final private Logger LOG = LoggerFactory.getLogger(DatabaseProduct.class.getName());
+
+  private static enum DbType {DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, CUSTOM, UNDEFINED};
+  public DbType dbType;
+
+  // Singleton instance
+  private static DatabaseProduct theDatabaseProduct;
+
+  Configuration myConf;
+  /**
+   * Protected constructor for singleton class
+   * @param id
+   */
+  protected DatabaseProduct() {}
+
+  public static final String DERBY_NAME = "derby";
+  public static final String SQL_SERVER_NAME = "microsoft sql server";
+  public static final String MYSQL_NAME = "mysql";
+  public static final String POSTGRESQL_NAME = "postgresql";
+  public static final String ORACLE_NAME = "oracle";
+  public static final String UNDEFINED_NAME = "other";
+
   /**
    * Determine the database product type
    * @param productName string to defer database connection
    * @return database product type
    */
-  public static DatabaseProduct determineDatabaseProduct(String productName) throws SQLException {
-    if (productName == null) {
-      return OTHER;
+  public static DatabaseProduct determineDatabaseProduct(String productName, Configuration c) {
+    DbType dbt;
+
+    if (theDatabaseProduct != null) {
+      Preconditions.checkState(theDatabaseProduct.dbType == getDbType(productName));
+      return theDatabaseProduct;
     }
+
+    // This method may be invoked by concurrent connections
+    synchronized (DatabaseProduct.class) {
+
+      if (productName == null) {
+        productName = UNDEFINED_NAME;
+      }
+
+      dbt = getDbType(productName);
+
+      // Check for null again in case of race condition
+      if (theDatabaseProduct == null) {
+        final Configuration conf = c != null ? c : MetastoreConf.newMetastoreConf();
+        // Check if we are using an external database product
+        boolean isExternal = MetastoreConf.getBoolVar(conf, ConfVars.USE_CUSTOM_RDBMS);
+
+        if (isExternal) {
+          // The DatabaseProduct will be created by instantiating an external class via
+          // reflection. The external class can override any method in the current class
+          String className = MetastoreConf.getVar(conf, ConfVars.CUSTOM_RDBMS_CLASSNAME);
+
+          if (className != null) {
+            try {
+              theDatabaseProduct = (DatabaseProduct)
+                  ReflectionUtils.newInstance(Class.forName(className), conf);
+
+              LOG.info(String.format("Using custom RDBMS %s. Overriding DbType: %s", className, dbt));
+              dbt = DbType.CUSTOM;
+            } catch (Exception e) {
+              LOG.warn("Caught exception instantiating custom database product. Reverting to " + dbt, e);

Review comment:
   I changed the code to throw a RuntimeException instead. This method is called in a few places where it's not clear what to do with a regular Exception. Hope this makes sense.
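The hunk quoted above combines two patterns: a double-checked-locking singleton and an override class loaded by reflection, with the review settling on throwing a RuntimeException when the custom class cannot be instantiated. A stripped-down, self-contained sketch of that combination follows; the class and method names (ProductRegistry, quoteIdentifier, reset) are invented for illustration and are not Hive's actual API.

```java
// Minimal sketch of a lazily-created singleton that can be replaced by a
// user-supplied subclass loaded via reflection (names are illustrative).
class ProductRegistry {
  // volatile so the fast path can read the instance without locking
  private static volatile ProductRegistry instance;

  protected ProductRegistry() {}

  // Example of overridable, product-specific behavior a subclass could change.
  public String quoteIdentifier(String id) {
    return "\"" + id + "\"";
  }

  public static ProductRegistry get(String customClassName) {
    ProductRegistry local = instance;
    if (local != null) {
      return local;          // fast path: already created
    }
    // Slow path: concurrent callers race to here; exactly one creates it.
    synchronized (ProductRegistry.class) {
      if (instance == null) {
        if (customClassName != null) {
          try {
            instance = (ProductRegistry) Class.forName(customClassName)
                .getDeclaredConstructor().newInstance();
          } catch (ReflectiveOperationException e) {
            // Fail loudly rather than silently reverting to the default,
            // mirroring the review outcome above.
            throw new RuntimeException(
                "Cannot instantiate custom product class " + customClassName, e);
          }
        } else {
          instance = new ProductRegistry();
        }
      }
      return instance;
    }
  }

  // Visible for tests, mirroring the "reset" method mentioned in the javadoc.
  static void reset() {
    instance = null;
  }
}
```

A caller would obtain the shared instance with `ProductRegistry.get(null)` (or pass a class name read from configuration); repeated calls return the same object.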
[jira] [Updated] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-24255:
------------------------------
    Description:
{code:java}
CREATE EXTERNAL TABLE dbs(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
{code}
==> Wrong Result <==
{code:java}
set hive.limit.optimize.enable=true;
select * from dbs limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
{code}
==> Correct Result <==
{code:java}
set hive.limit.optimize.enable=false;
select * from dbs limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
{code}

  was:
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
++--
[jira] [Updated] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
    [ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naresh P R updated HIVE-24255:
------------------------------
    Description:
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| 1          | hdfs://abcd:8020/warehouse/tablespace/managed/hive  | default   | public          | ROLE            |
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
{code}

  was:
{code:java}
CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name string, owner_name string, owner_type string)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`');
==> Wrong Result <==
set hive.limit.optimize.enable=true;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      0          0        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 0.91 s
----------------------------------------------------------------------------------------------
+------------+----------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri  | dbs.name  | dbs.owner_name  | dbs.owner_type  |
+------------+----------------------+-----------+-----------------+-----------------+
+------------+----------------------+-----------+-----------------+-----------------+
==> Correct Result <==
set hive.limit.optimize.enable=false;
select * from test_table limit 1;
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 4.11 s
----------------------------------------------------------------------------------------------
+------------+-----------------------------------------------------+-----------+-----------------+-----------------+
| dbs.db_id  | dbs.db_location_uri                                 | dbs.name  | dbs.owner_name  | dbs.owner_type  |
++---
[jira] [Assigned] (HIVE-24255) StorageHandler with select-limit query is returning 0 rows
[ https://issues.apache.org/jira/browse/HIVE-24255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R reassigned HIVE-24255: - > StorageHandler with select-limit query is returning 0 rows > -- > > Key: HIVE-24255 > URL: https://issues.apache.org/jira/browse/HIVE-24255 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > > > {code:java} > CREATE EXTERNAL TABLE test_table(db_id bigint, db_location_uri string, name > string, owner_name string, owner_type string) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ('hive.sql.database.type'='METASTORE', 'hive.sql.query'='SELECT > `DB_ID`, `DB_LOCATION_URI`, `NAME`, `OWNER_NAME`, `OWNER_TYPE` FROM `DBS`'); > ==> Wrong Result <== > set hive.limit.optimize.enable=true; > select * from test_table limit 1; > -- > VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > -- > Map 1 .. container SUCCEEDED 0 0 0 0 0 0 > -- > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 0.91 s > -- > ++--+---+-+-+ > | dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | > dbs.owner_type | > ++--+---+-+-+ > ++--+---+-+-+ > ==> Correct Result <== > set hive.limit.optimize.enable=false; > select * from test_table limit 1; > -- > VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > -- > Map 1 .. container SUCCEEDED 1 1 0 0 0 0 > -- > VERTICES: 01/01 [==>>] 100% ELAPSED TIME: 4.11 s > -- > +++---+-+-+ > | dbs.db_id | dbs.db_location_uri | dbs.name | dbs.owner_name | > dbs.owner_type | > +++---+-+-+ > | 1 | hdfs://abcd:8020/warehouse/tablespace/managed/hive | default | public | > ROLE | > {code} > +++---+-+-+ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque
[ https://issues.apache.org/jira/browse/HIVE-23700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211384#comment-17211384 ] Cole Mackenzie commented on HIVE-23700: --- I am also receiving this error with the following dependencies when building a fat jar. {code:java} <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-sql_2.12</artifactId> <version>3.0.1</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-hive_2.12</artifactId> <version>3.0.1</version> </dependency> {code} Stacktrace: {code:java} Caused by: java.lang.IllegalArgumentException: URI is not hierarchical at java.io.File.<init>(File.java:420) at org.apache.hadoop.hive.conf.HiveConf.findConfigFile(HiveConf.java:176) at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:145) ... 58 common frames omitted {code} > HiveConf static initialization fails when JAR URI is opaque > --- > > Key: HIVE-23700 > URL: https://issues.apache.org/jira/browse/HIVE-23700 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.7 >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-23700.1.patch > > Original Estimate: 120h > Time Spent: 2h > Remaining Estimate: 118h > > HiveConf static initialization fails when the jar URI is opaque, for example > when it is embedded as a fat jar in a Spring Boot application. Initialization > of the HiveConf static block then fails and the HiveConf class does not get > classloaded. The opaque URI in my case looks like this: > _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_ > HiveConf#findConfigFile should be able to handle the `IllegalArgumentException` > thrown when the jar `URI` is passed to `File`. > To surface this issue, three conditions need to be met: > 1. hive-site.xml should not be on the classpath > 2. hive-site.xml should not be on "HIVE_CONF_DIR" > 3. hive-site.xml should not be on "HIVE_HOME" -- This message was sent by Atlassian Jira (v8.3.4#803005)
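The "URI is not hierarchical" error above comes from the `java.io.File(URI)` constructor, which rejects any opaque (non-hierarchical) URI before it even checks the scheme — and the nested `jar:file:...!/...!/` URIs produced by fat-jar classloaders are opaque by definition. A minimal, self-contained sketch of the defensive handling the report asks for (the helper name `toFileOrNull` is illustrative, not Hive's actual API):

```java
import java.io.File;
import java.net.URI;

public class OpaqueUriDemo {
    // Hypothetical helper mirroring the proposed fix: return null instead of
    // propagating IllegalArgumentException for URIs File(URI) cannot handle.
    public static File toFileOrNull(URI uri) {
        try {
            return new File(uri);
        } catch (IllegalArgumentException e) {
            // Opaque URIs (e.g. nested jar: URIs inside a fat jar) are not
            // hierarchical, so the File(URI) constructor rejects them.
            return null;
        }
    }

    public static void main(String[] args) {
        URI opaque = URI.create(
            "jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/");
        URI hierarchical = URI.create("file:/etc/hive/conf/hive-site.xml");

        System.out.println(toFileOrNull(opaque));        // null
        System.out.println(toFileOrNull(hierarchical));  // /etc/hive/conf/hive-site.xml
    }
}
```

With a guard like this, a config-file search could simply skip an opaque classpath entry and fall through to the next search location (HIVE_CONF_DIR, HIVE_HOME) instead of failing the whole static initialization.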
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24254: -- Labels: pull-request-available (was: ) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24254.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?focusedWorklogId=498713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498713 ] ASF GitHub Bot logged work on HIVE-24254: - Author: ASF GitHub Bot Created on: 09/Oct/20 19:32 Start Date: 09/Oct/20 19:32 Worklog Time Spent: 10m Work Description: aasha opened a new pull request #1567: URL: https://github.com/apache/hive/pull/1567 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498713) Remaining Estimate: 0h Time Spent: 10m > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Attachment: HIVE-24254.01.patch Status: Patch Available (was: In Progress) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Status: In Progress (was: Patch Available) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Attachment: (was: HIVE-24254.01.patch) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24254: --- Attachment: HIVE-24254.01.patch Status: Patch Available (was: In Progress) > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24254 started by Aasha Medhi. -- > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24254.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24208) LLAP: query job stuck due to race conditions
[ https://issues.apache.org/jira/browse/HIVE-24208?focusedWorklogId=498668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498668 ] ASF GitHub Bot logged work on HIVE-24208: - Author: ASF GitHub Bot Created on: 09/Oct/20 17:09 Start Date: 09/Oct/20 17:09 Worklog Time Spent: 10m Work Description: bymm commented on pull request #1534: URL: https://github.com/apache/hive/pull/1534#issuecomment-706298999 Unittest is added. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498668) Time Spent: 0.5h (was: 20m) > LLAP: query job stuck due to race conditions > > > Key: HIVE-24208 > URL: https://issues.apache.org/jira/browse/HIVE-24208 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.4 >Reporter: Yuriy Baltovskyy >Assignee: Yuriy Baltovskyy >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > When issuing an LLAP query, sometimes the TEZ job on LLAP server never ends > and it never returns the data reader. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24246: -- Labels: pull-request-available (was: ) > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24246.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24246: --- Attachment: HIVE-24246.01.patch Status: Patch Available (was: In Progress) > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24246.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24246 started by Aasha Medhi. -- > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24246.01.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name
[ https://issues.apache.org/jira/browse/HIVE-24246?focusedWorklogId=498626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498626 ] ASF GitHub Bot logged work on HIVE-24246: - Author: ASF GitHub Bot Created on: 09/Oct/20 15:41 Start Date: 09/Oct/20 15:41 Worklog Time Spent: 10m Work Description: aasha opened a new pull request #1566: URL: https://github.com/apache/hive/pull/1566 …esource name ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498626) Remaining Estimate: 0h Time Spent: 10m > Fix for Ranger Deny policy overriding policy with same resource name > - > > Key: HIVE-24246 > URL: https://issues.apache.org/jira/browse/HIVE-24246 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Attachments: HIVE-24246.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures
[ https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=498624&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498624 ] ASF GitHub Bot logged work on HIVE-24217: - Author: ASF GitHub Bot Created on: 09/Oct/20 15:33 Start Date: 09/Oct/20 15:33 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1542: URL: https://github.com/apache/hive/pull/1542#discussion_r502512642 ## File path: standalone-metastore/metastore-server/src/main/resources/package.jdo ## @@ -1549,6 +1549,83 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Review comment: > I'm not sure if they never participate in a query. If one wants to discover the stored procedures which are currently stored in a DB and find out on what data they operate they would need to do some clumsy string manipulations on the signature. I believe you are thinking about `information_schema` stuff - its not set in stone that we have to get all that data from the metastore db - for this case we might add a few UDFs parameter info into an array or something ; so we will still store simple things in the metastore - but we could transform it into more readable in. > Considering that other DB engines also store these information separately I would like to keep it as it is for now and see how it works in practice. Later on when we have multi language support we can revisit this issue. yes it might be..but it would be better to revisit stuff like this if its really needed; and not after we have introduced "something" which later we should care for even if we don't want to I still think there will be no real benefit of "storing it" in a decomposed manner - it will be harder to go forward in case stuff changes - and right now will not use it for anything ; so let's remove it..and add it only if there is a real need for it. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498624) Time Spent: 1h 40m (was: 1.5h) > HMS storage backend for HPL/SQL stored procedures > - > > Key: HIVE-24217 > URL: https://issues.apache.org/jira/browse/HIVE-24217 > Project: Hive > Issue Type: Bug > Components: Hive, hpl/sql, Metastore >Reporter: Attila Magyar >Assignee: Attila Magyar >Priority: Major > Labels: pull-request-available > Attachments: HPL_SQL storedproc HMS storage.pdf > > Time Spent: 1h 40m > Remaining Estimate: 0h > > HPL/SQL procedures are currently stored in text files. The goal of this Jira > is to implement a Metastore backend for storing and loading these procedures. > This is an incremental step towards having fully capable stored procedures in > Hive. > > See the attached design for more information. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection
[ https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211074#comment-17211074 ] Denys Kuzmenko commented on HIVE-24211: --- Merged to master. Thank you for the review, [~pvarga] and [~pvary]! > Replace Snapshot invalidate logic with WriteSet check for txn conflict > detection > > > Key: HIVE-24211 > URL: https://issues.apache.org/jira/browse/HIVE-24211 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > *Issue with concurrent writes on partitioned table:* > Concurrent writes on different partitions should execute in parallel without > issues. They acquire a shared lock on table level and exclusive write on > partition level (hive.txn.xlock.write=true). > However there is a problem with the Snapshot validation. It compares valid > writeIds seen by current transaction, recorded before locking, with the > actual list of writeIds. The Issue is that writeId in Snapshot has no > information on partition, meaning that concurrent writes to different > partitions would be seen as writes to the same non-partitioned table causing > Snapshot to be obsolete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection
[ https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko resolved HIVE-24211. --- Resolution: Fixed > Replace Snapshot invalidate logic with WriteSet check for txn conflict > detection > > > Key: HIVE-24211 > URL: https://issues.apache.org/jira/browse/HIVE-24211 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > *Issue with concurrent writes on partitioned table:* > Concurrent writes on different partitions should execute in parallel without > issues. They acquire a shared lock on table level and exclusive write on > partition level (hive.txn.xlock.write=true). > However there is a problem with the Snapshot validation. It compares valid > writeIds seen by current transaction, recorded before locking, with the > actual list of writeIds. The Issue is that writeId in Snapshot has no > information on partition, meaning that concurrent writes to different > partitions would be seen as writes to the same non-partitioned table causing > Snapshot to be obsolete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection
[ https://issues.apache.org/jira/browse/HIVE-24211?focusedWorklogId=498609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498609 ] ASF GitHub Bot logged work on HIVE-24211: - Author: ASF GitHub Bot Created on: 09/Oct/20 15:08 Start Date: 09/Oct/20 15:08 Worklog Time Spent: 10m Work Description: deniskuzZ merged pull request #1533: URL: https://github.com/apache/hive/pull/1533 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498609) Time Spent: 50m (was: 40m) > Replace Snapshot invalidate logic with WriteSet check for txn conflict > detection > > > Key: HIVE-24211 > URL: https://issues.apache.org/jira/browse/HIVE-24211 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > *Issue with concurrent writes on partitioned table:* > Concurrent writes on different partitions should execute in parallel without > issues. They acquire a shared lock on table level and exclusive write on > partition level (hive.txn.xlock.write=true). > However there is a problem with the Snapshot validation. It compares valid > writeIds seen by current transaction, recorded before locking, with the > actual list of writeIds. The Issue is that writeId in Snapshot has no > information on partition, meaning that concurrent writes to different > partitions would be seen as writes to the same non-partitioned table causing > Snapshot to be obsolete. -- This message was sent by Atlassian Jira (v8.3.4#803005)
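The partition-level problem described in HIVE-24211 can be illustrated with a toy model (all names here are hypothetical — this is not Hive's actual TxnHandler code): when committed writeIds are tracked per table only, a commit to *any* partition invalidates every concurrent writer's snapshot of that table, even when the writes never touch the same partition.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotConflictSketch {
    // Toy model: committed writeIds keyed by table name only, with no
    // partition information — mirroring the limitation in the description.
    public static final Map<String, Long> committedWriteIds = new HashMap<>();

    public static long snapshot(String table) {
        return committedWriteIds.getOrDefault(table, 0L);
    }

    public static void commitWrite(String table) {
        committedWriteIds.merge(table, 1L, Long::sum);
    }

    public static boolean snapshotStillValid(String table, long snapshotWriteId) {
        return committedWriteIds.getOrDefault(table, 0L) == snapshotWriteId;
    }

    public static void main(String[] args) {
        // Txn A records its snapshot before acquiring locks.
        long txnASnapshot = snapshot("t");
        // Txn B writes a *different* partition of t and commits first.
        commitWrite("t");
        // Txn A's snapshot now looks obsolete even though the two writes
        // were on disjoint partitions — a false conflict.
        System.out.println(snapshotStillValid("t", txnASnapshot)); // false
    }
}
```

This is why the fix replaces the snapshot-invalidation check with a WriteSet comparison, which does carry enough information to distinguish disjoint writes.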
[jira] [Commented] (HIVE-21737) Upgrade Avro to version 1.10.0
[ https://issues.apache.org/jira/browse/HIVE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211061#comment-17211061 ] Chinna Rao Lalam commented on HIVE-21737: - Hi [~iemejia], Verified this patch and found these 2 test failures with the below exception {quote}avro_deserialize_map_null.q parquet_map_null.q {quote} {quote}Failed with exception java.io.IOException:org.apache.avro.AvroTypeException: Invalid default for field avreau_col_1: null not a [] {quote} It looks like these exceptions are due to a backward-compatibility break in the new Avro version. https://issues.apache.org/jira/browse/AVRO-2817 We tried setting *Schema.Parser.setValidateDefaults(false)* to turn off default validation, e.g. in org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils#getSchemaFor(java.io.File), but it did not work. [~iemejia] any idea/workaround for this issue? > Upgrade Avro to version 1.10.0 > -- > > Key: HIVE-21737 > URL: https://issues.apache.org/jira/browse/HIVE-21737 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ismaël Mejía >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Attachments: 0001-HIVE-21737-Bump-Apache-Avro-to-1.9.2.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Avro >= 1.9.x brings a lot of fixes, including a leaner version of Avro without > Jackson in the public API and without Guava as a dependency. Worth the update. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24236) Connection leak in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-24236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen resolved HIVE-24236. - Fix Version/s: 4.0.0 Resolution: Fixed > Connection leak in TxnHandler > - > > Key: HIVE-24236 > URL: https://issues.apache.org/jira/browse/HIVE-24236 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > We see failures in QE tests with cannot allocate connections errors. The > exception stack like following: > {noformat} > 2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler > (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in > heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, > general error (SQLState=null, ErrorCode=0) > 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable > to select from transaction database > org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general > error > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) > at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) > at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112) > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) > ... 
29 more > ) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > {noformat} > and > {noformat} > Caused by: java.util.NoSuchElementException: Timeout waiting for idle object > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134) > at
[jira] [Work started] (HIVE-24236) Connection leak in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-24236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24236 started by Yongzhi Chen. --- > Connection leak in TxnHandler > - > > Key: HIVE-24236 > URL: https://issues.apache.org/jira/browse/HIVE-24236 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > We see failures in QE tests with cannot allocate connections errors. The > exception stack like following: > {noformat} > 2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler > (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in > heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, > general error (SQLState=null, ErrorCode=0) > 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable > to select from transaction database > org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general > error > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) > at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) > at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112) > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) > ... 
29 more > ) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > {noformat} > and > {noformat} > Caused by: java.util.NoSuchElementException: Timeout waiting for idle object > at > org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134) > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(Po
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498535 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:15 Start Date: 09/Oct/20 14:15 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706119583 @kasakrisz @kgyrtkirk The flaky test runs successfully: http://ci.hive.apache.org/job/hive-flaky-check/126/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498535) Time Spent: 1h 20m (was: 1h 10m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24254) Remove setOwner call in ReplChangeManager
[ https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi reassigned HIVE-24254: -- > Remove setOwner call in ReplChangeManager > - > > Key: HIVE-24254 > URL: https://issues.apache.org/jira/browse/HIVE-24254 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called
[ https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=498499&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498499 ] ASF GitHub Bot logged work on HIVE-21052: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:12 Start Date: 09/Oct/20 14:12 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #1548: URL: https://github.com/apache/hive/pull/1548#discussion_r501713463 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java ## @@ -853,6 +857,273 @@ public void majorCompactAfterAbort() throws Exception { Lists.newArrayList(5, 6), 1); } + @Test + public void testCleanAbortCompactAfterAbortTwoPartitions() throws Exception { +String dbName = "default"; +String tblName = "cws"; + +HiveStreamingConnection connection1 = prepareTableTwoPartitionsAndConnection(dbName, tblName, 1); +HiveStreamingConnection connection2 = prepareTableTwoPartitionsAndConnection(dbName, tblName, 1); + +connection1.beginTransaction(); +connection1.write("1,1".getBytes()); +connection1.write("2,2".getBytes()); +connection1.abortTransaction(); + +connection2.beginTransaction(); +connection2.write("1,3".getBytes()); +connection2.write("2,3".getBytes()); +connection2.write("3,3".getBytes()); +connection2.abortTransaction(); + +assertAndCompactCleanAbort(dbName, tblName); + +connection1.close(); +connection2.close(); + } + + @Test + public void testCleanAbortCompactAfterAbort() throws Exception { +String dbName = "default"; +String tblName = "cws"; + +// Create three folders with two different transactions +HiveStreamingConnection connection1 = prepareTableAndConnection(dbName, tblName, 1); +HiveStreamingConnection connection2 = prepareTableAndConnection(dbName, tblName, 1); + +connection1.beginTransaction(); +connection1.write("1,1".getBytes()); +connection1.write("2,2".getBytes()); +connection1.abortTransaction(); + +connection2.beginTransaction(); 
+connection2.write("1,3".getBytes()); +connection2.write("2,3".getBytes()); +connection2.write("3,3".getBytes()); +connection2.abortTransaction(); + +assertAndCompactCleanAbort(dbName, tblName); + +connection1.close(); +connection2.close(); + } + + private void assertAndCompactCleanAbort(String dbName, String tblName) throws Exception { +IMetaStoreClient msClient = new HiveMetaStoreClient(conf); +TxnStore txnHandler = TxnUtils.getTxnStore(conf); +Table table = msClient.getTable(dbName, tblName); +FileSystem fs = FileSystem.get(conf); +FileStatus[] stat = +fs.listStatus(new Path(table.getSd().getLocation())); +if (3 != stat.length) { + Assert.fail("Expecting three directories corresponding to three partitions, FileStatus[] stat " + Arrays.toString(stat)); +} + +int count = TxnDbUtil.countQueryAgent(conf, "select count(*) from TXN_COMPONENTS where TC_OPERATION_TYPE='p'"); +// We should have two rows corresponding to the two aborted transactions +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from TXN_COMPONENTS"), 2, count); + +runInitiator(conf); +count = TxnDbUtil.countQueryAgent(conf, "select count(*) from COMPACTION_QUEUE where CQ_TYPE='p'"); +// Only one job is added to the queue per table. This job corresponds to all the entries for a particular table +// with rows in TXN_COMPONENTS +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from COMPACTION_QUEUE"), 1, count); + +ShowCompactResponse rsp = txnHandler.showCompact(new ShowCompactRequest()); +Assert.assertEquals(1, rsp.getCompacts().size()); +Assert.assertEquals(TxnStore.CLEANING_RESPONSE, rsp.getCompacts().get(0).getState()); +Assert.assertEquals("cws", rsp.getCompacts().get(0).getTablename()); +Assert.assertEquals(CompactionType.CLEAN_ABORTED, +rsp.getCompacts().get(0).getType()); + +runCleaner(conf); + +// After the cleaner runs TXN_COMPONENTS and COMPACTION_QUEUE should have zero rows, also the folders should have been deleted. 
+count = TxnDbUtil.countQueryAgent(conf, "select count(*) from TXN_COMPONENTS"); +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from TXN_COMPONENTS"), 0, count); + +count = TxnDbUtil.countQueryAgent(conf, "select count(*) from COMPACTION_QUEUE"); +Assert.assertEquals(TxnDbUtil.queryToString(conf, "select * from COMPACTION_QUEUE"), 0, count); + +RemoteIterator<LocatedFileStatus> it = +fs.listFiles(new Path(table.getSd().getLocation()), true); +if (it.hasNext()) { + Assert.fail("Expecting compaction to have cleaned the directories, FileStatus[] stat " + Arrays.toString(stat)); Review comment: I think this assert is quite misleading
[jira] [Work logged] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?focusedWorklogId=498473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498473 ] ASF GitHub Bot logged work on HIVE-24244: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:10 Start Date: 09/Oct/20 14:10 Worklog Time Spent: 10m Work Description: pkumarsinha commented on a change in pull request #1563: URL: https://github.com/apache/hive/pull/1563#discussion_r502202414 ## File path: ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestAtlasDumpTask.java ## @@ -96,4 +105,13 @@ public void testAtlasDumpMetrics() throws Exception { Assert.assertTrue(eventDetailsCaptor .getAllValues().get(1).toString().contains("{\"dbName\":\"srcDB\",\"dumpEndTime\"")); } + + @Test + public void testAtlasRestClientBuilder() throws SemanticException, IOException { +mockStatic(UserGroupInformation.class); + when(UserGroupInformation.getLoginUser()).thenReturn(mock(UserGroupInformation.class)); +AtlasRestClientBuilder atlasRestClientBuilder = new AtlasRestClientBuilder("http://localhost:31000"); +AtlasRestClient atlasClient = atlasRestClientBuilder.getClient(conf); +Assert.assertTrue(atlasClient != null); Review comment: HiveConf is mocked, so hive in test is not present (so false). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498473) Time Spent: 1h (was: 50m) > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=498480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498480 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:10 Start Date: 09/Oct/20 14:10 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502273628 ## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java ## @@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException { // Poll on the operation status, till the operation is complete do { try { +if (Thread.currentThread().isInterrupted()) { + throw new SQLException("Interrupted while polling on the operation status", "70100"); Review comment: I think this error message and code should be placed in the `ErrorMsg` class This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498480) Time Spent: 1h (was: 50m) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > If a HiveStatement is run asynchronously as a task, like in a thread or future, > and we interrupt the task, the HiveStatement would continue to poll on the > operation state until it finishes. It may be better to provide a way to abort the > execution in such cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24241) Enable SharedWorkOptimizer to merge downstream operators after an optimization step
[ https://issues.apache.org/jira/browse/HIVE-24241?focusedWorklogId=498437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498437 ] ASF GitHub Bot logged work on HIVE-24241: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:07 Start Date: 09/Oct/20 14:07 Worklog Time Spent: 10m Work Description: kgyrtkirk opened a new pull request #1562: URL: https://github.com/apache/hive/pull/1562 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498437) Time Spent: 20m (was: 10m) > Enable SharedWorkOptimizer to merge downstream operators after an > optimization step > --- > > Key: HIVE-24241 > URL: https://issues.apache.org/jira/browse/HIVE-24241 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24224) Fix skipping header/footer for Hive on Tez on compressed files
[ https://issues.apache.org/jira/browse/HIVE-24224?focusedWorklogId=498381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498381 ] ASF GitHub Bot logged work on HIVE-24224: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:02 Start Date: 09/Oct/20 14:02 Worklog Time Spent: 10m Work Description: kgyrtkirk closed pull request #1546: URL: https://github.com/apache/hive/pull/1546 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498381) Time Spent: 50m (was: 40m) > Fix skipping header/footer for Hive on Tez on compressed files > -- > > Key: HIVE-24224 > URL: https://issues.apache.org/jira/browse/HIVE-24224 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Compressed file with Hive on Tez returns header and footers - for both > select * and select count ( * ): > {noformat} > printf "offset,id,other\n9,\"20200315 X00 1356\",123\n17,\"20200315 X00 > 1357\",123\nrst,rst,rst" > data.csv > hdfs dfs -put -f data.csv /apps/hive/warehouse/bz2test/bz2tbl1/ > bzip2 -f data.csv > hdfs dfs -put -f data.csv.bz2 /apps/hive/warehouse/bz2test/bz2tbl2/ > beeline -e "CREATE EXTERNAL TABLE default.bz2tst2 ( > sequence int, > id string, > other string) > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' > LOCATION '/apps/hive/warehouse/bz2test/bz2tbl2' > TBLPROPERTIES ( > 'skip.header.line.count'='1', > 'skip.footer.line.count'='1');" > beeline -e " > SET hive.fetch.task.conversion = none; > SELECT * FROM default.bz2tst2;"
> +-------------------+---------------------+----------------+
> | bz2tst2.sequence  | bz2tst2.id          | bz2tst2.other  |
> +-------------------+---------------------+----------------+
> | offset            | id                  | other          |
> | 9                 | 20200315 X00 1356   | 123            |
> | 17                | 20200315 X00 1357   | 123            |
> | rst               | rst                 | rst            |
> +-------------------+---------------------+----------------+
> {noformat} > PS: HIVE-22769 addressed the issue for Hive on LLAP. -- This message was sent by Atlassian Jira (v8.3.4#803005)
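The repro above comes down to how `skip.header.line.count`/`skip.footer.line.count` behave on a single-split input: a bzip2-compressed file is read as one split, so the reader itself must drop the first and last lines. The intended semantics can be sketched in isolation (this is an illustrative model, not Hive's actual record-reader code):

```java
import java.util.ArrayList;
import java.util.List;

class HeaderFooterSkip {
    // Illustrative model of skip.header.line.count=H / skip.footer.line.count=F
    // applied to the rows of a single split: keep rows[H .. size-F).
    static List<String> skip(List<String> rows, int header, int footer) {
        if (rows.size() <= header + footer) {
            return List.of(); // nothing survives the header+footer trim
        }
        return new ArrayList<>(rows.subList(header, rows.size() - footer));
    }
}
```

With the four lines produced by the `printf` in the repro and header=footer=1, only the two data rows should survive; the bug report shows the compressed table returning all four instead.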
[jira] [Work logged] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?focusedWorklogId=498364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498364 ] ASF GitHub Bot logged work on HIVE-24244: - Author: ASF GitHub Bot Created on: 09/Oct/20 14:01 Start Date: 09/Oct/20 14:01 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1563: URL: https://github.com/apache/hive/pull/1563#discussion_r502163365 ## File path: ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestAtlasDumpTask.java ## @@ -96,4 +105,13 @@ public void testAtlasDumpMetrics() throws Exception { Assert.assertTrue(eventDetailsCaptor .getAllValues().get(1).toString().contains("{\"dbName\":\"srcDB\",\"dumpEndTime\"")); } + + @Test + public void testAtlasRestClientBuilder() throws SemanticException, IOException { +mockStatic(UserGroupInformation.class); + when(UserGroupInformation.getLoginUser()).thenReturn(mock(UserGroupInformation.class)); +AtlasRestClientBuilder atlasRestClientBuilder = new AtlasRestClientBuilder("http://localhost:31000"); +AtlasRestClient atlasClient = atlasRestClientBuilder.getClient(conf); +Assert.assertTrue(atlasClient != null); Review comment: hive in test repl is set to true. It will return a No Op Client which will never be null. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498364) Time Spent: 50m (was: 40m) > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=498292&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498292 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:55 Start Date: 09/Oct/20 13:55 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1271: URL: https://github.com/apache/hive/pull/1271 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498292) Time Spent: 5.5h (was: 5h 20m) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5.5h > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( > 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ). While dropping a partition we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 > ), which is incompatible with the deserialization happening in > PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 > ), hence the query fails with "Failed to deserialize the expression".
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498282 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:54 Start Date: 09/Oct/20 13:54 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1565: URL: https://github.com/apache/hive/pull/1565 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498282) Time Spent: 1h 10m (was: 1h) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=498239&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498239 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:51 Start Date: 09/Oct/20 13:51 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502403149 ## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java ## @@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException { // Poll on the operation status, till the operation is complete do { try { +if (Thread.currentThread().isInterrupted()) { + throw new SQLException("Interrupted while polling on the operation status", "70100"); Review comment: Done, thank you very much! @kgyrtkirk This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498239) Time Spent: 50m (was: 40m) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > If a HiveStatement is run asynchronously as a task, like in a thread or future, > and we interrupt the task, the HiveStatement would continue to poll on the > operation state until it finishes. It may be better to provide a way to abort the > execution in such cases. -- This message was sent by Atlassian Jira (v8.3.4#803005)
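The patch quoted in the diff above adds an interrupt check at the top of the polling loop in HiveStatement.waitForOperationToComplete(). A stripped-down sketch of the same pattern (not the real HiveStatement — the status supplier stands in for the Thrift status call; SQLSTATE "70100" is the code used in the PR):

```java
import java.sql.SQLException;
import java.util.function.IntSupplier;

class InterruptiblePoller {
    // Poll until statusCheck reports completion (returns 0), but abort with
    // SQLSTATE 70100 as soon as the calling thread's interrupt flag is set,
    // instead of spinning until the server-side operation finishes.
    static int pollUntilComplete(IntSupplier statusCheck) throws SQLException {
        int polls = 0;
        do {
            if (Thread.currentThread().isInterrupted()) {
                throw new SQLException("Interrupted while polling on the operation status", "70100");
            }
            polls++;
        } while (statusCheck.getAsInt() != 0); // 0 = operation complete
        return polls;
    }
}
```

The key design point is that the check reads the interrupt *flag* without clearing it, so a caller that cancels the task can still observe the interruption after the SQLException propagates.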
[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called
[ https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=498199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498199 ] ASF GitHub Bot logged work on HIVE-21052: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:48 Start Date: 09/Oct/20 13:48 Worklog Time Spent: 10m Work Description: deniskuzZ commented on pull request #1548: URL: https://github.com/apache/hive/pull/1548#issuecomment-705545322 looks like master is broken right now This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498199) Time Spent: 10h 40m (was: 10.5h) > Make sure transactions get cleaned if they are aborted before addPartitions > is called > - > > Key: HIVE-21052 > URL: https://issues.apache.org/jira/browse/HIVE-21052 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0, 3.1.1 >Reporter: Jaume M >Assignee: Jaume M >Priority: Critical > Labels: pull-request-available > Attachments: Aborted Txn w_Direct Write.pdf, HIVE-21052.1.patch, > HIVE-21052.10.patch, HIVE-21052.11.patch, HIVE-21052.12.patch, > HIVE-21052.2.patch, HIVE-21052.3.patch, HIVE-21052.4.patch, > HIVE-21052.5.patch, HIVE-21052.6.patch, HIVE-21052.7.patch, > HIVE-21052.8.patch, HIVE-21052.9.patch > > Time Spent: 10h 40m > Remaining Estimate: 0h > > If the transaction is aborted between openTxn and addPartitions and data has > been written on the table the transaction manager will think it's an empty > transaction and no cleaning will be done. > This is currently an issue in the streaming API and in micromanaged tables. 
> As proposed by [~ekoifman] this can be solved by: > * Writing an entry with a special marker to TXN_COMPONENTS at openTxn and > when addPartitions is called remove this entry from TXN_COMPONENTS and add > the corresponding partition entry to TXN_COMPONENTS. > * If the cleaner finds an entry with a special marker in TXN_COMPONENTS that > specifies that a transaction was opened and it was aborted it must generate > jobs for the worker for every possible partition available. > cc [~ewohlstadter] -- This message was sent by Atlassian Jira (v8.3.4#803005)
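The marker-row lifecycle proposed above can be sketched as a toy in-memory model. This is not Hive code — TXN_COMPONENTS is really a metastore table, and the marker value here is hypothetical — but it shows the three states: marker written at openTxn, marker swapped for partition rows at addPartitions, and a surviving marker signalling an abort-before-addPartitions that needs a full clean.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TxnComponentsSketch {
    static final String MARKER = "_ENTIRE_TABLE_"; // hypothetical marker value
    final Map<Long, List<String>> txnComponents = new HashMap<>();

    // openTxn immediately records the special marker row for the txn.
    void openTxn(long txnId) {
        txnComponents.put(txnId, new ArrayList<>(List.of(MARKER)));
    }

    // addPartitions swaps the marker for the real partition rows.
    void addPartitions(long txnId, List<String> partitions) {
        List<String> rows = txnComponents.get(txnId);
        rows.remove(MARKER);
        rows.addAll(partitions);
    }

    // Cleaner's view: a surviving marker on an aborted txn means the abort
    // happened before addPartitions, so every possible partition must be cleaned.
    boolean needsFullTableClean(long abortedTxnId) {
        return txnComponents.getOrDefault(abortedTxnId, List.of()).contains(MARKER);
    }
}
```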
[jira] [Work logged] (HIVE-21611) Date.getTime() can be changed to System.currentTimeMillis()
[ https://issues.apache.org/jira/browse/HIVE-21611?focusedWorklogId=498188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498188 ] ASF GitHub Bot logged work on HIVE-21611: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:47 Start Date: 09/Oct/20 13:47 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1334: URL: https://github.com/apache/hive/pull/1334 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498188) Time Spent: 2.5h (was: 2h 20m) > Date.getTime() can be changed to System.currentTimeMillis() > --- > > Key: HIVE-21611 > URL: https://issues.apache.org/jira/browse/HIVE-21611 > Project: Hive > Issue Type: Bug >Reporter: bd2019us >Assignee: Hunter Logan >Priority: Major > Labels: pull-request-available > Attachments: 1.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > Hello, > I found that System.currentTimeMillis() can be used here instead of new > Date().getTime(). > Since new Date() is a thin wrapper around the lightweight method > System.currentTimeMillis(), performance will be greatly damaged if it is > invoked too many times. > According to my local testing in the same environment, > System.currentTimeMillis() can achieve a speedup of up to 5x (435 ms vs 2073 > ms) when these two methods are invoked 5,000,000 times. -- This message was sent by Atlassian Jira (v8.3.4#803005)
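The equivalence claimed above is easy to check: java.util.Date's no-arg constructor just captures System.currentTimeMillis(), so the two expressions read the same clock and differ only by one short-lived allocation per call. A minimal illustration:

```java
import java.util.Date;

class TimeMillisDemo {
    // Allocates a Date only to read the timestamp it captured at construction.
    static long viaDate() {
        return new Date().getTime();
    }

    // Reads the same clock directly, with no per-call allocation.
    static long direct() {
        return System.currentTimeMillis();
    }
}
```

Both return the current epoch milliseconds; in a hot loop the direct form avoids allocator and GC pressure, which is the whole substance of the ticket.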
[jira] [Assigned] (HIVE-24253) HMS needs to support keystore/truststores types besides JKS
[ https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen reassigned HIVE-24253: --- > HMS needs to support keystore/truststores types besides JKS > --- > > Key: HIVE-24253 > URL: https://issues.apache.org/jira/browse/HIVE-24253 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > > When HiveMetaStoreClient connects to HMS with enabled SSL, HMS should support > the default keystore type specified for the JDK and not always use JKS. Same > as HIVE-23958 for hive, HMS should support to set additional > keystore/truststore types used for different applications like for FIPS > crypto algorithms. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24225) FIX S3A recordReader policy selection
[ https://issues.apache.org/jira/browse/HIVE-24225?focusedWorklogId=498058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498058 ] ASF GitHub Bot logged work on HIVE-24225: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:36 Start Date: 09/Oct/20 13:36 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #1547: URL: https://github.com/apache/hive/pull/1547#issuecomment-705446922 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498058) Time Spent: 1.5h (was: 1h 20m) > FIX S3A recordReader policy selection > - > > Key: HIVE-24225 > URL: https://issues.apache.org/jira/browse/HIVE-24225 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Dynamic S3A recordReader policy selection can cause issues on lazy > initialized FS objects -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24203) Implement stats annotation rule for the LateralViewJoinOperator
[ https://issues.apache.org/jira/browse/HIVE-24203?focusedWorklogId=498075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498075 ] ASF GitHub Bot logged work on HIVE-24203: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:37 Start Date: 09/Oct/20 13:37 Worklog Time Spent: 10m Work Description: okumin commented on a change in pull request #1531: URL: https://github.com/apache/hive/pull/1531#discussion_r501689451

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java

@@ -2921,6 +2920,97 @@

  /**
   * LateralViewJoinOperator changes the data size and column level statistics.
   *
   * A diagram of LATERAL VIEW.
   *
   *     [Lateral View Forward]
   *           /        \
   *      [Select]   [Select]
   *          |          |
   *          |        [UDTF]
   *           \        /
   *     [Lateral View Join]
   *
   * For each row of the source, the left branch just picks columns and the right branch processes UDTF.
   * And then LVJ joins a row from the left branch with rows from the right branch.
   * The join has one-to-many relationship since UDTF can generate multiple rows.
   *
   * This rule multiplies the stats from the left branch by T(right) / T(left) and sums up the both sides.
   */
  public static class LateralViewJoinStatsRule extends DefaultStatsRule implements SemanticNodeProcessor {
    @Override
    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
                          Object... nodeOutputs) throws SemanticException {
      final LateralViewJoinOperator lop = (LateralViewJoinOperator) nd;
      final AnnotateStatsProcCtx aspCtx = (AnnotateStatsProcCtx) procCtx;
      final HiveConf conf = aspCtx.getConf();

      if (!isAllParentsContainStatistics(lop)) {
        return null;
      }

      final List<Operator<? extends OperatorDesc>> parents = lop.getParentOperators();
      if (parents.size() != 2) {
        LOG.warn("LateralViewJoinOperator should have just two parents but actually has "
            + parents.size() + " parents.");
        return null;
      }

      final Statistics selectStats = parents.get(LateralViewJoinOperator.SELECT_TAG).getStatistics();
      final Statistics udtfStats = parents.get(LateralViewJoinOperator.UDTF_TAG).getStatistics();

      final double factor = (double) udtfStats.getNumRows() / (double) selectStats.getNumRows();
      final long selectDataSize = StatsUtils.safeMult(selectStats.getDataSize(), factor);
      final long dataSize = StatsUtils.safeAdd(selectDataSize, udtfStats.getDataSize());
      Statistics joinedStats = new Statistics(udtfStats.getNumRows(), dataSize, 0, 0);

      if (satisfyPrecondition(selectStats) && satisfyPrecondition(udtfStats)) {
        final Map<String, ExprNodeDesc> columnExprMap = lop.getColumnExprMap();
        final RowSchema schema = lop.getSchema();

        joinedStats.updateColumnStatsState(selectStats.getColumnStatsState());
        final List<ColStatistics> selectColStats = StatsUtils
            .getColStatisticsFromExprMap(conf, selectStats, columnExprMap, schema);
        joinedStats.addToColumnStats(multiplyColStats(selectColStats, factor));

        joinedStats.updateColumnStatsState(udtfStats.getColumnStatsState());
        final List<ColStatistics> udtfColStats = StatsUtils
            .getColStatisticsFromExprMap(conf, udtfStats, columnExprMap, schema);
        joinedStats.addToColumnStats(udtfColStats);

        joinedStats = applyRuntimeStats(aspCtx.getParseContext().getContext(), joinedStats, lop);
        lop.setStatistics(joinedStats);

        if (LOG.isDebugEnabled()) {
          LOG.debug("[0] STATS-" + lop.toString() + ": " + joinedStats.extendedToString());
        }

Review comment: I also agree and I did that. https://github.com/apache/hive/pull/1531/commits/d333d5d70184a1cf1f0c0f239e9229965e486202
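The stats rule discussed above multiplies the left-branch stats by T(right) / T(left) and sums both sides. A toy sketch of that arithmetic with illustrative numbers (a simplified stand-in for the real StatsRulesProcFactory code, without the safeMult/safeAdd overflow guards):

```java
public class LvjStatsSketch {
    // Mirrors the data-size arithmetic described in the review comment:
    // scale the SELECT branch by T(udtf) / T(select), then add the UDTF branch.
    static long joinedDataSize(long selectRows, long selectSize,
                               long udtfRows, long udtfSize) {
        double factor = (double) udtfRows / (double) selectRows;
        long scaledSelect = (long) (selectSize * factor);
        return scaledSelect + udtfSize;
    }

    public static void main(String[] args) {
        // 100 source rows exploded into 300 rows by the UDTF: factor = 3,
        // so the 1000-byte SELECT side is scaled to 3000, plus 600 from UDTF.
        System.out.println(LvjStatsSketch.joinedDataSize(100, 1000, 300, 600)); // prints 3600
    }
}
```

The joined row count in the real rule is simply the UDTF branch's row count, since the join is one-to-many from the SELECT side.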
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498169 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:45 Start Date: 09/Oct/20 13:45 Worklog Time Spent: 10m Work Description: kasakrisz commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706053133 I ran this test locally using current master and got the same result as in this PR. Let's wait for the flaky test run results. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498169) Time Spent: 1h (was: 50m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23948) Improve Query Results Cache
[ https://issues.apache.org/jira/browse/HIVE-23948?focusedWorklogId=498163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498163 ] ASF GitHub Bot logged work on HIVE-23948: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:45 Start Date: 09/Oct/20 13:45 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1335: URL: https://github.com/apache/hive/pull/1335 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498163) Time Spent: 1h (was: 50m) > Improve Query Results Cache > --- > > Key: HIVE-23948 > URL: https://issues.apache.org/jira/browse/HIVE-23948 > Project: Hive > Issue Type: Improvement >Reporter: Hunter Logan >Assignee: Hunter Logan >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Creating a Jira for this github PR from before github was actively used > [https://github.com/apache/hive/pull/652] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called
[ https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=498134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498134 ] ASF GitHub Bot logged work on HIVE-21052: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:42 Start Date: 09/Oct/20 13:42 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #1548: URL: https://github.com/apache/hive/pull/1548#discussion_r499754423

## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java

@@ -2839,6 +2848,87 @@

  public static void setNonTransactional(Map<String, String> tblProps) {
    tblProps.remove(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
  }

  /**
   * Look for delta directories matching the list of writeIds and delete them.
   * @param rootPartition root partition to look for the delta directories
   * @param conf configuration
   * @param writeIds list of writeIds to look for in the delta directories
   * @return list of deleted directories.
   * @throws IOException
   */
  public static List<Path> deleteDeltaDirectories(Path rootPartition, Configuration conf, Set<Long> writeIds)
      throws IOException {
    FileSystem fs = rootPartition.getFileSystem(conf);

    PathFilter filter = (p) -> {
      String name = p.getName();
      for (Long wId : writeIds) {
        if (name.startsWith(deltaSubdir(wId, wId)) && !name.contains("=")) {

Review comment: changed, included delete_delta as well

## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java

@@ -97,9 +100,9 @@

    long minOpenTxnId = txnHandler.findMinOpenTxnIdForCleaner();
    LOG.info("Cleaning based on min open txn id: " + minOpenTxnId);
    List<CompletableFuture<Void>> cleanerList = new ArrayList<>();
    for (CompactionInfo compactionInfo : txnHandler.findReadyToClean()) {
      cleanerList.add(CompletableFuture.runAsync(CompactorUtil.ThrowingRunnable.unchecked(() ->
          clean(compactionInfo, minOpenTxnId)), cleanerExecutor));
    }

Review comment: 1. In original patch Map tableLock = new ConcurrentHashMap<>() was used to prevent a concurrent p-clean (where the whole table will be scanned). I think, that is resolved by grouping p-cleans and recording list of writeIds that needs to be removed. @vpnvishv is that correct? Also we do not allow concurrent Cleaners, their execution is mutexed. 2. was related to the following issue based on Map tableLock = new Conc
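The PathFilter in the review above matches delta directories by write id while skipping partition directories (whose names contain `=`). A simplified, name-only sketch; the zero-padded `delta_%07d_%07d` layout is an assumption about what `AcidUtils.deltaSubdir` produces, and the real filter operates on Path objects:

```java
import java.util.Set;

public class DeltaFilterSketch {
    // Assumed stand-in for AcidUtils.deltaSubdir(wId, wId): delta dirs are
    // named with zero-padded min/max write ids, e.g. delta_0000042_0000042.
    static String deltaSubdir(long writeId) {
        return String.format("delta_%07d_%07d", writeId, writeId);
    }

    // Mirrors the filter logic: accept delta dirs for the given write ids,
    // reject partition dirs (their names contain '=').
    static boolean matches(String dirName, Set<Long> writeIds) {
        if (dirName.contains("=")) {
            return false;
        }
        for (Long wId : writeIds) {
            if (dirName.startsWith(deltaSubdir(wId))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<Long> ids = Set.of(42L);
        System.out.println(matches("delta_0000042_0000042", ids)); // true
        System.out.println(matches("part=1", ids));                // false
    }
}
```

Per the review comment, the committed version also matches `delete_delta` directories, which this sketch omits.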
[jira] [Work logged] (HIVE-23800) Add hooks when HiveServer2 stops due to OutOfMemoryError
[ https://issues.apache.org/jira/browse/HIVE-23800?focusedWorklogId=498131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498131 ] ASF GitHub Bot logged work on HIVE-23800: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:42 Start Date: 09/Oct/20 13:42 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1205: URL: https://github.com/apache/hive/pull/1205#discussion_r502158480 ## File path: ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java ## @@ -45,7 +47,50 @@ public class HookContext { static public enum HookType { -PRE_EXEC_HOOK, POST_EXEC_HOOK, ON_FAILURE_HOOK + Review comment: Checked on my test and production env, it shows that the hooks compiled for the old api can be reused without any changes with the new implementation. ## File path: ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java ## @@ -39,57 +36,27 @@ import org.apache.hadoop.hive.ql.parse.HiveSemanticAnalyzerHook; import org.apache.hadoop.hive.ql.parse.HiveSemanticAnalyzerHookContext; import org.apache.hadoop.hive.ql.session.SessionState; -import org.apache.hadoop.hive.ql.session.SessionState.LogHelper; import org.apache.hive.common.util.HiveStringUtils; +import static org.apache.hadoop.hive.ql.hooks.HookContext.HookType.*; + /** * Handles hook executions for {@link Driver}. */ public class HookRunner { private static final String CLASS_NAME = Driver.class.getName(); private final HiveConf conf; - private LogHelper console; - private List queryHooks = new ArrayList<>(); - private List saHooks = new ArrayList<>(); - private List driverRunHooks = new ArrayList<>(); - private List preExecHooks = new ArrayList<>(); - private List postExecHooks = new ArrayList<>(); - private List onFailureHooks = new ArrayList<>(); - private boolean initialized = false; + private final HooksLoader loader; Review comment: Rename it to HiveHooks instead. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498131) Time Spent: 5.5h (was: 5h 20m) > Add hooks when HiveServer2 stops due to OutOfMemoryError > > > Key: HIVE-23800 > URL: https://issues.apache.org/jira/browse/HIVE-23800 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 5.5h > Remaining Estimate: 0h > > Make the OOM hook an interface of HiveServer2, so users can implement the hook to > do something before HS2 stops, such as dumping the heap or alerting the > devops. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=498110&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498110 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:40 Start Date: 09/Oct/20 13:40 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706028333 @kasakrisz could you please take a look? I think we have the same resultset in a different order... so SORT_QUERY_RESULTS should have ignored this difference This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498110) Time Spent: 50m (was: 40m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?focusedWorklogId=498050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498050 ] ASF GitHub Bot logged work on HIVE-24244: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:31 Start Date: 09/Oct/20 13:31 Worklog Time Spent: 10m Work Description: pkumarsinha opened a new pull request #1563: URL: https://github.com/apache/hive/pull/1563 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498050) Time Spent: 40m (was: 0.5h) > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=498048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498048 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:30 Start Date: 09/Oct/20 13:30 Worklog Time Spent: 10m Work Description: shameersss1 commented on pull request #1271: URL: https://github.com/apache/hive/pull/1271#issuecomment-705332460 @kgyrtkirk Thank you for taking the time to review this! Yes, +1 from my side for deprecating the Kryo stuff. The string-based approach is cool, but I am not sure how easy or difficult it will be to make the changes. I will try to explore this from my side as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498048) Time Spent: 5h 20m (was: 5h 10m) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5h 20m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition paths > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) 
~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), While dropping partition we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Ms
[jira] [Work logged] (HIVE-24242) Relax safety checks in SharedWorkOptimizer
[ https://issues.apache.org/jira/browse/HIVE-24242?focusedWorklogId=498021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498021 ] ASF GitHub Bot logged work on HIVE-24242: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:28 Start Date: 09/Oct/20 13:28 Worklog Time Spent: 10m Work Description: kgyrtkirk opened a new pull request #1564: URL: https://github.com/apache/hive/pull/1564 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498021) Time Spent: 20m (was: 10m) > Relax safety checks in SharedWorkOptimizer > -- > > Key: HIVE-24242 > URL: https://issues.apache.org/jira/browse/HIVE-24242 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > there are some checks to lock out problematic cases > For UnionOperator > [here|https://github.com/apache/hive/blob/1507d80fd47aad38b87bba4fd58c1427ba89dbbf/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java#L1571] > This check could prevent the optimization even if the Union is only visible > from only 1 of the TS ops. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24236) Connection leak in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-24236?focusedWorklogId=498017&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-498017 ] ASF GitHub Bot logged work on HIVE-24236: - Author: ASF GitHub Bot Created on: 09/Oct/20 13:27 Start Date: 09/Oct/20 13:27 Worklog Time Spent: 10m Work Description: yongzhi merged pull request #1559: URL: https://github.com/apache/hive/pull/1559 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 498017) Time Spent: 1.5h (was: 1h 20m) > Connection leak in TxnHandler > - > > Key: HIVE-24236 > URL: https://issues.apache.org/jira/browse/HIVE-24236 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > We see failures in QE tests with cannot allocate connections errors. 
The > exception stack like following: > {noformat} > 2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler > (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in > heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, > general error (SQLState=null, ErrorCode=0) > 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable > to select from transaction database > org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general > error > at > org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) > at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) > at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) > at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) > at > 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) > at > org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.InterruptedException > at java.lang.Object.wait(Native Method) > at > org.apache.commons.pool.impl.GenericObjectPool.borrow
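The connection-leak class of bug described in HIVE-24236 above can be illustrated with a toy pool counter: the fix is to guarantee that a borrowed connection is returned on every code path, which try-with-resources gives for free. This is a generic sketch, not the actual TxnHandler code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLeakSketch {
    // Counts how many toy "connections" are still checked out of the pool.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Toy pooled connection: borrowing increments the counter, closing
    // decrements it, so a leak would leave OPEN > 0.
    static class Conn implements AutoCloseable {
        Conn() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    static void heartbeat(boolean fail) {
        try (Conn c = new Conn()) { // closed on all paths, including throws
            if (fail) {
                throw new RuntimeException("heartbeat failed");
            }
        } catch (RuntimeException e) {
            // swallowed for the demo; the connection was already returned
        }
    }

    public static void main(String[] args) {
        heartbeat(false);
        heartbeat(true);
        System.out.println("open connections: " + OPEN.get()); // prints 0
    }
}
```

Without try-with-resources (or an equivalent finally block), the failing heartbeat path would leave a connection checked out, and a busy pool eventually hits the "Cannot get a connection" error in the stack trace above.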
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=497950&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497950 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 12:48 Start Date: 09/Oct/20 12:48 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502403149

## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java

@@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException {
    // Poll on the operation status, till the operation is complete
    do {
      try {
+       if (Thread.currentThread().isInterrupted()) {
+         throw new SQLException("Interrupted while polling on the operation status", "70100");

Review comment: Done, thank you very much! @kgyrtkirk This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497950) Time Spent: 40m (was: 0.5h) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > If a HiveStatement runs asynchronously as a task, such as in a thread or future, > and we interrupt the task, the HiveStatement continues to poll on the > operation state until it finishes. It may be better to provide a way to abort the > execution in such a case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
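The change under review adds an interrupt check inside the status-polling loop. A standalone sketch of the same pattern; note the real HiveStatement throws SQLException with SQLState "70100", while IllegalStateException stands in here to keep the example self-contained:

```java
import java.util.function.IntSupplier;

public class PollInterruptSketch {
    // Poll a status source until it reports completion, but bail out as soon
    // as the calling thread is interrupted instead of spinning forever.
    static int pollUntilDone(IntSupplier statusCheck) {
        int status;
        do {
            if (Thread.currentThread().isInterrupted()) {
                // Real code: throw new SQLException("...", "70100")
                throw new IllegalStateException(
                    "Interrupted while polling on the operation status");
            }
            status = statusCheck.getAsInt(); // 0 = still running, 1 = finished
        } while (status == 0);
        return status;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Reports "running" twice, then "finished" on the third poll.
        int result = pollUntilDone(() -> ++calls[0] >= 3 ? 1 : 0);
        System.out.println(result + " after " + calls[0] + " polls"); // prints 1 after 3 polls
    }
}
```

Checking `isInterrupted()` (rather than `Thread.interrupted()`) leaves the interrupt flag set, so callers further up the stack can still observe it.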
[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210869#comment-17210869 ] Hyukjin Kwon commented on HIVE-16391: - SPARK-20202 is resolved now. Spark does not use Hive 1.2 fork anymore, and does not need 1.2.x release. I am tentatively resolving this ticket. > Publish proper Hive 1.2 jars (without including all dependencies in uber jar) > - > > Key: HIVE-16391 > URL: https://issues.apache.org/jira/browse/HIVE-16391 > Project: Hive > Issue Type: Task > Components: Build Infrastructure >Affects Versions: 1.2.2 >Reporter: Reynold Xin >Assignee: Saisai Shao >Priority: Major > Labels: pull-request-available > Fix For: 1.2.3 > > Attachments: HIVE-16391.1.patch, HIVE-16391.2.patch, HIVE-16391.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the > only change in the fork is to work around the issue that Hive publishes only > two sets of jars: one set with no dependency declared, and another with all > the dependencies included in the published uber jar. That is to say, Hive > doesn't publish a set of jars with the proper dependencies declared. > There is general consensus on both sides that we should remove the forked > Hive. > The change in the forked version is recorded here > https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2 > Note that the fork in the past included other fixes but those have all become > unnecessary. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
[ https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated HIVE-16391: Resolution: Not A Problem Status: Resolved (was: Patch Available) > Publish proper Hive 1.2 jars (without including all dependencies in uber jar) > - > > Key: HIVE-16391 > URL: https://issues.apache.org/jira/browse/HIVE-16391 > Project: Hive > Issue Type: Task > Components: Build Infrastructure >Affects Versions: 1.2.2 >Reporter: Reynold Xin >Assignee: Saisai Shao >Priority: Major > Labels: pull-request-available > Fix For: 1.2.3 > > Attachments: HIVE-16391.1.patch, HIVE-16391.2.patch, HIVE-16391.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the > only change in the fork is to work around the issue that Hive publishes only > two sets of jars: one set with no dependency declared, and another with all > the dependencies included in the published uber jar. That is to say, Hive > doesn't publish a set of jars with the proper dependencies declared. > There is general consensus on both sides that we should remove the forked > Hive. > The change in the forked version is recorded here > https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2 > Note that the fork in the past included other fixes but those have all become > unnecessary. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24252) Improve decision model for using semijoin reducers
[ https://issues.apache.org/jira/browse/HIVE-24252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24252: -- > Improve decision model for using semijoin reducers > -- > > Key: HIVE-24252 > URL: https://issues.apache.org/jira/browse/HIVE-24252 > Project: Hive > Issue Type: Improvement >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > > After a few experiments with the TPC-DS 10TB dataset, we observed that in some > cases semijoin reducers were not effective; they didn't reduce the number of > records, or they reduced the relation only a tiny bit. > In some cases we can make the semijoin reducer more effective by adding more > columns, but this also requires a bigger bloom filter, so the decision on the > number of columns to include in the bloom becomes more delicate. > The current decision model always chooses multi-column semijoin reducers when > they are available, but this may not always be beneficial if a single column > alone can significantly reduce the target relation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24251) Improve bloom filter size estimation for multi column semijoin reducers
[ https://issues.apache.org/jira/browse/HIVE-24251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24251: -- > Improve bloom filter size estimation for multi column semijoin reducers > --- > > Key: HIVE-24251 > URL: https://issues.apache.org/jira/browse/HIVE-24251 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > > There are various cases where the expected size of the bloom filter is > largely underestimated, making the semijoin reducer completely ineffective. > This is more relevant for multi-column semijoin reducers, since the current > [code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273] > does not take them into account. -- This message was sent by Atlassian Jira (v8.3.4#803005)
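For context on why the underestimation in HIVE-24251 matters: the standard Bloom filter sizing formula is m = -n ln(p) / (ln 2)^2 bits for n expected entries at false-positive rate p, so underestimating n directly shrinks the filter and inflates its false-positive rate. The sketch below illustrates the multi-column case, where the combined key can have up to the product of the per-column NDVs; this NDV combination rule is an assumption for illustration only, not the estimator GenericUDAFBloomFilter actually uses.

```java
// Illustrative Bloom filter sizing: the standard bits formula plus a
// simple (assumed) upper bound on distinct values of a multi-column key.
public class BloomFilterSizing {
    /** Optimal number of bits for n expected entries at false-positive rate p. */
    static long optimalNumBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /**
     * Upper bound on the number of distinct combined keys for a
     * multi-column key: the product of the per-column NDVs.
     */
    static long combinedNdvUpperBound(long... perColumnNdv) {
        long ndv = 1;
        for (long v : perColumnNdv) {
            ndv = Math.multiplyExact(ndv, v);
        }
        return ndv;
    }

    public static void main(String[] args) {
        // Sizing from one column's NDV vs. the combined key's upper bound:
        long singleCol = optimalNumBits(1_000, 0.05);
        long multiCol = optimalNumBits(combinedNdvUpperBound(1_000, 500), 0.05);
        System.out.println("bits sized for 1k single-column keys: " + singleCol);
        System.out.println("bits sized for the 1k x 500 combined key: " + multiCol);
    }
}
```

The gap between the two sizes is the failure mode the ticket describes: a filter sized for one column's NDV saturates once the real multi-column key space is much larger, and the semijoin reducer stops filtering anything.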
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497884 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 11:10 Start Date: 09/Oct/20 11:10 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706119583 @kasakrisz @kgyrtkirk The flaky test runs successfully: http://ci.hive.apache.org/job/hive-flaky-check/126/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497884) Time Spent: 40m (was: 0.5h) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24244) NPE during Atlas metadata replication
[ https://issues.apache.org/jira/browse/HIVE-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210834#comment-17210834 ] Aasha Medhi commented on HIVE-24244: +1 > NPE during Atlas metadata replication > - > > Key: HIVE-24244 > URL: https://issues.apache.org/jira/browse/HIVE-24244 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24244.01.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24234) Improve checkHashModeEfficiency in VectorGroupByOperator
[ https://issues.apache.org/jira/browse/HIVE-24234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned HIVE-24234: --- Assignee: Rajesh Balamohan > Improve checkHashModeEfficiency in VectorGroupByOperator > > > Key: HIVE-24234 > URL: https://issues.apache.org/jira/browse/HIVE-24234 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24234.wip.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, {{VectorGroupByOperator::checkHashModeEfficiency}} compares the > number of entries with the number of input records that have been processed. For > grouping sets, it accounts for the grouping set length as well. > The issue is that the condition becomes invalid after processing a large number of > input records, which prevents the system from switching over to streaming > mode. > E.g. assume 500,000 input records processed, with 9 grouping sets and > 100,000 entries in the hashtable. The hashtable would never cross 4,500,000 > entries, as the max size itself is 1M by default. > It would be good to compare the input records (adjusted for grouping sets) > with the number of output records (along with the size of the hashtable) to > determine hashing or streaming mode. > E.g. Q67. -- This message was sent by Atlassian Jira (v8.3.4#803005)
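The decision the ticket proposes can be sketched as follows. This is a hedged illustration of the idea, comparing grouping-set-adjusted input against the number of groups produced rather than against the capped hashtable size; the method name, parameters, and the ratio threshold are assumptions, and `VectorGroupByOperator`'s real logic differs.

```java
// Hedged sketch of a hash-vs-streaming efficiency check for a vectorized
// group-by: hashing is only worthwhile while it is actually aggregating,
// i.e. while the number of groups stays well below the adjusted input.
public class HashModeCheck {
    /**
     * Returns true if the operator should fall back to streaming mode.
     *
     * @param inputRecords      raw input records processed so far
     * @param groupingSets      grouping set count (adjusts the input count,
     *                          since each row feeds every grouping set)
     * @param groupsProduced    distinct groups emitted so far
     * @param minReductionRatio minimum fraction of rows the hash table must
     *                          absorb for hashing to stay worthwhile (assumed)
     */
    static boolean shouldSwitchToStreaming(long inputRecords, int groupingSets,
                                           long groupsProduced, double minReductionRatio) {
        long adjustedInput = inputRecords * groupingSets;
        if (adjustedInput == 0) {
            return false;
        }
        // If nearly every adjusted input row yields its own group, the hash
        // table is not reducing the data; streaming avoids its overhead.
        double reduction = 1.0 - (double) groupsProduced / adjustedInput;
        return reduction < minReductionRatio;
    }

    public static void main(String[] args) {
        // The ticket's example: 500k rows, 9 grouping sets, 100k groups.
        // Reduction is ~97.8%, so hashing stays effective: prints false.
        System.out.println(shouldSwitchToStreaming(500_000, 9, 100_000, 0.5));
        // Nearly one group per row: almost no reduction, so prints true.
        System.out.println(shouldSwitchToStreaming(100_000, 1, 95_000, 0.5));
    }
}
```

The key difference from the current check described above is that the comparison cannot go stale: the ratio is computed from groups produced rather than from a hashtable whose size is capped (1M entries by default) far below the adjusted input count.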
[jira] [Updated] (HIVE-24249) Create View fails if a materialized view exists with the same query
[ https://issues.apache.org/jira/browse/HIVE-24249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-24249: -- Description: {code:java} create table t1(col0 int) STORED AS ORC TBLPROPERTIES ('transactional'='true'); create materialized view mv1 as select * from t1 where col0 > 2; create view v1 as select sub.* from (select * from t1 where col0 > 2) sub where sub.col0 = 10; {code} The planner realize that the view definition has a subquery which match the materialized view query and replaces it to the materialized view scan. {code:java} HiveProject($f0=[CAST(10):INTEGER]) HiveFilter(condition=[=(10, $0)]) HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1]) {code} Then exception is thrown: {code:java} org.apache.hadoop.hive.ql.parse.SemanticException: View definition references materialized view default.mv1 at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211) at org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.apache.hadoo
[jira] [Commented] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210750#comment-17210750 ] Zhihua Deng commented on HIVE-24248: Thanks much for your time [~kgyrtkirk] , I added a flaky check for this: [http://ci.hive.apache.org/job/hive-flaky-check/126/] > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24249) Create View fails if a materialized view exists with the same query
[ https://issues.apache.org/jira/browse/HIVE-24249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210732#comment-17210732 ] Krisztian Kasa commented on HIVE-24249: --- A possible solution could be disabling materialized view rewrite during view creation. cc [~jcamachorodriguez] > Create View fails if a materialized view exists with the same query > --- > > Key: HIVE-24249 > URL: https://issues.apache.org/jira/browse/HIVE-24249 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create table t1(col0 int) STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > create materialized view mv1 as > select * from t1 where col0 > 2; > create view mv1 as > select sub.* from (select * from t1 where col0 > 2) sub > where sub.col0 = 10; > {code} > The planner realize that the view definition has a subquery which match the > materialized view query and replaces it to the materialized view scan. 
> {code:java} > HiveProject($f0=[CAST(10):INTEGER]) > HiveFilter(condition=[=(10, $0)]) > HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1]) > {code} > Then exception is thrown: > {code:java} > org.apache.hadoop.hive.ql.parse.SemanticException: View definition > references materialized view default.mv1 > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211) > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) > at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) > at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) > at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) > at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) > at > 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.run
[jira] [Assigned] (HIVE-24249) Create View fails if a materialized view exists with the same query
[ https://issues.apache.org/jira/browse/HIVE-24249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa reassigned HIVE-24249: - > Create View fails if a materialized view exists with the same query > --- > > Key: HIVE-24249 > URL: https://issues.apache.org/jira/browse/HIVE-24249 > Project: Hive > Issue Type: Bug >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > {code:java} > create table t1(col0 int) STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > create materialized view mv1 as > select * from t1 where col0 > 2; > create view mv1 as > select sub.* from (select * from t1 where col0 > 2) sub > where sub.col0 = 10; > {code} > The planner realize that the view definition has a subquery which match the > materialized view query and replaces it to the materialized view scan. > {code:java} > HiveProject($f0=[CAST(10):INTEGER]) > HiveFilter(condition=[=(10, $0)]) > HiveTableScan(table=[[default, mv1]], table:alias=[default.mv1]) > {code} > Then exception is thrown: > {code:java} > org.apache.hadoop.hive.ql.parse.SemanticException: View definition > references materialized view default.mv1 > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.validateCreateView(CreateViewAnalyzer.java:211) > at > org.apache.hadoop.hive.ql.ddl.view.create.CreateViewAnalyzer.analyzeInternal(CreateViewAnalyzer.java:99) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) > at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) > at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:174) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:415) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:364) > at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:358) > at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) > 
at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) > at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:203) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:129) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:355) > at > org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) > at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) > at > org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) > at > org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) > at > org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:135) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runners.Suite.runChil
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497826 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 08:42 Start Date: 09/Oct/20 08:42 Worklog Time Spent: 10m Work Description: kasakrisz commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706053133 I run this test locally using current master and got the same result like in this PR. Let's wait for flaky test run results. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497826) Time Spent: 0.5h (was: 20m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=497822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497822 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 09/Oct/20 08:34 Start Date: 09/Oct/20 08:34 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1456: URL: https://github.com/apache/hive/pull/1456#discussion_r502273628 ## File path: jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java ## @@ -360,6 +360,9 @@ TGetOperationStatusResp waitForOperationToComplete() throws SQLException { // Poll on the operation status, till the operation is complete do { try { +if (Thread.currentThread().isInterrupted()) { + throw new SQLException("Interrupted while polling on the operation status", "70100"); Review comment: I think this error message and code should be placed in the `ErrorMsg` class This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497822) Time Spent: 0.5h (was: 20m) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > If running HiveStatement asynchronously as a task like in a thread or future, > if we interrupt the task, the HiveStatement would continue to poll on the > operation state until finish. It's may better to provide a way to abort the > executing in such case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?focusedWorklogId=497813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497813 ] ASF GitHub Bot logged work on HIVE-23851: - Author: ASF GitHub Bot Created on: 09/Oct/20 08:22 Start Date: 09/Oct/20 08:22 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1271: URL: https://github.com/apache/hive/pull/1271 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497813) Time Spent: 5h 10m (was: 5h) > MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions > > > Key: HIVE-23851 > URL: https://issues.apache.org/jira/browse/HIVE-23851 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > *Steps to reproduce:* > # Create external table > # Run msck command to sync all the partitions with metastore > # Remove one of the partition path > # Run msck repair with partition filtering > *Stack Trace:* > {code:java} > 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] > ppr.PartitionExpressionForMetastore: Failed to deserialize the expression > java.lang.IndexOutOfBoundsException: Index: 110, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192] > at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192] > at > org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) > [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_192] > {code} > *Cause:* > In case of msck repair with partition filtering we expect expression proxy > class to be set as PartitionExpressionForMetastore ( > 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 > ), While dropping partition we serialize the drop partition filter > expression as ( > https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 > ) which is incompatible during deserializtion happening in > PartitionExpressionForMetastore ( > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 > ) hence the query fails with Failed to deseriali
[jira] [Resolved] (HIVE-23851) MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
[ https://issues.apache.org/jira/browse/HIVE-23851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-23851.
-
Resolution: Fixed

Merged into master. Thank you Syed Shameerur Rahman!

> MSCK REPAIR Command With Partition Filtering Fails While Dropping Partitions
> ----------------------------------------------------------------------------
>
> Key: HIVE-23851
> URL: https://issues.apache.org/jira/browse/HIVE-23851
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 5h 10m
> Remaining Estimate: 0h
>
> *Steps to reproduce:*
> # Create an external table
> # Run the msck command to sync all the partitions with the metastore
> # Remove one of the partition paths
> # Run msck repair with partition filtering
>
> *Stack trace:*
> {code:java}
> 2020-07-15T02:10:29,045 ERROR [4dad298b-28b1-4e6b-94b6-aa785b60c576 main] ppr.PartitionExpressionForMetastore: Failed to deserialize the expression
> java.lang.IndexOutOfBoundsException: Index: 110, Size: 0
> at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_192]
> at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_192]
> at org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:707) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:211) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeObjectFromKryo(SerializationUtilities.java:806) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.SerializationUtilities.deserializeExpressionFromKryo(SerializationUtilities.java:775) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.deserializeExpr(PartitionExpressionForMetastore.java:96) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.convertExprToFilter(PartitionExpressionForMetastore.java:52) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.PartFilterExprUtil.makeExpressionTree(PartFilterExprUtil.java:48) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3593) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.metastore.VerifyingObjectStore.getPartitionsByExpr(VerifyingObjectStore.java:80) [hive-standalone-metastore-server-4.0.0-SNAPSHOT-tests.jar:4.0.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_192]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_192]
> {code}
>
> *Cause:*
> For msck repair with partition filtering, the expression proxy class is expected to be PartitionExpressionForMetastore ( https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/ddl/misc/msck/MsckAnalyzer.java#L78 ). While dropping a partition, however, the drop-partition filter expression is serialized differently ( https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java#L589 ), which is incompatible with the deserialization done in PartitionExpressionForMetastore ( https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionExpressionForMetastore.java#L52 ), so the query fails with "Failed to deserialize the expression".
>
> *Solutions:*
> I could think of two approaches to this problem:
> # Since PartitionExpressionForMetastore is required only during the partition pruning step, we can switch the expression proxy class back to MsckPartitionExpressionProxy once the partition pruning step is done.
> # Alternatively, make the serialization of the msck drop-partition filter expression compatible with PartitionExpressionForMetastore. We can do this via reflection, since the drop-partition serialization happens in the Msck class (standalone-metastore); this way we can completely remove the need for the MsckPartitionExpressionProxy class.
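The reflection idea in the second approach can be sketched in isolation: call a serializer's static method by class name and method name, so the calling module needs no compile-time dependency on it. This is a minimal, hypothetical illustration only; the helper and class names below are not Hive's actual API, and the demo target is a JDK class rather than Hive's serializer.

```java
import java.lang.reflect.Method;

public class ReflectiveCallSketch {
    // Invoke a static method located only by its class and method names.
    // This is the reflection pattern from solution 2: the caller has no
    // compile-time dependency on the target class.
    public static Object invokeStatic(String className, String methodName,
                                      Class<?>[] paramTypes, Object[] args) throws Exception {
        Class<?> clazz = Class.forName(className);
        Method m = clazz.getMethod(methodName, paramTypes);
        // First argument is null because the target method is static.
        return m.invoke(null, args);
    }

    public static void main(String[] args) throws Exception {
        // Demonstrate with a JDK class; Hive would instead target the
        // serializer living in another module.
        Object result = invokeStatic("java.lang.Integer", "parseInt",
                new Class<?>[]{String.class}, new Object[]{"42"});
        System.out.println(result); // prints "42"
    }
}
```

The trade-off is the usual one with reflection: the cross-module coupling moves from the compiler to a string, so a renamed class or changed signature only fails at runtime.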
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497803&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497803 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 07:48 Start Date: 09/Oct/20 07:48 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on pull request #1565: URL: https://github.com/apache/hive/pull/1565#issuecomment-706028333 @kasakrisz could you please take a look? I think we have the same resultset in a different order... so SORT_QUERY_RESULTS should have ignored this difference This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497803) Time Spent: 20m (was: 10m) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
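For context on the comment above: SORT_QUERY_RESULTS is meant to make q-file output comparisons order-insensitive. A minimal sketch of that idea (illustrative only, not the CliDriver's actual implementation) sorts both row sets before comparing, which would have absorbed the reordered rows in the flaky diff.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedCompareSketch {
    // Compare two result sets while ignoring row order: sort both
    // lexicographically, then compare element by element.
    public static boolean equalIgnoringOrder(List<String> expected, List<String> actual) {
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }

    public static void main(String[] args) {
        // The rows from the flaky diff: same values, different order.
        List<String> expected = List.of("1\t1", "1\t2", "2\t1", "2\t2");
        List<String> actual   = List.of("2\t2", "1\t2", "2\t1", "1\t1");
        System.out.println(equalIgnoringOrder(expected, actual)); // prints "true"
    }
}
```

If a diff still appears under such a comparison, the result sets differ in content and not merely in order, which is why the test was disabled pending a flaky check rather than masked.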
[jira] [Comment Edited] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210686#comment-17210686 ] Zoltan Haindrich edited comment on HIVE-24248 at 10/9/20, 7:45 AM: --- Thank you [~dengzh] for opening this ticket - I was also about to do the same :) I've disabled this test - please also run a flaky check before enabling it back http://ci.hive.apache.org/job/hive-flaky-check/124/ was (Author: kgyrtkirk): I've disabled this test - please also run a flaky check before enabling it back http://ci.hive.apache.org/job/hive-flaky-check/124/ > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210686#comment-17210686 ] Zoltan Haindrich commented on HIVE-24248: - I've disabled this test - please also run a flaky check before enabling it back http://ci.hive.apache.org/job/hive-flaky-check/124/ > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?focusedWorklogId=497786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497786 ] ASF GitHub Bot logged work on HIVE-24248: - Author: ASF GitHub Bot Created on: 09/Oct/20 07:03 Start Date: 09/Oct/20 07:03 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1565: URL: https://github.com/apache/hive/pull/1565 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 497786) Remaining Estimate: 0h Time Spent: 10m > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24248) TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24248: -- Labels: pull-request-available (was: ) > TestMiniLlapLocalCliDriver[subquery_join_rewrite] is flaky > -- > > Key: HIVE-24248 > URL: https://issues.apache.org/jira/browse/HIVE-24248 > Project: Hive > Issue Type: Bug >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1205/26/tests] > {code:java} > java.lang.AssertionError: > Client Execution succeeded but contained differences (error code = 1) after > executing subquery_join_rewrite.q > 241,244d240 > < 1 1 > < 1 2 > < 2 1 > < 2 2 > 245a242,243 > > 2 2 > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)