[jira] [Commented] (HIVE-24988) Add support for complex types columns for Dynamic Partition pruning Optimisation

2021-04-07 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316875#comment-17316875
 ] 

Peter Vary commented on HIVE-24988:
---

Hi [~maheshk114], there is an ongoing Jira that aims to add DPP for Iceberg tables. 
If you have time, could you please take a look at HIVE-24962?

Thanks,

Peter

> Add support for complex types columns for Dynamic Partition pruning 
> Optimisation
> 
>
> Key: HIVE-24988
> URL: https://issues.apache.org/jira/browse/HIVE-24988
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> DynamicPartitionPruningOptimization fails for complex types.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24961) Use database name as default in the mapred job name for replication

2021-04-07 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-24961.

Resolution: Fixed

> Use database name as default in the mapred job name for replication
> ---
>
> Key: HIVE-24961
> URL: https://issues.apache.org/jira/browse/HIVE-24961
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Use the database name as the job name for replication.





[jira] [Commented] (HIVE-24961) Use database name as default in the mapred job name for replication

2021-04-07 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316873#comment-17316873
 ] 

Aasha Medhi commented on HIVE-24961:


+1, Committed to master. Thank you for the patch [~ayushtkn]

> Use database name as default in the mapred job name for replication
> ---
>
> Key: HIVE-24961
> URL: https://issues.apache.org/jira/browse/HIVE-24961
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Use the database name as the job name for replication.





[jira] [Work logged] (HIVE-24961) Use database name as default in the mapred job name for replication

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24961?focusedWorklogId=578918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578918
 ]

ASF GitHub Bot logged work on HIVE-24961:
-

Author: ASF GitHub Bot
Created on: 08/Apr/21 05:41
Start Date: 08/Apr/21 05:41
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #2136:
URL: https://github.com/apache/hive/pull/2136


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 578918)
Time Spent: 50m  (was: 40m)

> Use database name as default in the mapred job name for replication
> ---
>
> Key: HIVE-24961
> URL: https://issues.apache.org/jira/browse/HIVE-24961
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Use the database name as the job name for replication.





[jira] [Assigned] (HIVE-24992) Incremental rebuild of MV having aggregate in presence of delete operation

2021-04-07 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-24992:
-


> Incremental rebuild of MV having aggregate in presence of delete operation
> --
>
> Key: HIVE-24992
> URL: https://issues.apache.org/jira/browse/HIVE-24992
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> Extension of HIVE-24854: handle cases where the materialized view definition 
> contains an aggregation, such as:
> {code}
> CREATE MATERIALIZED VIEW cmv_mat_view_n5 DISABLE REWRITE
> TBLPROPERTIES ('transactional'='true') AS
>   SELECT cmv_basetable_n5.a, cmv_basetable_2_n2.c, sum(cmv_basetable_2_n2.d)
>   FROM cmv_basetable_n5
>   JOIN cmv_basetable_2_n2 ON (cmv_basetable_n5.a = cmv_basetable_2_n2.a)
>   WHERE cmv_basetable_2_n2.c > 10.0
>   GROUP BY cmv_basetable_n5.a, cmv_basetable_2_n2.c;
> {code}
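
As a rough illustration of why delete operations complicate the incremental rebuild of an aggregating view like the one quoted above: a deleted source row has to be subtracted from its group's sum, and the group has to disappear once no rows contribute to it. The sketch below is generic Python, not Hive's implementation. The function name apply_delta and the idea of storing a per-group row count next to the sum are assumptions made for illustration only.

```python
def apply_delta(view, inserted, deleted):
    """Incrementally maintain a SUM(d) GROUP BY (a, c) result.

    view maps (a, c) -> [sum_d, row_count]; inserted and deleted are
    iterables of (a, c, d) rows produced by the join and filter.
    The row count is a hypothetical helper column that tells us when
    a group has lost all of its contributing rows.
    """
    for a, c, d in inserted:
        entry = view.setdefault((a, c), [0, 0])
        entry[0] += d
        entry[1] += 1
    for a, c, d in deleted:
        entry = view[(a, c)]
        entry[0] -= d
        entry[1] -= 1
        if entry[1] == 0:
            # No source rows remain, so the group must vanish from the view.
            del view[(a, c)]
    return view


if __name__ == "__main__":
    mv = {}
    apply_delta(mv, [(1, 20.0, 5), (1, 20.0, 7), (2, 30.0, 1)], [])
    apply_delta(mv, [], [(1, 20.0, 5), (2, 30.0, 1)])
    print(mv)
```

Without the row count, a group whose sum happens to reach zero could not be told apart from a group that has no rows left, which is one reason the aggregate case needs separate handling from the plain join case.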





[jira] [Updated] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-04-07 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-24991:
--
Fix Version/s: 4.0.0

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable this feature in the vectorized ORC batch 
> reader.





[jira] [Updated] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-04-07 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-24991:
--
Component/s: (was: ORC)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable this feature in the vectorized ORC batch 
> reader.





[jira] [Updated] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-04-07 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-24991:
--
Component/s: Vectorization
 ORC

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC, Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable this feature in the vectorized ORC batch 
> reader.





[jira] [Resolved] (HIVE-24976) CBO: count(distinct) in a window function fails CBO

2021-04-07 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-24976.
---
Resolution: Fixed

Pushed to master. Thanks [~jcamachorodriguez] for review.

> CBO: count(distinct) in a window function fails CBO
> ---
>
> Key: HIVE-24976
> URL: https://issues.apache.org/jira/browse/HIVE-24976
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> create temporary table tmp_tbl(
> `rule_id` string,
> `severity` string,
> `alert_id` string,
> `alert_type` string);
> explain cbo
> select `k`.`rule_id`,
> count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
> from tmp_tbl k
> ;
> explain
> select `k`.`rule_id`,
> count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
> from tmp_tbl k
> ;
> {code}
> CBO fails because the count(distinct) is not recognized as belonging 
> to a windowing operation, so it throws the following exception:
> {code}
> throw new CalciteSemanticException("Distinct without an aggregation.",
> UnsupportedFeature.Distinct_without_an_aggreggation);
> {code}
> https://github.com/apache/hive/blob/73c3770d858b063c69dea6c64a759f8fdacad460/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L4914
> This prevents a query like this from using a materialized view which already 
> exists in the system (the MV obviously does not contain this expression, but 
> represents a complex transform from a JSON structure into a columnar layout).





[jira] [Work logged] (HIVE-24976) CBO: count(distinct) in a window function fails CBO

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24976?focusedWorklogId=578885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578885
 ]

ASF GitHub Bot logged work on HIVE-24976:
-

Author: ASF GitHub Bot
Created on: 08/Apr/21 04:44
Start Date: 08/Apr/21 04:44
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #2155:
URL: https://github.com/apache/hive/pull/2155


   




Issue Time Tracking
---

Worklog Id: (was: 578885)
Time Spent: 20m  (was: 10m)

> CBO: count(distinct) in a window function fails CBO
> ---
>
> Key: HIVE-24976
> URL: https://issues.apache.org/jira/browse/HIVE-24976
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> create temporary table tmp_tbl(
> `rule_id` string,
> `severity` string,
> `alert_id` string,
> `alert_type` string);
> explain cbo
> select `k`.`rule_id`,
> count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
> from tmp_tbl k
> ;
> explain
> select `k`.`rule_id`,
> count(distinct `k`.`alert_id`) over(partition by `k`.`rule_id`) `subj_cnt`
> from tmp_tbl k
> ;
> {code}
> CBO fails because the count(distinct) is not recognized as belonging 
> to a windowing operation, so it throws the following exception:
> {code}
> throw new CalciteSemanticException("Distinct without an aggregation.",
> UnsupportedFeature.Distinct_without_an_aggreggation);
> {code}
> https://github.com/apache/hive/blob/73c3770d858b063c69dea6c64a759f8fdacad460/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L4914
> This prevents a query like this from using a materialized view which already 
> exists in the system (the MV obviously does not contain this expression, but 
> represents a complex transform from a JSON structure into a columnar layout).





[jira] [Comment Edited] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316825#comment-17316825
 ] 

Eva Mariam edited comment on HIVE-6013 at 4/8/21, 3:10 AM:
---

Tables with special characters in column names are created,
but data is not fetched for views created over Delta tables that have 
columns with special characters.

Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';

create view test_vw as select * from test;
{code}

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: 
org.apache.spark.sql.AnalysisException: Attribute with name 'sta?s_mpkt' is not 
found in '(stats_broadcastpkt,sta”s_mpkt)';;


was (Author: evamariam):
Tables with special characters in column names are created,
but views cannot be created over Delta tables that have columns with special 
characters.

Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';

create view test-vw as select * from test;
{code}

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x94s_mu...' for 
column 'PARAM_VALUE' at row 1
Query is: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES 
(?,?,?) ,
 

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Comment Edited] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316825#comment-17316825
 ] 

Eva Mariam edited comment on HIVE-6013 at 4/8/21, 3:07 AM:
---

Tables with special characters in column names are created,
but views cannot be created over Delta tables that have columns with special 
characters.

Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';

create view test-vw as select * from test;
{code}

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x94s_mu...' for 
column 'PARAM_VALUE' at row 1
Query is: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES 
(?,?,?) ,
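
A note on the bytes in the error above: '\xC2\x94' is the UTF-8 encoding of U+0094, a C1 control character, and not of the right double quotation mark (U+201D, UTF-8 E2 80 9D) that appears in the column name. One possible explanation, which is an assumption and is not confirmed anywhere in this thread, is that the quote went through a Windows-1252 to Latin-1 round trip before it reached the metastore. A minimal Python sketch of that round trip follows; the function name mojibake_bytes is made up for illustration.

```python
# Hypothetical reconstruction of how U+201D could end up as the bytes C2 94.
RIGHT_DOUBLE_QUOTE = "\u201d"  # the special character in the column name


def mojibake_bytes(ch: str) -> bytes:
    """Encode as Windows-1252, misread as Latin-1, then re-encode as UTF-8."""
    cp1252_bytes = ch.encode("cp1252")        # single byte 0x94
    misread = cp1252_bytes.decode("latin-1")  # becomes control character U+0094
    return misread.encode("utf-8")            # two bytes: 0xC2 0x94


if __name__ == "__main__":
    print(mojibake_bytes(RIGHT_DOUBLE_QUOTE))   # bytes after the round trip
    print(RIGHT_DOUBLE_QUOTE.encode("utf-8"))   # bytes of a clean UTF-8 encoding
```

If the two printed values differ in a given environment the same way they do here, checking the client and metastore database character sets would be a reasonable next step.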
 


was (Author: evamariam):
Tables are not being created.
Also, views cannot be created over Delta tables that have columns with special 
characters.

Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
set hive.support.quoted.identifiers=column;

SET spark.sql.parser.quotedRegexColumnNames=true

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';
{code}

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x94s_mu...' for 
column 'PARAM_VALUE' at row 1
Query is: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES 
(?,?,?) ,
 

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Comment Edited] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316825#comment-17316825
 ] 

Eva Mariam edited comment on HIVE-6013 at 4/8/21, 2:54 AM:
---

Tables are not being created.
Also, views cannot be created over Delta tables that have columns with special 
characters.

Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
set hive.support.quoted.identifiers=column;

SET spark.sql.parser.quotedRegexColumnNames=true

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';
{code}

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x94s_mu...' for 
column 'PARAM_VALUE' at row 1
Query is: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES 
(?,?,?) ,
 


was (Author: evamariam):
Views cannot be created over Delta tables that have columns with special 
characters.
Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
set hive.support.quoted.identifiers=column;

SET spark.sql.parser.quotedRegexColumnNames=true

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';
{code}

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x94s_mu...' for 
column 'PARAM_VALUE' at row 1
Query is: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES 
(?,?,?) ,
 

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Comment Edited] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316825#comment-17316825
 ] 

Eva Mariam edited comment on HIVE-6013 at 4/8/21, 2:53 AM:
---

Views cannot be created over Delta tables that have columns with special 
characters.
Hive: 0.13
Databricks Runtime: 7.5 
in Azure

{code:java}
set hive.support.quoted.identifiers=column;

SET spark.sql.parser.quotedRegexColumnNames=true

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string) using 
delta location '/mnt/table';
{code}

Caused by: java.sql.SQLException: Incorrect string value: '\xC2\x94s_mu...' for 
column 'PARAM_VALUE' at row 1
Query is: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES 
(?,?,?) ,
 


was (Author: evamariam):
Views cannot be created over tables having columns with special characters

 set hive.support.quoted.identifiers=column;

SET spark.sql.parser.quotedRegexColumnNames=true

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string);

 

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Updated] (HIVE-24989) Support vectorisation of join with key columns of complex types

2021-04-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24989:
---
Description: 
Support for complex types is not present in VectorColumnSetInfo.addKey.
{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column vector type LIST
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnSetInfo.addKey(VectorColumnSetInfo.java:138)
 at org.apache.hadoop.hive.ql.exec.vector.wrapper.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:913)
 at org.apache.hadoop.hive.ql.exec.vector.wrapper.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:894)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.initializeOp(VectorMapJoinOperator.java:137)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
 at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:332)
  {code}

  was:Hive fails to execute joins on array type columns as the comparison 
functions are not able to handle array type columns.   


> Support vectorisation of join with key columns of complex types
> ---
>
> Key: HIVE-24989
> URL: https://issues.apache.org/jira/browse/HIVE-24989
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Support for complex types is not present in VectorColumnSetInfo.addKey.
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column vector type LIST
>  at org.apache.hadoop.hive.ql.exec.vector.VectorColumnSetInfo.addKey(VectorColumnSetInfo.java:138)
>  at org.apache.hadoop.hive.ql.exec.vector.wrapper.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:913)
>  at org.apache.hadoop.hive.ql.exec.vector.wrapper.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:894)
>  at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.initializeOp(VectorMapJoinOperator.java:137)
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549)
>  at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503)
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369)
>  at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:332)
>   {code}





[jira] [Assigned] (HIVE-24989) Support vectorisation of join with key columns of complex types

2021-04-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-24989:
--


> Support vectorisation of join with key columns of complex types
> ---
>
> Key: HIVE-24989
> URL: https://issues.apache.org/jira/browse/HIVE-24989
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array type columns.   





[jira] [Comment Edited] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316825#comment-17316825
 ] 

Eva Mariam edited comment on HIVE-6013 at 4/8/21, 2:22 AM:
---

Views cannot be created over tables having columns with special characters

 set hive.support.quoted.identifiers=column;

SET spark.sql.parser.quotedRegexColumnNames=true

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string);

 


was (Author: evamariam):
Views cannot be created over tables having columns with special characters

 

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string);

 

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Issue Comment Deleted] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eva Mariam updated HIVE-6013:
-
Comment: was deleted

(was: Views cannot be created over tables having columns with special 
characters.)

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Reopened] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eva Mariam reopened HIVE-6013:
--

Views cannot be created over tables having columns with special characters

 

create table test (`stats_broadcastpkt` string, `sta”s_mpkt` string);

 

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). I have 
> documented the current behavior and proposed a solution in the attached doc.
> Summary of the solution:
> - Introduce 'standard' quoted identifiers for columns only.
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Updated] (HIVE-24988) Add support for complex types columns for Dynamic Partition pruning Optimisation

2021-04-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24988:
---
Description: DynamicPartitionPruningOptimization fails for complex types.   
 (was: Hive fails to execute joins on array type columns as the comparison 
functions are not able to handle array type columns.   )

> Add support for complex types columns for Dynamic Partition pruning 
> Optimisation
> 
>
> Key: HIVE-24988
> URL: https://issues.apache.org/jira/browse/HIVE-24988
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> DynamicPartitionPruningOptimization fails for complex types.  





[jira] [Assigned] (HIVE-24988) Add support for complex types columns for Dynamic Partition pruning Optimisation

2021-04-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-24988:
--


> Add support for complex types columns for Dynamic Partition pruning 
> Optimisation
> 
>
> Key: HIVE-24988
> URL: https://issues.apache.org/jira/browse/HIVE-24988
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array type columns.   





[jira] [Commented] (HIVE-6013) Supporting Quoted Identifiers in Column Names

2021-04-07 Thread Eva Mariam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316821#comment-17316821
 ] 

Eva Mariam commented on HIVE-6013:
--

Views cannot be created over tables having columns with special characters.

> Supporting Quoted Identifiers in Column Names
> -
>
> Key: HIVE-6013
> URL: https://issues.apache.org/jira/browse/HIVE-6013
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Harish Butani
>Assignee: Harish Butani
>Priority: Major
> Fix For: 0.13.0
>
> Attachments: HIVE-6013.1.patch, HIVE-6013.2.patch, HIVE-6013.3.patch, 
> HIVE-6013.4.patch, HIVE-6013.5.patch, HIVE-6013.6.patch, HIVE-6013.7.patch, 
> QuotedIdentifier.html
>
>
> Hive's current behavior on Quoted Identifiers is different from the normal 
> interpretation. Quoted Identifier (using backticks) has a special 
> interpretation for Select expressions (as Regular Expressions). Have 
> documented current behavior and proposed a solution in attached doc.
> Summary of solution is:
> - Introduce 'standard' quoted identifiers for columns only. 
> - At the language level this is turned on by a flag.
> - At the metadata level we relax the constraint on column names.





[jira] [Updated] (HIVE-24883) Add support for complex types columns in Hive Joins

2021-04-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24883:
---
Summary: Add support for complex types columns in Hive Joins  (was: Add 
support for array type columns in Hive Joins)

> Add support for complex types columns in Hive Joins
> ---
>
> Key: HIVE-24883
> URL: https://issues.apache.org/jira/browse/HIVE-24883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive fails to execute joins on array type columns as the comparison functions 
> are not able to handle array type columns.   





[jira] [Work logged] (HIVE-23820) [HS2] Send tableId in request for get_table_request API

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23820?focusedWorklogId=578692=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578692
 ]

ASF GitHub Bot logged work on HIVE-23820:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 21:30
Start Date: 07/Apr/21 21:30
Worklog Time Spent: 10m 
  Work Description: kishendas commented on pull request #2153:
URL: https://github.com/apache/hive/pull/2153#issuecomment-815277837


   @ashish-kumar-sharma Are you planning to create one more PR to actually 
store the tableId in the session and to send the tableId in the request from 
the current session?
   For example, I don't see any changes in the "public Table getTable(final String 
dbName, final String tableName, boolean throwException, boolean 
checkTransactional, boolean getColumnStats)" method in Hive.java, where you 
might want to include the session-cached tableId in the request. Also, I don't 
see how you are storing the tableId in the session.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 578692)
Time Spent: 50m  (was: 40m)

> [HS2] Send tableId in request for get_table_request API
> ---
>
> Key: HIVE-23820
> URL: https://issues.apache.org/jira/browse/HIVE-23820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24962) Enable partition pruning for Iceberg tables

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24962?focusedWorklogId=578567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578567
 ]

ASF GitHub Bot logged work on HIVE-24962:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 19:03
Start Date: 07/Apr/21 19:03
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2137:
URL: https://github.com/apache/hive/pull/2137#issuecomment-815153195


   @jcamachor, @kgyrtkirk: The latest test failure seems flaky. Since I am not 
very familiar with this part of the code, I would appreciate it if someone with 
more experience around compilation/execution could review.
   
   Thanks,
   Peter




Issue Time Tracking
---

Worklog Id: (was: 578567)
Time Spent: 20m  (was: 10m)

> Enable partition pruning for Iceberg tables
> ---
>
> Key: HIVE-24962
> URL: https://issues.apache.org/jira/browse/HIVE-24962
> Project: Hive
>  Issue Type: Improvement
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should enable partition pruning above iceberg tables





[jira] [Work logged] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23571?focusedWorklogId=578515=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578515
 ]

ASF GitHub Bot logged work on HIVE-23571:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 17:56
Start Date: 07/Apr/21 17:56
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #2128:
URL: https://github.com/apache/hive/pull/2128#discussion_r608870539



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -2476,6 +2499,25 @@ public boolean isTableConstraintValid(String catName, String dbName, String tblN
 return isValid;
   }
 
+  public boolean isTableCacheStale(String catName, String dbName, String tblName, String validWriteIdList) {
+boolean isValid = false;

Review comment:
   nit: rename the variable to `isStale` instead? Below we are setting 
isValid to true when the writeIdList in CachedStore is stale/older than the 
client's.






Issue Time Tracking
---

Worklog Id: (was: 578515)
Time Spent: 40m  (was: 0.5h)

> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 





[jira] [Resolved] (HIVE-24912) Support to add repl.target.for property during incremental run

2021-04-07 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha resolved HIVE-24912.
-
Resolution: Fixed

Committed to master. Thanks for the patch, [~haymant]!

> Support to add repl.target.for property during incremental run
> --
>
> Key: HIVE-24912
> URL: https://issues.apache.org/jira/browse/HIVE-24912
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24912) Support to add repl.target.for property during incremental run

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24912?focusedWorklogId=578485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578485
 ]

ASF GitHub Bot logged work on HIVE-24912:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 17:09
Start Date: 07/Apr/21 17:09
Worklog Time Spent: 10m 
  Work Description: pkumarsinha merged pull request #2092:
URL: https://github.com/apache/hive/pull/2092


   




Issue Time Tracking
---

Worklog Id: (was: 578485)
Time Spent: 2h 20m  (was: 2h 10m)

> Support to add repl.target.for property during incremental run
> --
>
> Key: HIVE-24912
> URL: https://issues.apache.org/jira/browse/HIVE-24912
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23571?focusedWorklogId=578477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578477
 ]

ASF GitHub Bot logged work on HIVE-23571:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 17:06
Start Date: 07/Apr/21 17:06
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #2128:
URL: https://github.com/apache/hive/pull/2128#discussion_r608511275



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -2476,6 +2499,25 @@ public boolean isTableConstraintValid(String catName, String dbName, String tblN
 return isValid;
   }
 
+  public boolean isTableCacheStale(String catName, String dbName, String tblName, String validWriteIdList) {
+boolean isValid = false;
+
+if (StringUtils.isEmpty(validWriteIdList))
+  return false;

Review comment:
   nit: add braces.
   
   Also, should we return true? If validWriteId is not present, we can assume 
that the table is not loaded properly and that it should be fetched from the 
metastore db.
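To make the two review suggestions on this method concrete (rename the flag to `isStale`, add braces, and treat a missing write-id list as stale), here is a minimal sketch. It is not the code from the pull request: the class name is hypothetical, and each write-id list is reduced to a single high watermark purely for illustration, since the real comparison logic is not shown in this thread.

```java
final class CacheStalenessSketch {

  // Hypothetical simplification: a write-id list is represented only by its
  // high watermark; null stands for "no write-id list available".
  static boolean isTableCacheStale(Long cachedHighWatermark, Long clientHighWatermark) {
    if (clientHighWatermark == null) {
      // Suggested behaviour: without a write-id list we cannot trust the
      // cache, so report it as stale and let the caller reload.
      return true;
    }
    if (cachedHighWatermark == null) {
      // Nothing usable is cached for this table yet.
      return true;
    }
    // The flag is named after what the method returns.
    boolean isStale = cachedHighWatermark < clientHighWatermark;
    return isStale;
  }

  public static void main(String[] args) {
    System.out.println(isTableCacheStale(5L, 7L));   // cache is behind the client
    System.out.println(isTableCacheStale(7L, 7L));   // cache is current
    System.out.println(isTableCacheStale(7L, null)); // no client write-id list
  }
}
```

Whether a missing write-id list should really mean "stale" is the open question raised in the comment; the sketch only shows what that choice would look like.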






Issue Time Tracking
---

Worklog Id: (was: 578477)
Time Spent: 0.5h  (was: 20m)

> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 





[jira] [Commented] (HIVE-24912) Support to add repl.target.for property during incremental run

2021-04-07 Thread Pravin Sinha (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316496#comment-17316496
 ] 

Pravin Sinha commented on HIVE-24912:
-

+1

> Support to add repl.target.for property during incremental run
> --
>
> Key: HIVE-24912
> URL: https://issues.apache.org/jira/browse/HIVE-24912
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578437=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578437
 ]

ASF GitHub Bot logged work on HIVE-24906:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 16:06
Start Date: 07/Apr/21 16:06
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2089:
URL: https://github.com/apache/hive/pull/2089#discussion_r608793866



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req)
   }
 
   if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) {
-if (tbl.getSd().getLocation() == null
-|| tbl.getSd().getLocation().isEmpty()) {
-  tblPath = wh.getDefaultTablePath(db, tbl);
+if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) {
+  String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : "");
+  tblPath = wh.getDefaultTablePath(db, relPath, isExternal(tbl));

Review comment:
   Can't we have a variation of this method that accepts a full "table" and 
decides both the path and whether the table is external based on that, so that 
we don't spread the above line everywhere this method is invoked? (I'm also 
working on a patch which invokes this method.)
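As an illustration of the suggestion above, a single entry point could take the whole table and derive both the relative path and the managed/external root itself. The sketch below is self-contained and deliberately avoids the real metastore classes: `TableInfo`, `defaultTablePath`, and the two root parameters are hypothetical stand-ins, and only the `_v<txnId>` suffix rule is taken from the diff.

```java
final class TablePathSketch {

  // Hypothetical stand-in for the metastore Table object.
  static final class TableInfo {
    final String name;
    final Long txnId;       // null when no txnId is set
    final boolean external;

    TableInfo(String name, Long txnId, boolean external) {
      this.name = name;
      this.txnId = txnId;
      this.external = external;
    }
  }

  // Same suffix rule as in the diff: append "_v" + txnId only when it is set.
  static String relativePath(TableInfo tbl) {
    return tbl.name + (tbl.txnId != null ? "_v" + tbl.txnId : "");
  }

  // One place decides the root and the suffix, so callers just pass the table.
  static String defaultTablePath(String managedRoot, String externalRoot,
                                 String dbName, TableInfo tbl) {
    String root = tbl.external ? externalRoot : managedRoot;
    return root + "/" + dbName + ".db/" + relativePath(tbl);
  }

  public static void main(String[] args) {
    TableInfo managed = new TableInfo("orders", 42L, false);
    TableInfo external = new TableInfo("orders", null, true);
    System.out.println(defaultTablePath("/wh/managed", "/wh/external", "sales", managed));
    System.out.println(defaultTablePath("/wh/managed", "/wh/external", "sales", external));
  }
}
```

With this shape, call sites no longer need to repeat the suffix expression or the external check.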

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req)
   }
 
   if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) {

Review comment:
   this conditional doesn't seem to only match "managed" tables - could 
that cause any issues?
   
   with the "translator" in place I think we are mangling the location at too 
many places...






Issue Time Tracking
---

Worklog Id: (was: 578437)
Time Spent: 1h  (was: 50m)

> Suffix the table location with UUID/txnId
> -
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in 
> deleting the data in asynchronous fashion.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24406) HiveConf creation in HiveMetastore's getMS() call adds ~100ms latency

2021-04-07 Thread Sahana Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316450#comment-17316450
 ] 

Sahana Bhat commented on HIVE-24406:


The latest Hive version doesn't create the extra HiveConf object. Instead, a 
Configuration object is initialised from the thread-local conf variable, so the 
added latency seen in Hive 2.3.4 is not an issue in the latest version. Closing 
this ticket as a no-op.
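The general effect described here, building an expensive configuration object on every call versus reusing a per-thread instance, can be sketched as follows. This is not Hive code: `ExpensiveConf` is a hypothetical stand-in for a costly-to-construct configuration object, and the counter exists only to make the difference observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadLocalConfSketch {

  // Counts how many times the (hypothetical) expensive object is built.
  static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

  static final class ExpensiveConf {
    ExpensiveConf() {
      CONSTRUCTIONS.incrementAndGet();
    }
  }

  // One instance per thread, created lazily on first access.
  private static final ThreadLocal<ExpensiveConf> THREAD_CONF =
      ThreadLocal.withInitial(ExpensiveConf::new);

  static ExpensiveConf newPerCall() {
    return new ExpensiveConf();
  }

  static ExpensiveConf reuseThreadLocal() {
    return THREAD_CONF.get();
  }

  // Returns how many constructions the given number of calls triggered.
  static int constructionsFor(boolean reuse, int calls) {
    THREAD_CONF.remove(); // start from a clean thread-local state
    int before = CONSTRUCTIONS.get();
    for (int i = 0; i < calls; i++) {
      if (reuse) {
        reuseThreadLocal();
      } else {
        newPerCall();
      }
    }
    return CONSTRUCTIONS.get() - before;
  }

  public static void main(String[] args) {
    System.out.println("per call:     " + constructionsFor(false, 100));
    System.out.println("thread-local: " + constructionsFor(true, 100));
  }
}
```

On a single thread, the per-call variant constructs one object per invocation, while the thread-local variant constructs at most one.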

> HiveConf creation in HiveMetastore's getMS() call adds ~100ms latency
> -
>
> Key: HIVE-24406
> URL: https://issues.apache.org/jira/browse/HIVE-24406
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.4
>Reporter: Sahana Bhat
>Assignee: Sahana Bhat
>Priority: Major
> Attachments: Screenshot 2020-11-17 at 5.17.50 PM.png
>
>
> The changes to HMSHandler's getMSForConf introduced in the commit 
> [https://github.com/apache/hive/commit/9a47cf9f92d1c8a4e72890e3dfe2d9567f12bfb5]
>  make getMSForConf and newRawStoreForConf static and hence add an 
> additional HiveConf object creation step for every metastore connection 
> created.
> In a client like Presto, which creates a new metastore connection for every 
> query, the object creation rate shoots up leading to higher heap usage, more 
> frequent garbage collection, higher garbage collection times and increased 
> latency in the service. We noticed a constant ~100ms increase in latency of 
> all metastore calls during migration of HMS from 1x to 2x for Presto.
> PFA the latency difference of a get_table call between 1x and 2x. Min of 22ms 
> in 1x vs 103ms in 2x
> !Screenshot 2020-11-17 at 5.17.50 PM.png|width=591,height=222!
>  
>  





[jira] [Resolved] (HIVE-24406) HiveConf creation in HiveMetastore's getMS() call adds ~100ms latency

2021-04-07 Thread Sahana Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahana Bhat resolved HIVE-24406.

Resolution: Invalid

> HiveConf creation in HiveMetastore's getMS() call adds ~100ms latency
> -
>
> Key: HIVE-24406
> URL: https://issues.apache.org/jira/browse/HIVE-24406
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.4
>Reporter: Sahana Bhat
>Assignee: Sahana Bhat
>Priority: Major
> Attachments: Screenshot 2020-11-17 at 5.17.50 PM.png
>
>
> The changes to HMSHandler's getMSForConf introduced in the commit 
> [https://github.com/apache/hive/commit/9a47cf9f92d1c8a4e72890e3dfe2d9567f12bfb5]
>  make getMSForConf and newRawStoreForConf static and hence add an 
> additional HiveConf object creation step for every metastore connection 
> created.
> In a client like Presto, which creates a new metastore connection for every 
> query, the object creation rate shoots up leading to higher heap usage, more 
> frequent garbage collection, higher garbage collection times and increased 
> latency in the service. We noticed a constant ~100ms increase in latency of 
> all metastore calls during migration of HMS from 1x to 2x for Presto.
> PFA the latency difference of a get_table call between 1x and 2x. Min of 22ms 
> in 1x vs 103ms in 2x
> !Screenshot 2020-11-17 at 5.17.50 PM.png|width=591,height=222!
>  
>  





[jira] [Updated] (HIVE-24986) Support aggregates on columns present in rollups

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24986:
--
Labels: pull-request-available  (was: )

> Support aggregates on columns present in rollups
> 
>
> Key: HIVE-24986
> URL: https://issues.apache.org/jira/browse/HIVE-24986
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
> SELECT key, value, count(key) FROM src GROUP BY key, value with rollup;
> {code}





[jira] [Work logged] (HIVE-24986) Support aggregates on columns present in rollups

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24986?focusedWorklogId=578387=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578387
 ]

ASF GitHub Bot logged work on HIVE-24986:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 14:32
Start Date: 07/Apr/21 14:32
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #2159:
URL: https://github.com/apache/hive/pull/2159


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 578387)
Remaining Estimate: 0h
Time Spent: 10m

> Support aggregates on columns present in rollups
> 
>
> Key: HIVE-24986
> URL: https://issues.apache.org/jira/browse/HIVE-24986
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
> SELECT key, value, count(key) FROM src GROUP BY key, value with rollup;
> {code}





[jira] [Commented] (HIVE-24986) Support aggregates on columns present in rollups

2021-04-07 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316386#comment-17316386
 ] 

Zoltan Haindrich commented on HIVE-24986:
-

Rollup/cube support was implemented back in HIVE-3433, but some conditionals 
were added to protect against the grouping column appearing in aggregates.

After I removed the checks, it worked without issues. I looked around for what 
could have motivated them, but I'm not sure.
I think in recent years we fixed some issues around empty-row rollups and 
probably other corner cases, so I guess we can now just allow it.
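For readers unfamiliar with what such a query has to produce, here is a small sketch of `count(key)` under `GROUP BY key, value WITH ROLLUP`, computed by hand in plain Java. It is an illustration of the SQL semantics only, not of Hive's implementation; the class and method names are hypothetical. Rolled-up columns are shown as null, which means a genuinely null column value would be indistinguishable from a rolled-up one in this simplified output.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RollupCountSketch {

  // Each row is {key, value}. The result maps a grouping (key, value) to
  // count(key); a null entry marks a column that has been rolled up.
  static Map<List<String>, Long> rollupCountKey(List<String[]> rows) {
    Map<List<String>, Long> counts = new LinkedHashMap<>();
    for (String[] row : rows) {
      String key = row[0];
      String value = row[1];
      // count(key) counts only rows whose key is not null.
      long increment = (key != null) ? 1L : 0L;

      // ROLLUP over (key, value) yields three grouping sets.
      List<String> detail = Arrays.asList(key, value);
      List<String> perKey = Arrays.asList(key, null);
      List<String> grandTotal = Arrays.asList(null, null);

      counts.merge(detail, increment, Long::sum);
      counts.merge(perKey, increment, Long::sum);
      counts.merge(grandTotal, increment, Long::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String[]> rows = Arrays.asList(
        new String[] {"a", "x"},
        new String[] {"a", "y"},
        new String[] {"b", "x"});
    rollupCountKey(rows).forEach((group, count) -> System.out.println(group + " -> " + count));
  }
}
```

Here the aggregated column `key` is also a grouping column, which is the case the removed checks used to reject.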

> Support aggregates on columns present in rollups
> 
>
> Key: HIVE-24986
> URL: https://issues.apache.org/jira/browse/HIVE-24986
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> {code}
> SELECT key, value, count(key) FROM src GROUP BY key, value with rollup;
> {code}





[jira] [Assigned] (HIVE-24986) Support aggregates on columns present in rollups

2021-04-07 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-24986:
---


> Support aggregates on columns present in rollups
> 
>
> Key: HIVE-24986
> URL: https://issues.apache.org/jira/browse/HIVE-24986
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> {code}
> SELECT key, value, count(key) FROM src GROUP BY key, value with rollup;
> {code}





[jira] [Work logged] (HIVE-24985) Create new metrics about locks

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?focusedWorklogId=578362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578362
 ]

ASF GitHub Bot logged work on HIVE-24985:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 13:33
Start Date: 07/Apr/21 13:33
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #2158:
URL: https://github.com/apache/hive/pull/2158#discussion_r608657504



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java
##
@@ -86,22 +88,21 @@ private void collectMetrics() throws MetaException {
 
   private void updateDBMetrics() throws MetaException {
 MetricsInfo metrics = txnHandler.getMetricsInfo();
-Metrics.getOrCreateGauge(MetricsConstants.COMPACTION_STATUS_PREFIX + "txn_to_writeid").set(
-metrics.getTxnToWriteIdCount());
-Metrics.getOrCreateGauge(MetricsConstants.COMPACTION_STATUS_PREFIX + "completed_txn_components").set(
-metrics.getCompletedTxnsCount());
-Metrics.getOrCreateGauge(MetricsConstants.COMPACTION_STATUS_PREFIX + "open_txn").set(
-metrics.getOpenTxnsCount());
-Metrics.getOrCreateGauge(MetricsConstants.OLDEST_OPEN_TXN_ID ).set(
-  metrics.getOldestOpenTxnId());
-Metrics.getOrCreateGauge(MetricsConstants.OLDEST_OPEN_TXN_AGE ).set(
-metrics.getOldestOpenTxnAge());
-Metrics.getOrCreateGauge(MetricsConstants.NUM_ABORTED_TXNS).set(
-metrics.getAbortedTxnsCount());
-Metrics.getOrCreateGauge(MetricsConstants.OLDEST_ABORTED_TXN_ID).set(
-metrics.getOldestAbortedTxnId());
-Metrics.getOrCreateGauge(MetricsConstants.OLDEST_ABORTED_TXN_AGE).set(
-metrics.getOldestAbortedTxnAge());
+Metrics.getOrCreateGauge(NUM_TXN_TO_WRITEID).set(metrics.getTxnToWriteIdCount());
+Metrics.getOrCreateGauge(NUM_COMPLETED_TXN_COMPONENTS).set(metrics.getCompletedTxnsCount());
+
+// NOTE: AcidOpenTxnsCounterService has a duplicate countOpenTxns() functionality and could be disabled.
+// PS: make sure to update `numOpenTxns` counter in TxnHandler.
+Metrics.getOrCreateGauge(NUM_OPEN_TXNS).set(metrics.getOpenTxnsCount());
+Metrics.getOrCreateGauge(OLDEST_OPEN_TXN_ID).set(metrics.getOldestOpenTxnId());
+Metrics.getOrCreateGauge(OLDEST_OPEN_TXN_AGE).set(metrics.getOldestOpenTxnAge());
+
+Metrics.getOrCreateGauge(NUM_ABORTED_TXNS).set(metrics.getAbortedTxnsCount());
+Metrics.getOrCreateGauge(OLDEST_ABORTED_TXN_ID).set(metrics.getOldestAbortedTxnId());
+Metrics.getOrCreateGauge(OLDEST_ABORTED_TXN_AGE).set(metrics.getOldestAbortedTxnAge());
+
+Metrics.getOrCreateGauge(NUM_LOCKS).set(metrics.getLocksCount());
+Metrics.getOrCreateGauge(OLDEST_LOCK_AGE).set(metrics.getOldestLockAge());

Review comment:
   Since these metrics aren't connected to compaction (they apply to 
non-acid tables) is there some other place we could collect them?






Issue Time Tracking
---

Worklog Id: (was: 578362)
Time Spent: 20m  (was: 10m)

> Create new metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age





[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578345=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578345
 ]

ASF GitHub Bot logged work on HIVE-24906:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 13:04
Start Date: 07/Apr/21 13:04
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2089:
URL: https://github.com/apache/hive/pull/2089#discussion_r608632826



##
File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
##
@@ -593,8 +593,8 @@ struct Table {
   22: optional byte accessType,
   23: optional list requiredReadCapabilities,
   24: optional list requiredWriteCapabilities
-  25: optional i64 id, // id of the table. It will be ignored if set. It's only for
-// read purposed
+  25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes
+  26: optional i64 txnId,  // txnId associated with the table creation

Review comment:
   Quick question: why not do the whole thing on the client side, without 
the HMS API change?






Issue Time Tracking
---

Worklog Id: (was: 578345)
Time Spent: 50m  (was: 40m)

> Suffix the table location with UUID/txnId
> -
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in 
> deleting the data in asynchronous fashion.





[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578344
 ]

ASF GitHub Bot logged work on HIVE-24906:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 13:03
Start Date: 07/Apr/21 13:03
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2089:
URL: https://github.com/apache/hive/pull/2089#discussion_r608632190



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java
##
@@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req)
   }
 
   if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) {
-if (tbl.getSd().getLocation() == null
-|| tbl.getSd().getLocation().isEmpty()) {
-  tblPath = wh.getDefaultTablePath(db, tbl);
+if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) {
+  String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : "");

Review comment:
   What does the new location look like?
   ```
   /.db/_v
   ```
   
   This might get mixed up if there are old tables, or tables created with 
`hive.txn.nonblocking.droptable.enabled=false` and the name ends with _v1234. 
Small chance, but...
   
   What about using directories instead, like:
   ```
   /.db//v_
   ```
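The name clash mentioned above can be shown with a tiny sketch. The helper names and the example table names are hypothetical; the suffix rule mirrors the `relPath` expression from the diff, and the directory variant mirrors the alternative proposed in this comment.

```java
final class SuffixCollisionSketch {

  // Suffix scheme from the diff: tableName + "_v" + txnId when a txnId is set.
  static String suffixed(String tableName, Long txnId) {
    return tableName + (txnId != null ? "_v" + txnId : "");
  }

  // Directory scheme proposed in the comment: tableName/v_txnId.
  static String nested(String tableName, Long txnId) {
    return txnId != null ? tableName + "/v_" + txnId : tableName;
  }

  public static void main(String[] args) {
    // An old table literally named "orders_v1234" (no txnId) versus a new
    // table "orders" created in transaction 1234.
    System.out.println(suffixed("orders_v1234", null) + " vs " + suffixed("orders", 1234L));
    System.out.println(nested("orders_v1234", null) + " vs " + nested("orders", 1234L));
  }
}
```

Under the suffix scheme both tables resolve to the same relative path, while the directory scheme keeps them apart.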
   






Issue Time Tracking
---

Worklog Id: (was: 578344)
Time Spent: 40m  (was: 0.5h)

> Suffix the table location with UUID/txnId
> -
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in 
> deleting the data in asynchronous fashion.





[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578339=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578339
 ]

ASF GitHub Bot logged work on HIVE-24906:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 12:56
Start Date: 07/Apr/21 12:56
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2089:
URL: https://github.com/apache/hive/pull/2089#discussion_r608627251



##
File path: ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java
##
@@ -3239,4 +3247,72 @@ public void testFullTableReadLock() throws Exception {
 checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "tab_acid", 
null, locks);
 checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", 
"tab_not_acid", null, locks);
   }
-}
+
+  @Test
+  public void testNonBlockingDropAndReCreateTable() throws Exception {
+dropTable(new String[] {"tab_acid"});
+
+conf.setBoolVar(HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true);
+driver2.getConf().setBoolVar( 
HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true);
+
+driver.run("create table if not exists tab_acid (a int, b int) partitioned 
by (p string) " +
+  "stored as orc TBLPROPERTIES ('transactional'='true')");
+driver.run("insert into tab_acid partition(p) (a,b,p) 
values(1,2,'foo'),(3,4,'bar')");
+
+driver.compileAndRespond("select * from tab_acid");
+
+DbTxnManager txnMgr2 = (DbTxnManager) 
TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+swapTxnManager(txnMgr2);
+driver2.run("drop table if exists tab_acid");
+
+swapTxnManager(txnMgr);
+driver.run();
+
+FileSystem fs = FileSystem.get(conf);
+FileStatus[] stat = fs.listStatus(new 
Path(Paths.get("target/warehouse").toUri()),
+  p -> p.getName().startsWith("tab_acid_v"));
+if (1 == stat.length) {
+  Assert.fail("Table data was not removed from FS");
+}
+
+List res = new ArrayList();
+driver.getFetchTask().fetch(res);
+Assert.assertEquals("Non-empty resultset", 0, res.size());
+
+try {
+  driver.run("select * from tab_acid");
+} catch (CommandProcessorException ex) {
+  Assert.assertEquals(ErrorMsg.INVALID_TABLE.getErrorCode(), 
ex.getResponseCode());
+  
Assert.assertTrue(ex.getMessage().contains(ErrorMsg.INVALID_TABLE.getMsg("'tab_acid'")));
+}
+
+//re-create table with the same name
+driver.compileAndRespond("create table if not exists tab_acid (a int, b 
int) partitioned by (p string) " +
+  "stored as orc TBLPROPERTIES ('transactional'='true')");
+long txnId = txnMgr.getCurrentTxnId();
+driver.run();
+driver.run("insert into tab_acid partition(p) (a,b,p) 
values(1,2,'foo'),(3,4,'bar')");
+
+driver.run("select * from tab_acid ");
+res = new ArrayList();
+driver.getFetchTask().fetch(res);
+Assert.assertEquals("No records found", 2, res.size());

Review comment:
   nit: The message is slightly misleading, since we are not checking 
for an empty table here






Issue Time Tracking
---

Worklog Id: (was: 578339)
Time Spent: 0.5h  (was: 20m)

> Suffix the table location with UUID/txnId
> -
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in 
> deleting the data in asynchronous fashion.





[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578338=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578338
 ]

ASF GitHub Bot logged work on HIVE-24906:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 12:56
Start Date: 07/Apr/21 12:56
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2089:
URL: https://github.com/apache/hive/pull/2089#discussion_r608626658



##
File path: ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java
##
@@ -3239,4 +3247,72 @@ public void testFullTableReadLock() throws Exception {
 checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "tab_acid", 
null, locks);
 checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", 
"tab_not_acid", null, locks);
   }
-}
+
+  @Test
+  public void testNonBlockingDropAndReCreateTable() throws Exception {
+dropTable(new String[] {"tab_acid"});
+
+conf.setBoolVar(HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true);
+driver2.getConf().setBoolVar( 
HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true);
+
+driver.run("create table if not exists tab_acid (a int, b int) partitioned 
by (p string) " +
+  "stored as orc TBLPROPERTIES ('transactional'='true')");
+driver.run("insert into tab_acid partition(p) (a,b,p) 
values(1,2,'foo'),(3,4,'bar')");
+
+driver.compileAndRespond("select * from tab_acid");
+
+DbTxnManager txnMgr2 = (DbTxnManager) 
TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+swapTxnManager(txnMgr2);
+driver2.run("drop table if exists tab_acid");
+
+swapTxnManager(txnMgr);
+driver.run();
+
+FileSystem fs = FileSystem.get(conf);
+FileStatus[] stat = fs.listStatus(new 
Path(Paths.get("target/warehouse").toUri()),
+  p -> p.getName().startsWith("tab_acid_v"));
+if (1 == stat.length) {
+  Assert.fail("Table data was not removed from FS");
+}
+
+List res = new ArrayList();
+driver.getFetchTask().fetch(res);
+Assert.assertEquals("Non-empty resultset", 0, res.size());
+
+try {
+  driver.run("select * from tab_acid");
+} catch (CommandProcessorException ex) {
+  Assert.assertEquals(ErrorMsg.INVALID_TABLE.getErrorCode(), 
ex.getResponseCode());
+  
Assert.assertTrue(ex.getMessage().contains(ErrorMsg.INVALID_TABLE.getMsg("'tab_acid'")));
+}
+
+//re-create table with the same name
+driver.compileAndRespond("create table if not exists tab_acid (a int, b 
int) partitioned by (p string) " +

Review comment:
   I would remove `if not exists` from the SQL, so we can detect if there 
is a problem with the test






Issue Time Tracking
---

Worklog Id: (was: 578338)
Time Spent: 20m  (was: 10m)

> Suffix the table location with UUID/txnId
> -
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in 
> deleting the data in asynchronous fashion.





[jira] [Work logged] (HIVE-23820) [HS2] Send tableId in request for get_table_request API

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23820?focusedWorklogId=578320=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578320
 ]

ASF GitHub Bot logged work on HIVE-23820:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 12:30
Start Date: 07/Apr/21 12:30
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #2153:
URL: https://github.com/apache/hive/pull/2153#discussion_r608608310



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##
@@ -2367,10 +2367,21 @@ public Table getTable(String catName, String dbName, 
String tableName,
 return getTable(catName, dbName, tableName, validWriteIdList, false, null);
   }
 
+  @Override
+  public Table getTable(String catName, String dbName, String tableName,

Review comment:
   @kishendas  @adesh-rao Changed all method signatures to the request model






Issue Time Tracking
---

Worklog Id: (was: 578320)
Time Spent: 40m  (was: 0.5h)

> [HS2] Send tableId in request for get_table_request API
> ---
>
> Key: HIVE-23820
> URL: https://issues.apache.org/jira/browse/HIVE-23820
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24383) Add Table type to HPL/SQL

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24383?focusedWorklogId=578301=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578301
 ]

ASF GitHub Bot logged work on HIVE-24383:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 11:40
Start Date: 07/Apr/21 11:40
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #2130:
URL: https://github.com/apache/hive/pull/2130#discussion_r608575658



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
##
@@ -961,6 +987,22 @@ public Integer forRange(HplsqlParser.For_range_stmtContext 
ctx) {
 return 0; 
   }
 
+  public Integer unconditionalLoop(HplsqlParser.Unconditional_loop_stmtContext 
ctx) {
+trace(ctx, "UNCONDITIONAL LOOP - ENTERED");
+String label = exec.labelPop();
+while (true) {
+  exec.enterScope(Scope.Type.LOOP);
+  visit(ctx.block());
+  exec.leaveScope();
+  if (canContinue(label)) {
+continue;
+  }
+  break;

Review comment:
   Ok.

##
File path: hplsql/src/main/java/org/apache/hive/hplsql/Var.java
##
@@ -26,13 +26,14 @@
 
 import org.apache.hive.hplsql.executor.QueryResult;
 
+

Review comment:
   removed.






Issue Time Tracking
---

Worklog Id: (was: 578301)
Time Spent: 40m  (was: 0.5h)

> Add Table type to HPL/SQL
> -
>
> Key: HIVE-24383
> URL: https://issues.apache.org/jira/browse/HIVE-24383
> Project: Hive
>  Issue Type: Sub-task
>  Components: hpl/sql
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24383) Add Table type to HPL/SQL

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24383?focusedWorklogId=578302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578302
 ]

ASF GitHub Bot logged work on HIVE-24383:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 11:40
Start Date: 07/Apr/21 11:40
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #2130:
URL: https://github.com/apache/hive/pull/2130#discussion_r608575837



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/objects/HplClass.java
##
@@ -0,0 +1,24 @@
+/*
+ *  Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hive.hplsql.objects;
+
+public interface HplClass {
+  HplObject instantiate();

Review comment:
   That makes more sense. I renamed it.






Issue Time Tracking
---

Worklog Id: (was: 578302)
Time Spent: 50m  (was: 40m)

> Add Table type to HPL/SQL
> -
>
> Key: HIVE-24383
> URL: https://issues.apache.org/jira/browse/HIVE-24383
> Project: Hive
>  Issue Type: Sub-task
>  Components: hpl/sql
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24383) Add Table type to HPL/SQL

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24383?focusedWorklogId=578300=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578300
 ]

ASF GitHub Bot logged work on HIVE-24383:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 11:40
Start Date: 07/Apr/21 11:40
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #2130:
URL: https://github.com/apache/hive/pull/2130#discussion_r608575321



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/Select.java
##
@@ -87,37 +94,38 @@ public Integer select(HplsqlParser.Select_stmtContext ctx) {
 trace(ctx, "SELECT completed successfully");
 exec.setSqlSuccess();
 try {
-  int into_cnt = getIntoCount(ctx);
-  if (into_cnt > 0) {
-trace(ctx, "SELECT INTO statement executed");
-if (query.next()) {
-  for (int i = 0; i < into_cnt; i++) {
-String into_name = getIntoVariable(ctx, i);
-Var var = exec.findVariable(into_name);
-if (var != null) {
-  if (var.type != Var.Type.ROW) {
-var.setValue(query, i);
-  } else {
-var.setValues(query);
-  }
-  if (trace) {
-exec.trace(ctx, var, query.metadata(), i);
-  }
-} 
-else {
-  trace(ctx, "Variable not found: " + into_name);
+  int intoCount = getIntoCount(ctx);
+  if (intoCount > 0) {
+if (isBulkCollect(ctx)) {
+  trace(ctx, "SELECT BULK COLLECT INTO statement executed");
+  long rowIndex = 1;
+  List tables = exec.intoTables(ctx, intoVariableNames(ctx, 
intoCount));
+  tables.forEach(Table::removeAll);
+  while (query.next()) {
+for (int i = 0; i < intoCount; i++) {
+  Table table = tables.get(i);
+  table.populate(query, rowIndex, i);
 }
+rowIndex++;
+  }
+} else {
+  trace(ctx, "SELECT INTO statement executed");
+  if (query.next()) {
+for (int i = 0; i < intoCount; i++) {
+  populateVariable(ctx, query, i);
+}
+exec.incRowCount();
+exec.setSqlSuccess();
+if (query.next()) {
+  exec.setSqlCode(-1422);

Review comment:
   I added a new class for that.

##
File path: hplsql/src/main/java/org/apache/hive/hplsql/Select.java
##
@@ -155,7 +162,25 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 query.close();
 return 0; 
-  }  
+  }
+
+  private void populateVariable(HplsqlParser.Select_stmtContext ctx, 
QueryResult query, int columnIndex) {
+String intoName = getIntoVariable(ctx, columnIndex);
+Var var = exec.findVariable(intoName);
+if (var != null) {
+  if (var.type == Var.Type.HPL_OBJECT && var.value instanceof Table) {
+Table table = (Table) var.value;
+table.populate(query, getIntoTableIndex(ctx, columnIndex), 
columnIndex);
+  } else if (var.type == Var.Type.ROW) {
+var.setRowValues(query);
+  } else {
+var.setValue(query, columnIndex);
+  }
+  exec.trace(ctx, var, query.metadata(), columnIndex);
+} else {
+  trace(ctx, "Variable not found: " + intoName);

Review comment:
   I replaced it with an exception.






Issue Time Tracking
---

Worklog Id: (was: 578300)
Time Spent: 0.5h  (was: 20m)

> Add Table type to HPL/SQL
> -
>
> Key: HIVE-24383
> URL: https://issues.apache.org/jira/browse/HIVE-24383
> Project: Hive
>  Issue Type: Sub-task
>  Components: hpl/sql
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24980) Add timeout for failed and "not initiated" compaction cleanup

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24980?focusedWorklogId=578280=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578280
 ]

ASF GitHub Bot logged work on HIVE-24980:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 10:56
Start Date: 07/Apr/21 10:56
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2156:
URL: https://github.com/apache/hive/pull/2156#discussion_r608546997



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
##
@@ -891,29 +903,42 @@ public void purgeCompactionHistory() throws MetaException 
{
 ResultSet rs = null;
 List deleteSet = new ArrayList<>();
 RetentionCounters rc = null;
+long timeoutThreshold = System.currentTimeMillis() -
+MetastoreConf.getTimeVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_TIMEOUT, TimeUnit.MILLISECONDS);
 try {
   try {
 dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
 stmt = dbConn.createStatement();
 /* cc_id is monotonically increasing so for any entity sorts in order 
of compaction history,
 thus this query groups by entity and withing group sorts most recent 
first */
-rs = stmt.executeQuery("SELECT \"CC_ID\", \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\", \"CC_STATE\" "
-+ "FROM \"COMPLETED_COMPACTIONS\" ORDER BY \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\", \"CC_ID\" DESC");
+rs = stmt.executeQuery("SELECT \"CC_ID\", \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\", "
++ "\"CC_STATE\" , \"CC_START\", \"CC_TYPE\" "
++ "FROM \"COMPLETED_COMPACTIONS\" ORDER BY \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\"," +
+"\"CC_ID\" DESC");
 String lastCompactedEntity = null;
 /* In each group, walk from most recent and count occurrences of each 
state type.  Once you
 * have counted enough (for each state) to satisfy retention policy, 
delete all other
-* instances of this status. */
+* instances of this status.
+* Also, "not initiated" and "failed" compactions are cleaned up if 
they are older than
+* metastore.compactor.history.retention.timeout and there is a newer 
"succeeded"
+* compaction on the table and either (1) the "succeeded" compaction is 
major or (2) it is minor
+* and the "not initiated" or "failed" compaction is also minor –– so a 
minor succeeded compaction
+* will not cause the deletion of a major "not initiated" or "failed" 
compaction.
+*/
 while(rs.next()) {
   CompactionInfo ci = new CompactionInfo(
   rs.getLong(1), rs.getString(2), rs.getString(3),
   rs.getString(4), rs.getString(5).charAt(0));
+  ci.start = rs.getLong(6);
+  ci.type = 
TxnHandler.dbCompactionType2ThriftType(rs.getString(7).charAt(0));
   if(!ci.getFullPartitionName().equals(lastCompactedEntity)) {
 lastCompactedEntity = ci.getFullPartitionName();
-rc = new RetentionCounters(MetastoreConf.getIntVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_DID_NOT_INITIATE),
-  getFailedCompactionRetention(),
-  MetastoreConf.getIntVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_SUCCEEDED));
+rc = new RetentionCounters(
+MetastoreConf.getIntVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_DID_NOT_INITIATE),

Review comment:
   Could you please extract conf value access (incl. 
getFailedCompactionRetention) out of the while loop.
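
A rough sketch of what this could look like, assuming a plain `Map` in place of the metastore configuration and hypothetical names (`Retention`, `getIntVar`, `countersPerEntity`, the config keys); it is not the actual `CompactionTxnHandler` code. The retention limits are loop-invariant, so they are read once and only the per-entity counters are re-created inside the loop:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HoistConfSketch {
    // Hypothetical stand-in for RetentionCounters: how many history entries to keep per state.
    static final class Retention {
        final int didNotInitiate;
        final int failed;
        final int succeeded;

        Retention(int didNotInitiate, int failed, int succeeded) {
            this.didNotInitiate = didNotInitiate;
            this.failed = failed;
            this.succeeded = succeeded;
        }
    }

    // Counts lookups so the effect of hoisting is observable.
    static int confReads = 0;

    // Hypothetical stand-in for a configuration lookup such as MetastoreConf.getIntVar.
    static int getIntVar(Map<String, Integer> conf, String key) {
        confReads++;
        return conf.get(key);
    }

    // Builds one Retention per distinct entity; entities must arrive grouped,
    // which the ORDER BY in the query provides.
    static List<Retention> countersPerEntity(Map<String, Integer> conf, List<String> sortedEntities) {
        // Loop-invariant values are read once, before the loop.
        int didNotInitiate = getIntVar(conf, "did.not.initiate");
        int failed = getIntVar(conf, "failed");
        int succeeded = getIntVar(conf, "succeeded");

        List<Retention> counters = new ArrayList<>();
        String lastEntity = null;
        for (String entity : sortedEntities) {
            if (!entity.equals(lastEntity)) {
                lastEntity = entity;
                // Fresh counters for each entity, but no further conf lookups.
                counters.add(new Retention(didNotInitiate, failed, succeeded));
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();
        conf.put("did.not.initiate", 2);
        conf.put("failed", 3);
        conf.put("succeeded", 3);
        List<Retention> counters =
            countersPerEntity(conf, Arrays.asList("db.t/p=1", "db.t/p=1", "db.t/p=2"));
        System.out.println(counters.size() + " entities, " + confReads + " conf reads");
    }
}
```

The number of configuration reads then stays constant regardless of how many entities the result set contains.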






Issue Time Tracking
---

Worklog Id: (was: 578280)
Time Spent: 40m  (was: 0.5h)

> Add timeout for failed and "not initiated" compaction cleanup
> -
>
> Key: HIVE-24980
> URL: https://issues.apache.org/jira/browse/HIVE-24980
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Clear failed and not initiated compactions from COMPLETED_COMPACTIONS that 
> are older than a week (configurable) if there already is a newer successful 
> compaction on the table/partition and either (1) the succeeded compaction is 
> major or (2) it is minor and the not initiated or failed compaction is also 
> minor –– so a minor succeeded compaction will not 

[jira] [Work logged] (HIVE-24980) Add timeout for failed and "not initiated" compaction cleanup

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24980?focusedWorklogId=578279=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578279
 ]

ASF GitHub Bot logged work on HIVE-24980:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 10:52
Start Date: 07/Apr/21 10:52
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2156:
URL: https://github.com/apache/hive/pull/2156#discussion_r608546997



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
##
@@ -891,29 +903,42 @@ public void purgeCompactionHistory() throws MetaException 
{
 ResultSet rs = null;
 List deleteSet = new ArrayList<>();
 RetentionCounters rc = null;
+long timeoutThreshold = System.currentTimeMillis() -
+MetastoreConf.getTimeVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_TIMEOUT, TimeUnit.MILLISECONDS);
 try {
   try {
 dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
 stmt = dbConn.createStatement();
 /* cc_id is monotonically increasing so for any entity sorts in order 
of compaction history,
 thus this query groups by entity and withing group sorts most recent 
first */
-rs = stmt.executeQuery("SELECT \"CC_ID\", \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\", \"CC_STATE\" "
-+ "FROM \"COMPLETED_COMPACTIONS\" ORDER BY \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\", \"CC_ID\" DESC");
+rs = stmt.executeQuery("SELECT \"CC_ID\", \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\", "
++ "\"CC_STATE\" , \"CC_START\", \"CC_TYPE\" "
++ "FROM \"COMPLETED_COMPACTIONS\" ORDER BY \"CC_DATABASE\", 
\"CC_TABLE\", \"CC_PARTITION\"," +
+"\"CC_ID\" DESC");
 String lastCompactedEntity = null;
 /* In each group, walk from most recent and count occurrences of each 
state type.  Once you
 * have counted enough (for each state) to satisfy retention policy, 
delete all other
-* instances of this status. */
+* instances of this status.
+* Also, "not initiated" and "failed" compactions are cleaned up if 
they are older than
+* metastore.compactor.history.retention.timeout and there is a newer 
"succeeded"
+* compaction on the table and either (1) the "succeeded" compaction is 
major or (2) it is minor
+* and the "not initiated" or "failed" compaction is also minor –– so a 
minor succeeded compaction
+* will not cause the deletion of a major "not initiated" or "failed" 
compaction.
+*/
 while(rs.next()) {
   CompactionInfo ci = new CompactionInfo(
   rs.getLong(1), rs.getString(2), rs.getString(3),
   rs.getString(4), rs.getString(5).charAt(0));
+  ci.start = rs.getLong(6);
+  ci.type = 
TxnHandler.dbCompactionType2ThriftType(rs.getString(7).charAt(0));
   if(!ci.getFullPartitionName().equals(lastCompactedEntity)) {
 lastCompactedEntity = ci.getFullPartitionName();
-rc = new RetentionCounters(MetastoreConf.getIntVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_DID_NOT_INITIATE),
-  getFailedCompactionRetention(),
-  MetastoreConf.getIntVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_SUCCEEDED));
+rc = new RetentionCounters(
+MetastoreConf.getIntVar(conf, 
ConfVars.COMPACTOR_HISTORY_RETENTION_DID_NOT_INITIATE),

Review comment:
   Could you please extract conf value access out of the while loop.






Issue Time Tracking
---

Worklog Id: (was: 578279)
Time Spent: 0.5h  (was: 20m)

> Add timeout for failed and "not initiated" compaction cleanup
> -
>
> Key: HIVE-24980
> URL: https://issues.apache.org/jira/browse/HIVE-24980
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Clear failed and not initiated compactions from COMPLETED_COMPACTIONS that 
> are older than a week (configurable) if there already is a newer successful 
> compaction on the table/partition and either (1) the succeeded compaction is 
> major or (2) it is minor and the not initiated or failed compaction is also 
> minor –– so a minor succeeded compaction will not cause the deletion of a 
> major not 

[jira] [Work logged] (HIVE-24980) Add timeout for failed and "not initiated" compaction cleanup

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24980?focusedWorklogId=578270=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578270
 ]

ASF GitHub Bot logged work on HIVE-24980:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 10:41
Start Date: 07/Apr/21 10:41
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2156:
URL: https://github.com/apache/hive/pull/2156#discussion_r608539863



##
File path: 
ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
##
@@ -340,6 +341,121 @@ private void addFailedCompaction(String dbName, String 
tableName, CompactionType
 txnHandler.markFailed(ci);
   }
 
+  private void addDidNotInitiateCompaction(String dbName, String tableName, 
String partitionName,
+  CompactionType type, String errorMessage) throws MetaException {
+CompactionInfo ci = new CompactionInfo(dbName, tableName, partitionName, 
type);
+ci.errorMessage = errorMessage;
+ci.id = 0;
+txnHandler.markFailed(ci);
+  }
+
+  private void addSucceededCompaction(String dbName, String tableName, String 
partitionName,
+  CompactionType type) throws Exception {
+CompactionRequest rqst = new CompactionRequest(dbName, tableName, type);
+rqst.setPartitionname(partitionName);
+txnHandler.compact(rqst);
+assertEquals(0, txnHandler.findReadyToClean(0, 0).size());
+CompactionInfo ci = txnHandler.findNextToCompact("fred", WORKER_VERSION);
+assertNotNull(ci);
+txnHandler.markCompacted(ci);
+txnHandler.markCleaned(ci);
+  }
+
+  @Test
+  public void testPurgeCompactionHistory() throws Exception {
+MetastoreConf.setLongVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_HISTORY_RETENTION_SUCCEEDED, 2);
+MetastoreConf.setLongVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_HISTORY_RETENTION_DID_NOT_INITIATE, 2);
+MetastoreConf.setLongVar(conf, 
MetastoreConf.ConfVars.COMPACTOR_HISTORY_RETENTION_FAILED, 2);
+txnHandler.setConf(conf);
+
+String dbName = "default";
+String tableName = "tpch";
+String part1 = "(p=1)";
+String part2 = "(p=2)";
+
+// 3 successful compactions on p=1
+addSucceededCompaction(dbName, tableName, part1, CompactionType.MAJOR);
+addSucceededCompaction(dbName, tableName, part1, CompactionType.MAJOR);
+addSucceededCompaction(dbName, tableName, part1, CompactionType.MAJOR);
+
+// 3 failed on p=1
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part1, 
"message");
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part1, 
"message");
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part1, 
"message");
+//4 failed on p=2
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part2, 
"message");
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part2, 
"message");
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part2, 
"message");
+addFailedCompaction(dbName, tableName, CompactionType.MAJOR, part2, 
"message");
+
+// 3 not initiated on p=1
+addDidNotInitiateCompaction(dbName, tableName, part1, 
CompactionType.MAJOR, "message");
+addDidNotInitiateCompaction(dbName, tableName, part1, 
CompactionType.MAJOR, "message");
+addDidNotInitiateCompaction(dbName, tableName, part1, 
CompactionType.MAJOR, "message");
+
+countCompactionsInHistory(13);
+
+txnHandler.purgeCompactionHistory();
+
+countCompactionsInHistory(8);

Review comment:
   Should we also check the status and table/partition of the remaining 
compactions here?
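
One possible shape for such a check, sketched with hypothetical types (`Entry`, a toy `purge`) that only model "keep at most N entries per partition and state"; this is not Hive's purge logic and ignores ordering. The point is that asserting on per-partition, per-state counts catches cases that a single total would miss:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RemainingCompactionsSketch {
    // Hypothetical, simplified history row: only the fields the check needs.
    static final class Entry {
        final String partition;
        final String state;

        Entry(String partition, String state) {
            this.partition = partition;
            this.state = state;
        }
    }

    // Same mix as the test in the diff: 13 entries in total.
    static List<Entry> sampleHistory() {
        List<Entry> history = new ArrayList<>();
        for (int i = 0; i < 3; i++) history.add(new Entry("p=1", "succeeded"));
        for (int i = 0; i < 3; i++) history.add(new Entry("p=1", "failed"));
        for (int i = 0; i < 4; i++) history.add(new Entry("p=2", "failed"));
        for (int i = 0; i < 3; i++) history.add(new Entry("p=1", "did not initiate"));
        return history;
    }

    // Toy purge (an assumption, not Hive's implementation):
    // keep at most `keep` entries per partition and state.
    static List<Entry> purge(List<Entry> history, int keep) {
        Map<String, Integer> seen = new TreeMap<>();
        List<Entry> kept = new ArrayList<>();
        for (Entry e : history) {
            int n = seen.merge(e.partition + "/" + e.state, 1, Integer::sum);
            if (n <= keep) {
                kept.add(e);
            }
        }
        return kept;
    }

    // Groups the surviving entries so a test can assert on more than the total.
    static Map<String, Integer> countByPartitionAndState(List<Entry> history) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Entry e : history) {
            counts.merge(e.partition + "/" + e.state, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Entry> remaining = purge(sampleHistory(), 2);
        System.out.println(remaining.size());
        System.out.println(countByPartitionAndState(remaining));
    }
}
```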






Issue Time Tracking
---

Worklog Id: (was: 578270)
Time Spent: 20m  (was: 10m)

> Add timeout for failed and "not initiated" compaction cleanup
> -
>
> Key: HIVE-24980
> URL: https://issues.apache.org/jira/browse/HIVE-24980
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Clear failed and not initiated compactions from COMPLETED_COMPACTIONS that 
> are older than a week (configurable) if there already is a newer successful 
> compaction on the table/partition and either (1) the succeeded compaction is 
> major or (2) it is minor and the not initiated or failed compaction is also 
> minor –– so a minor succeeded compaction will not cause the deletion of a 
> major not initiated or failed 
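
The rule, as spelled out in the code comment of the pull request quoted earlier in this thread, can be sketched as a single predicate. The class, method, and parameter names are hypothetical, and the strict "older than" comparison is an assumption, since the diff excerpt does not show the actual comparison:

```java
public class RetentionTimeoutSketch {
    enum Type { MINOR, MAJOR }

    /**
     * Decides whether a "failed" or "not initiated" history entry may be removed by the timeout rule.
     *
     * @param startMillis            start time of the failed / not initiated compaction
     * @param type                   type of the failed / not initiated compaction
     * @param timeoutThresholdMillis current time minus the configured retention timeout
     * @param newerSucceeded         type of a newer succeeded compaction on the same
     *                               table/partition, or null if there is none
     */
    static boolean eligibleForTimeoutCleanup(long startMillis, Type type,
                                             long timeoutThresholdMillis, Type newerSucceeded) {
        // Assumption: "older than the timeout" means started strictly before the threshold.
        if (startMillis >= timeoutThresholdMillis) {
            return false;
        }
        // Without a newer succeeded compaction, the entry is kept.
        if (newerSucceeded == null) {
            return false;
        }
        // A major success covers everything; a minor success only covers minor entries.
        return newerSucceeded == Type.MAJOR || type == Type.MINOR;
    }

    public static void main(String[] args) {
        long threshold = 1_000L;
        // Old major failure with only a newer minor success: kept.
        System.out.println(eligibleForTimeoutCleanup(10L, Type.MAJOR, threshold, Type.MINOR));
        // Old minor failure with a newer minor success: removable.
        System.out.println(eligibleForTimeoutCleanup(10L, Type.MINOR, threshold, Type.MINOR));
    }
}
```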

[jira] [Work logged] (HIVE-24985) Create new metrics about locks

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?focusedWorklogId=578263=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578263
 ]

ASF GitHub Bot logged work on HIVE-24985:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 10:21
Start Date: 07/Apr/21 10:21
Worklog Time Spent: 10m 
  Work Description: deniskuzZ opened a new pull request #2158:
URL: https://github.com/apache/hive/pull/2158


   
   
   ### What changes were proposed in this pull request?
   
   Introduced new metrics about locks
   
   ### Why are the changes needed?
   
   Compaction observability
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Unit test




Issue Time Tracking
---

Worklog Id: (was: 578263)
Remaining Estimate: 0h
Time Spent: 10m

> Create new metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age
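The two ideas listed above are simple aggregates over the set of current locks. As a rough sketch only, assuming each lock is represented by the instant at which it was acquired (the class `LockMetricsSketch` and both method names are hypothetical and do not come from the pull request):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Illustrative only: Hive's real metrics code is not shown in this thread.
final class LockMetricsSketch {

  /** Number of locks currently known. */
  static int numberOfLocks(List<Instant> lockAcquiredAt) {
    return lockAcquiredAt.size();
  }

  /** Age of the oldest lock, or Duration.ZERO when there are no locks. */
  static Duration oldestLockAge(List<Instant> lockAcquiredAt, Instant now) {
    Instant oldest = null;
    for (Instant acquired : lockAcquiredAt) {
      if (oldest == null || acquired.isBefore(oldest)) {
        oldest = acquired;
      }
    }
    return oldest == null ? Duration.ZERO : Duration.between(oldest, now);
  }
}
```

Returning `Duration.ZERO` for an empty lock list is a design choice of this sketch; a real gauge might instead report no value.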





[jira] [Updated] (HIVE-24985) Create new metrics about locks

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24985:
--
Labels: pull-request-available  (was: )

> Create new metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age





[jira] [Work started] (HIVE-24985) Create new metrics about locks

2021-04-07 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24985 started by Denys Kuzmenko.
-
> Create new metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age





[jira] [Assigned] (HIVE-24985) Create new metrics about locks

2021-04-07 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-24985:
-

Assignee: Denys Kuzmenko

> Create new metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age





[jira] [Updated] (HIVE-24985) Create new metrics about locks

2021-04-07 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-24985:
--
Summary: Create new metrics about locks  (was: Create metrics about locks)

> Create new metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age





[jira] [Work logged] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23571?focusedWorklogId=578252&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578252
 ]

ASF GitHub Bot logged work on HIVE-23571:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 09:56
Start Date: 07/Apr/21 09:56
Worklog Time Spent: 10m 
  Work Description: adesh-rao commented on a change in pull request #2128:
URL: https://github.com/apache/hive/pull/2128#discussion_r608511275



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -2476,6 +2499,25 @@ public boolean isTableConstraintValid(String catName, String dbName, String tblN
     return isValid;
   }
 
+  public boolean isTableCacheStale(String catName, String dbName, String tblName, String validWriteIdList) {
+    boolean isValid = false;
+
+    if (StringUtils.isEmpty(validWriteIdList))
+      return false;

Review comment:
   nit: add braces.
   
   Also, should we return true? If validWriteId is not present, we can safely 
assume that 
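For reference, the brace nit could look like the sketch below. It covers only the early-return guard from the diff. The class name `CacheStalenessGuard`, the `cachedWriteIdList` parameter, and the final comparison are assumptions made so the example compiles on its own; they are not the logic of the pull request. The guard still returns `false`, as in the diff, since the question of returning `true` is left open in the comment.

```java
// Illustrative only: a stand-in for the guard in SharedCache.isTableCacheStale.
final class CacheStalenessGuard {

  static boolean isTableCacheStale(String cachedWriteIdList, String validWriteIdList) {
    // Braced form of the single-statement `if` flagged in the review.
    if (validWriteIdList == null || validWriteIdList.isEmpty()) {
      // The diff returns false here; the reviewer asks whether true would be safer.
      return false;
    }
    // Assumption for this sketch: the cache is stale when the two lists differ.
    return !validWriteIdList.equals(cachedWriteIdList);
  }
}
```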






Issue Time Tracking
---

Worklog Id: (was: 578252)
Time Spent: 20m  (was: 10m)

> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 





[jira] [Assigned] (HIVE-24906) Suffix the table location with UUID/txnId

2021-04-07 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-24906:
-

Assignee: Denys Kuzmenko

> Suffix the table location with UUID/txnId
> -
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Suffixing the table location with a UUID/txnId during create table can help in 
> deleting the data asynchronously.
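To illustrate the idea: if every created table gets a distinct directory, a dropped table's directory can be removed in the background without colliding with a table re-created under the same name. The sketch below is an assumption-laden example, not the format Hive uses; the class `TableLocationSuffix`, the `_txn` separator, and both method names are hypothetical.

```java
import java.util.UUID;

// Illustrative only: the actual suffix format is not stated in this thread.
final class TableLocationSuffix {

  /** Appends a transaction id to the table directory name. */
  static String withTxnId(String baseLocation, long txnId) {
    return trimTrailingSlashes(baseLocation) + "_txn" + txnId;
  }

  /** Appends a UUID to the table directory name. */
  static String withUuid(String baseLocation, UUID uuid) {
    return trimTrailingSlashes(baseLocation) + "_" + uuid;
  }

  private static String trimTrailingSlashes(String path) {
    int end = path.length();
    while (end > 1 && path.charAt(end - 1) == '/') {
      end--;
    }
    return path.substring(0, end);
  }
}
```

With this scheme, two successive creations of the same table name under different transaction ids resolve to different directories, so deleting the older one never touches the newer one.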





[jira] [Updated] (HIVE-24985) Create metrics about locks

2021-04-07 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-24985:
--
Description: 
Basic metrics that can help investigate.
Ideas:
*  number of locks
* oldest lock's age

> Create metrics about locks
> --
>
> Key: HIVE-24985
> URL: https://issues.apache.org/jira/browse/HIVE-24985
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Priority: Major
>
> Basic metrics that can help investigate.
> Ideas:
> *  number of locks
> * oldest lock's age





[jira] [Work logged] (HIVE-24851) resources leak on exception in AvroGenericRecordReader constructor

2021-04-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24851?focusedWorklogId=578162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578162
 ]

ASF GitHub Bot logged work on HIVE-24851:
-

Author: ASF GitHub Bot
Created on: 07/Apr/21 06:24
Start Date: 07/Apr/21 06:24
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2129:
URL: https://github.com/apache/hive/pull/2129#issuecomment-814638355


   > @pvary can you please help diagnose the build results?
   
   I would go to the failure list and ignore the tests that fail in both 
   patches. I see 1 fixed (the most important to check) and 2 new failures (less 
   important, but I would check them anyway). If the failures are flaky or green 
   locally, then I would push my changes again and hope for a "green" run.




Issue Time Tracking
---

Worklog Id: (was: 578162)
Time Spent: 7h 20m  (was: 7h 10m)

> resources leak on exception in AvroGenericRecordReader constructor
> --
>
> Key: HIVE-24851
> URL: https://issues.apache.org/jira/browse/HIVE-24851
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Lukasz Osipiuk
>Assignee: Lukasz Osipiuk
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0, 4.0.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> AvroGenericRecordReader constructor creates an instance of FileReader but 
> lacks proper exception handling, and reader is not closed on the failure path.
> This results in leaking of underlying resources (e.g. S3 connections).
>  
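The general shape of a fix for this kind of leak is to close the freshly opened resource when the rest of the constructor throws. The sketch below shows that pattern in isolation and is not the patch from the pull request: `GuardedReader`, `Opener`, and `Initializer` are hypothetical names standing in for the record reader, the code that opens the file, and the remaining constructor work.

```java
import java.io.Closeable;
import java.io.IOException;

// Illustrative only: a generic version of "close on the constructor's failure path".
final class GuardedReader implements Closeable {

  /** Hypothetical hook that opens the underlying resource. */
  interface Opener {
    Closeable open() throws IOException;
  }

  /** Hypothetical hook for the rest of the constructor work, which may fail. */
  interface Initializer {
    void init(Closeable resource) throws IOException;
  }

  private final Closeable reader;

  GuardedReader(Opener opener, Initializer initializer) throws IOException {
    Closeable opened = opener.open();
    try {
      initializer.init(opened);
    } catch (IOException | RuntimeException e) {
      // Failure path: release the resource before propagating the original error.
      try {
        opened.close();
      } catch (IOException closeFailure) {
        e.addSuppressed(closeFailure);
      }
      throw e;
    }
    this.reader = opened;
  }

  @Override
  public void close() throws IOException {
    reader.close();
  }
}
```

Attaching a failure from `close()` as a suppressed exception keeps the original error as the one the caller sees.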


