[jira] [Updated] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-23493:
--
Attachment: (was: HIVE-23493.1.patch)

> Rewrite plan to join back tables with many projected columns joined multiple 
> times
> --
>
> Key: HIVE-23493
> URL: https://issues.apache.org/jira/browse/HIVE-23493
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> This targets queries where one or more tables join with a fact table in a 
> CTE, many columns are projected out of those tables, and the result is 
> grouped in the CTE. The main query then joins multiple instances of the CTE 
> and may project only a subset of those columns.
> The optimization rewrites the CTE to include only key columns (PK or 
> non-null unique key) and joins the tables back to the result set of the main 
> query to fetch the rest of the wide columns. This reduces the data size of 
> the joined-back tables that is broadcast/shuffled throughout DAG processing.
> Example query (TPC-DS query4):
> {code}
> with year_total as (
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
> ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total
>,'s' sale_type
>  from customer
>  ,store_sales
>  ,date_dim
>  where c_customer_sk = ss_customer_sk
>and ss_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
> ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2) ) year_total
>,'c' sale_type
>  from customer
>  ,catalog_sales
>  ,date_dim
>  where c_customer_sk = cs_bill_customer_sk
>and cs_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
> union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
> ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ) year_total
>,'w' sale_type
>  from customer
>  ,web_sales
>  ,date_dim
>  where c_customer_sk = ws_bill_customer_sk
>and ws_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  )
>   select  
>   t_s_secyear.customer_id
>  ,t_s_secyear.customer_first_name
>  ,t_s_secyear.customer_last_name
>  ,t_s_secyear.customer_birth_country
>  from year_total t_s_firstyear
>  ,year_total t_s_secyear
>  ,year_total t_c_firstyear
>  ,year_total t_c_secyear
>  ,year_total t_w_firstyear
>  ,year_total t_w_secyear
>  where t_s_secyear.customer_id = t_s_firstyear.customer_id
>and t_s_firstyear.customer_id = t_c_secyear.customer_id
>and t_s_firstyear.customer_id = t_c_firstyear.customer_id
>and t_s_firstyear.customer_id = t_w_firstyear.customer_id
>and t_s_firstyear.customer_id = t_w_secyear.customer_id
>and t_s_firstyear.sale_type = 's'
>and t_c_firstyear.sale_type = 'c'
>and t_w_firstyear.sale_type = 'w'
>and t_s_secyear.sale_type = 's'
>and t_c_secyear.sale_type = 'c'
>and t_w_secyear.sale_type = 'w'
>and t_s_firstyear.dyear =  1999
>and t_s_secyear.dyear = 1999+1
>and t_c_firstyear.dyear =  1999
>
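The join-back rewrite described in the issue can be sketched in SQL. This is an illustrative hand-written sketch, not the optimizer's actual output; it assumes c_customer_sk is the PK of customer and shows only the store_sales branch:

{code}
-- Rewritten CTE: keep only the key column and the aggregate; drop the wide
-- customer columns from the projection and the group by.
with year_total as (
 select c_customer_sk customer_sk
   ,d_year dyear
   ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total
   ,'s' sale_type
 from customer, store_sales, date_dim
 where c_customer_sk = ss_customer_sk
   and ss_sold_date_sk = d_date_sk
 group by c_customer_sk, d_year
 -- union all branches for catalog_sales and web_sales are analogous
)
select c.c_customer_id
  ,c.c_first_name
  ,c.c_last_name
  ,c.c_birth_country
from year_total t_s_secyear
  -- other year_total instances and join/filter predicates as in the original
join customer c
  on c.c_customer_sk = t_s_secyear.customer_sk  -- join back for wide columns
{code}

Only the narrow (customer_sk, dyear, year_total, sale_type) rows flow through the multi-way self-join of the CTE; the wide customer columns are fetched once by the join-back at the top of the plan.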

[jira] [Updated] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23467:
--
Labels: pull-request-available  (was: )

> Add a skip.trash config for HMS to skip trash when deleting external table 
> data
> ---
>
> Key: HIVE-23467
> URL: https://issues.apache.org/jira/browse/HIVE-23467
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Sam An
>Assignee: Yu-Wen Lai
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have an auto.purge flag, which means "skip trash". It can be confusing, 
> as we also have 'external.table.purge'='true' to indicate that table data 
> should be deleted when this table property is set.
> We should make the meaning clearer by introducing a skip-trash alias/option. 
> Additionally, we shall add an alias for external.table.purge named 
> external.table.autodelete, and document it more prominently, so as to 
> maintain backward compatibility and make the meaning of auto deletion of 
> data more obvious.
> The net effect of these two changes: if the user sets 
> 'external.table.autodelete'='true', the table data will be removed when the 
> table is dropped; and if 'skip.trash'='true' is set, HMS will not move the 
> table data to the trash folder when removing the files. This results in 
> faster removal, especially when the underlying FS is S3.
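A sketch of the proposed usage (hypothetical: 'skip.trash' and 'external.table.autodelete' are the names proposed in this ticket, not existing Hive properties):

{code}
CREATE EXTERNAL TABLE ext_logs (id INT, msg STRING)
LOCATION 's3a://mybucket/logs'
TBLPROPERTIES (
  'external.table.autodelete'='true',  -- proposed alias for external.table.purge
  'skip.trash'='true'                  -- proposed: delete files directly, bypassing trash
);

-- With both properties set, DROP TABLE would delete the data files
-- immediately instead of moving them to the trash folder first.
DROP TABLE ext_logs;
{code}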



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23467?focusedWorklogId=447076&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447076
 ]

ASF GitHub Bot logged work on HIVE-23467:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 05:41
Start Date: 17/Jun/20 05:41
Worklog Time Spent: 10m 
  Work Description: hsnusonic opened a new pull request #1133:
URL: https://github.com/apache/hive/pull/1133


   …ng external table data
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 447076)
Remaining Estimate: 0h
Time Spent: 10m



[jira] [Resolved] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-23493.
---
Resolution: Fixed


[jira] [Reopened] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reopened HIVE-23493:
---


[jira] [Work logged] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?focusedWorklogId=447075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447075
 ]

ASF GitHub Bot logged work on HIVE-23493:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 05:25
Start Date: 17/Jun/20 05:25
Worklog Time Spent: 10m 
  Work Description: kasakrisz closed pull request #1132:
URL: https://github.com/apache/hive/pull/1132


   





Issue Time Tracking
---

Worklog Id: (was: 447075)
Time Spent: 2h  (was: 1h 50m)


[jira] [Work logged] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?focusedWorklogId=447071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447071
 ]

ASF GitHub Bot logged work on HIVE-23493:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 05:22
Start Date: 17/Jun/20 05:22
Worklog Time Spent: 10m 
  Work Description: kasakrisz closed pull request #1096:
URL: https://github.com/apache/hive/pull/1096


   





Issue Time Tracking
---

Worklog Id: (was: 447071)
Time Spent: 1h 50m  (was: 1h 40m)


[jira] [Commented] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138111#comment-17138111
 ] 

Krisztian Kasa commented on HIVE-23493:
---

Pushed to master. Thank you [~jcamachorodriguez] for review.


[jira] [Updated] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-23493:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Rewrite plan to join back tables with many projected columns joined multiple 
> times
> --
>
> Key: HIVE-23493
> URL: https://issues.apache.org/jira/browse/HIVE-23493
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23493.1.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Queries with a pattern where one or more tables joins with a fact table in a 
> CTE. Many columns are projected out those tables and then grouped in the CTE. 
>  The main query joins multiple instances of the CTE and may project a subset 
> of these.
> The optimization is to rewrite the CTE to include only key (PK, non null 
> Unique Key) columns and join the tables back to the resultset of the main 
> query to fetch the rest of the wide columns. This reduces the datasize of the 
> joined back tables that is broadcast/shuffled throughout the DAG processing.
> Example query, tpc-ds query4
> {code}
> with year_total as (
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2)
>  year_total
>,'s' sale_type
>  from customer
>  ,store_sales
>  ,date_dim
>  where c_customer_sk = ss_customer_sk
>and ss_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2)
>  ) year_total
>,'c' sale_type
>  from customer
>  ,catalog_sales
>  ,date_dim
>  where c_customer_sk = cs_bill_customer_sk
>and cs_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
> union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2)
>  ) year_total
>,'w' sale_type
>  from customer
>  ,web_sales
>  ,date_dim
>  where c_customer_sk = ws_bill_customer_sk
>and ws_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  )
>   select  
>   t_s_secyear.customer_id
>  ,t_s_secyear.customer_first_name
>  ,t_s_secyear.customer_last_name
>  ,t_s_secyear.customer_birth_country
>  from year_total t_s_firstyear
>  ,year_total t_s_secyear
>  ,year_total t_c_firstyear
>  ,year_total t_c_secyear
>  ,year_total t_w_firstyear
>  ,year_total t_w_secyear
>  where t_s_secyear.customer_id = t_s_firstyear.customer_id
>and t_s_firstyear.customer_id = t_c_secyear.customer_id
>and t_s_firstyear.customer_id = t_c_firstyear.customer_id
>and t_s_firstyear.customer_id = t_w_firstyear.customer_id
>and t_s_firstyear.customer_id = t_w_secyear.customer_id
>and t_s_firstyear.sale_type = 's'
>and t_c_firstyear.sale_type = 'c'
>and t_w_firstyear.sale_type = 'w'
>and t_s_secyear.sale_type = 's'
>and t_c_secyear.sale_type = 'c'
>and t_w_secyear.sale_type = 'w'
>and t_s_firstyear.dyear =  1999
>

[jira] [Commented] (HIVE-23707) Unable to create materialized views with transactions enabled with MySQL metastore

2020-06-16 Thread Dustin Koupal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138110#comment-17138110
 ] 

Dustin Koupal commented on HIVE-23707:
--

I tried to track this down, but wasn't very successful.  The best I can come up 
with is, should this be VARCHAR?

 

[https://github.com/apache/hive/blob/871ee8009380e1bab160b58dc378a7f668c64584/standalone-metastore/metastore-server/src/main/resources/package.jdo#L256]

 

I'm wondering if it's a similar issue to this:

 

[https://github.com/apache/hive/commit/5861b6af52839794c18f5aa686c24aabdb737b93]

> Unable to create materialized views with transactions enabled with MySQL 
> metastore
> --
>
> Key: HIVE-23707
> URL: https://issues.apache.org/jira/browse/HIVE-23707
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
>Reporter: Dustin Koupal
>Priority: Blocker
>
> When attempting to create a materialized view with transactions enabled, we 
> get the following exception:
>  
> {code:java}
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Failed to 
> generate new Mapping of type 
> org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type 
> CLOB declared for field 
> "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java 
> type java.lang.String cant be mapped for this datastore.ERROR : FAILED: 
> Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
> MetaException(message:Failed to generate new Mapping of type 
> org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type 
> CLOB declared for field 
> "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java 
> type java.lang.String cant be mapped for this datastore.JDBC type CLOB 
> declared for field 
> "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java 
> type java.lang.String cant be mapped for this 
> datastore.org.datanucleus.exceptions.NucleusException: JDBC type CLOB 
> declared for field 
> "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java 
> type java.lang.String cant be mapped for this datastore. at 
> org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getDatastoreMappingClass(RDBMSMappingManager.java:1386)
>  at 
> org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.createDatastoreMapping(RDBMSMappingManager.java:1616)
>  at 
> org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.prepareDatastoreMapping(SingleFieldMapping.java:59)
>  at 
> org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.initialize(SingleFieldMapping.java:48)
>  at 
> org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getMapping(RDBMSMappingManager.java:482)
>  at 
> org.datanucleus.store.rdbms.table.ClassTable.manageMembers(ClassTable.java:536)
>  at 
> org.datanucleus.store.rdbms.table.ClassTable.manageClass(ClassTable.java:442) 
> at 
> org.datanucleus.store.rdbms.table.ClassTable.initializeForClass(ClassTable.java:1270)
>  at 
> org.datanucleus.store.rdbms.table.ClassTable.initialize(ClassTable.java:276) 
> at 
> org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.initializeClassTables(RDBMSStoreManager.java:3279)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2889)
>  at 
> org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1627)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:672)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.getPropertiesForGenerator(RDBMSStoreManager.java:2088)
>  at 
> org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1271)
>  at 
> org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3760)
>  at 
> org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:2267)
>  at 
> org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:484)
>  at 
> org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:120)
>  at 
> org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:218)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2079)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778)
>  at 
> 

[jira] [Updated] (HIVE-23707) Unable to create materialized views with transactions enabled with MySQL metastore

2020-06-16 Thread Dustin Koupal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Koupal updated HIVE-23707:
-
Description: 
When attempting to create a materialized view with transactions enabled, we get 
the following exception:

 
{code:java}
ERROR : FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Failed to 
generate new Mapping of type 
org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type 
CLOB declared for field 
"org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java type 
java.lang.String cant be mapped for this datastore.ERROR : FAILED: Execution 
Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:Failed to generate new Mapping of type 
org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type 
CLOB declared for field 
"org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of java type 
java.lang.String cant be mapped for this datastore.JDBC type CLOB declared for 
field "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of 
java type java.lang.String cant be mapped for this 
datastore.org.datanucleus.exceptions.NucleusException: JDBC type CLOB declared 
for field "org.apache.hadoop.hive.metastore.model.MCreationMetadata.txnList" of 
java type java.lang.String cant be mapped for this datastore. at 
org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getDatastoreMappingClass(RDBMSMappingManager.java:1386)
 at 
org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.createDatastoreMapping(RDBMSMappingManager.java:1616)
 at 
org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.prepareDatastoreMapping(SingleFieldMapping.java:59)
 at 
org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.initialize(SingleFieldMapping.java:48)
 at 
org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getMapping(RDBMSMappingManager.java:482)
 at 
org.datanucleus.store.rdbms.table.ClassTable.manageMembers(ClassTable.java:536) 
at 
org.datanucleus.store.rdbms.table.ClassTable.manageClass(ClassTable.java:442) 
at 
org.datanucleus.store.rdbms.table.ClassTable.initializeForClass(ClassTable.java:1270)
 at 
org.datanucleus.store.rdbms.table.ClassTable.initialize(ClassTable.java:276) at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.initializeClassTables(RDBMSStoreManager.java:3279)
 at 
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2889)
 at 
org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:119)
 at 
org.datanucleus.store.rdbms.RDBMSStoreManager.manageClasses(RDBMSStoreManager.java:1627)
 at 
org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:672)
 at 
org.datanucleus.store.rdbms.RDBMSStoreManager.getPropertiesForGenerator(RDBMSStoreManager.java:2088)
 at 
org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1271)
 at 
org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3760)
 at 
org.datanucleus.state.StateManagerImpl.setIdentity(StateManagerImpl.java:2267) 
at 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:484)
 at 
org.datanucleus.state.StateManagerImpl.initialiseForPersistentNew(StateManagerImpl.java:120)
 at 
org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:218)
 at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2079)
 at 
org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923)
 at 
org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778)
 at 
org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
 at 
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:724)
 at 
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
 at 
org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:1308) 
at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) at 
com.sun.proxy.$Proxy25.createTable(Unknown Source) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1882)
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1786)
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:2035)
 at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source) at 

[jira] [Work logged] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?focusedWorklogId=447066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447066
 ]

ASF GitHub Bot logged work on HIVE-23493:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 05:08
Start Date: 17/Jun/20 05:08
Worklog Time Spent: 10m 
  Work Description: kasakrisz opened a new pull request #1132:
URL: https://github.com/apache/hive/pull/1132


   fix commit message



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 447066)
Time Spent: 1h 40m  (was: 1.5h)

> Rewrite plan to join back tables with many projected columns joined multiple 
> times
> --
>
> Key: HIVE-23493
> URL: https://issues.apache.org/jira/browse/HIVE-23493
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23493.1.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Queries match a pattern where one or more tables join with a fact table in a 
> CTE. Many columns are projected from those tables and then grouped in the 
> CTE. The main query joins multiple instances of the CTE and may project only 
> a subset of these columns.
> The optimization rewrites the CTE to include only key (PK, non-null unique 
> key) columns and joins the tables back to the result set of the main query 
> to fetch the rest of the wide columns. This reduces the data size of the 
> joined-back tables that is broadcast/shuffled throughout the DAG processing.
> Example query, TPC-DS query 4:
> {code}
> with year_total as (
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2)
>  year_total
>,'s' sale_type
>  from customer
>  ,store_sales
>  ,date_dim
>  where c_customer_sk = ss_customer_sk
>and ss_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2)
>  ) year_total
>,'c' sale_type
>  from customer
>  ,catalog_sales
>  ,date_dim
>  where c_customer_sk = cs_bill_customer_sk
>and cs_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
> union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2)
>  ) year_total
>,'w' sale_type
>  from customer
>  ,web_sales
>  ,date_dim
>  where c_customer_sk = ws_bill_customer_sk
>and ws_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  )
>   select  
>   t_s_secyear.customer_id
>  ,t_s_secyear.customer_first_name
>  ,t_s_secyear.customer_last_name
>  ,t_s_secyear.customer_birth_country
>  from year_total t_s_firstyear
>  ,year_total t_s_secyear
>  

[jira] [Work logged] (HIVE-23493) Rewrite plan to join back tables with many projected columns joined multiple times

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23493?focusedWorklogId=447061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447061
 ]

ASF GitHub Bot logged work on HIVE-23493:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 05:00
Start Date: 17/Jun/20 05:00
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #1124:
URL: https://github.com/apache/hive/pull/1124


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 447061)
Time Spent: 1.5h  (was: 1h 20m)

> Rewrite plan to join back tables with many projected columns joined multiple 
> times
> --
>
> Key: HIVE-23493
> URL: https://issues.apache.org/jira/browse/HIVE-23493
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23493.1.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Queries match a pattern where one or more tables join with a fact table in a 
> CTE. Many columns are projected from those tables and then grouped in the 
> CTE. The main query joins multiple instances of the CTE and may project only 
> a subset of these columns.
> The optimization rewrites the CTE to include only key (PK, non-null unique 
> key) columns and joins the tables back to the result set of the main query 
> to fetch the rest of the wide columns. This reduces the data size of the 
> joined-back tables that is broadcast/shuffled throughout the DAG processing.
> Example query, TPC-DS query 4:
> {code}
> with year_total as (
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2)
>  year_total
>,'s' sale_type
>  from customer
>  ,store_sales
>  ,date_dim
>  where c_customer_sk = ss_customer_sk
>and ss_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2)
>  ) year_total
>,'c' sale_type
>  from customer
>  ,catalog_sales
>  ,date_dim
>  where c_customer_sk = cs_bill_customer_sk
>and cs_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
> union all
>  select c_customer_id customer_id
>,c_first_name customer_first_name
>,c_last_name customer_last_name
>,c_preferred_cust_flag customer_preferred_cust_flag
>,c_birth_country customer_birth_country
>,c_login customer_login
>,c_email_address customer_email_address
>,d_year dyear
>
> ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2)
>  ) year_total
>,'w' sale_type
>  from customer
>  ,web_sales
>  ,date_dim
>  where c_customer_sk = ws_bill_customer_sk
>and ws_sold_date_sk = d_date_sk
>  group by c_customer_id
>  ,c_first_name
>  ,c_last_name
>  ,c_preferred_cust_flag
>  ,c_birth_country
>  ,c_login
>  ,c_email_address
>  ,d_year
>  )
>   select  
>   t_s_secyear.customer_id
>  ,t_s_secyear.customer_first_name
>  ,t_s_secyear.customer_last_name
>  ,t_s_secyear.customer_birth_country
>  from year_total t_s_firstyear
>  ,year_total t_s_secyear
>  ,year_total t_c_firstyear
>  

[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447044&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447044
 ]

ASF GitHub Bot logged work on HIVE-23706:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 03:55
Start Date: 17/Jun/20 03:55
Worklog Time Spent: 10m 
  Work Description: kasakrisz opened a new pull request #1131:
URL: https://github.com/apache/hive/pull/1131


   Testing done:
   ```
   mvn test -Dtest.output.overwrite -DskipSparkTests 
-Dtest=TestMiniLlapLocalCliDriver -Dqfile=order_null.q -pl itests/qtest -Pitests
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 447044)
Remaining Estimate: 0h
Time Spent: 10m

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2)
> SELECT a FROM t ORDER BY a DESC NULLS FIRST
> {code}
> should return 
> {code}
> 3
> 2
> 2
> 2
> 1
> null
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23706) Fix nulls first sorting behavior

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23706:
--
Labels: pull-request-available  (was: )

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2)
> SELECT a FROM t ORDER BY a DESC NULLS FIRST
> {code}
> should return 
> {code}
> 3
> 2
> 2
> 2
> 1
> null
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23706) Fix nulls first sorting behavior

2020-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-23706:
--
Description: 
{code}
INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2)

SELECT a FROM t ORDER BY a DESC NULLS FIRST
{code}
should return 
{code}
3
2
2
2
1
null
{code}

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2)
> SELECT a FROM t ORDER BY a DESC NULLS FIRST
> {code}
> should return 
> {code}
> 3
> 2
> 2
> 2
> 1
> null
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23706) Fix nulls first sorting behavior

2020-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-23706:
-


> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23162) Remove swapping logic to merge joins in AST converter

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23162?focusedWorklogId=446981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446981
 ]

ASF GitHub Bot logged work on HIVE-23162:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 00:23
Start Date: 17/Jun/20 00:23
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #978:
URL: https://github.com/apache/hive/pull/978


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446981)
Time Spent: 0.5h  (was: 20m)

> Remove swapping logic to merge joins in AST converter
> -
>
> Key: HIVE-23162
> URL: https://issues.apache.org/jira/browse/HIVE-23162
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23162.01.patch, HIVE-23162.02.patch, 
> HIVE-23162.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In ASTConverter, there is some logic to invert join inputs so the logic to 
> merge joins in SemanticAnalyzer kicks in.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java#L407
> There is a bug because inputs are swapped but the schema is not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=446913&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446913
 ]

ASF GitHub Bot logged work on HIVE-23704:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 21:13
Start Date: 16/Jun/20 21:13
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #1127:
URL: https://github.com/apache/hive/pull/1127


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446913)
Time Spent: 0.5h  (was: 20m)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>   String authType) throws HttpAuthenticationException {
> String authHeaderBase64 = getAuthHeader(request, authType);
> String authHeaderString = StringUtils.newStringUtf8(
> Base64.decodeBase64(authHeaderBase64.getBytes()));
> String[] creds = authHeaderString.split(":");
> return creds;
>   }
> {code}
> So here, it takes authHeaderBase64 (which is a base-64 string), converts it 
> into bytes, and then tries to decode those bytes.  That is incorrect: it 
> should decode the base-64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because 
> the string being decoded is not actually base-64 (see attached image): it has 
> a stray space and a colon.  Again, the existing code doesn't care, because it 
> is not parsing base-64 text; it is parsing the bytes generated by converting 
> base-64 text to bytes.
> I'm not sure what effect this has or what security issues it may present, but 
> it's definitely not correct.
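The fix called for above can be sketched in isolation. The snippet below is illustrative only, not Hive's ThriftHttpServlet: the class and method names are made up, and it uses `java.util.Base64` rather than the commons-codec call in the quoted code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AuthHeaderDecode {

    // Hypothetical corrected helper: decode the base-64 String directly,
    // instead of turning the base-64 text into bytes and decoding those bytes.
    static String[] decodeCreds(String authHeaderBase64) {
        byte[] decoded = Base64.getDecoder().decode(authHeaderBase64);
        return new String(decoded, StandardCharsets.UTF_8).split(":");
    }

    public static void main(String[] args) {
        String[] creds = decodeCreds("dXNlcjpwYXNz"); // base-64 of "user:pass"
        System.out.println(creds[0] + " / " + creds[1]);
    }
}
```

Note that `java.util.Base64`'s basic decoder rejects malformed input with an IllegalArgumentException, so the stray space and colon mentioned above would surface as an explicit error instead of being silently tolerated the way commons-codec's lenient decoder tolerates them.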



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=446911&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446911
 ]

ASF GitHub Bot logged work on HIVE-23704:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 21:12
Start Date: 16/Jun/20 21:12
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #1127:
URL: https://github.com/apache/hive/pull/1127


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446911)
Time Spent: 20m  (was: 10m)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>   String authType) throws HttpAuthenticationException {
> String authHeaderBase64 = getAuthHeader(request, authType);
> String authHeaderString = StringUtils.newStringUtf8(
> Base64.decodeBase64(authHeaderBase64.getBytes()));
> String[] creds = authHeaderString.split(":");
> return creds;
>   }
> {code}
> So here, it takes authHeaderBase64 (which is a base-64 string), converts it 
> into bytes, and then tries to decode those bytes.  That is incorrect: it 
> should decode the base-64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because 
> the string being decoded is not actually base-64 (see attached image): it has 
> a stray space and a colon.  Again, the existing code doesn't care, because it 
> is not parsing base-64 text; it is parsing the bytes generated by converting 
> base-64 text to bytes.
> I'm not sure what effect this has or what security issues it may present, but 
> it's definitely not correct.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20419?focusedWorklogId=446842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446842
 ]

ASF GitHub Bot logged work on HIVE-20419:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 19:22
Start Date: 16/Jun/20 19:22
Worklog Time Spent: 10m 
  Work Description: frankzyt commented on pull request #518:
URL: https://github.com/apache/hive/pull/518#issuecomment-644964996


   too much bot to mail



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446842)
Time Spent: 0.5h  (was: 20m)

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a 
> hashmap key
> 
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal Vijayaraghavan
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20419.1.patch, HIVE-20419.2.patch, 
> HIVE-20419.4.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is going into the loop because the VectorPartitionDesc is modified after 
> it is used in the HashMap key - resulting in a hashcode & equals modification 
> after it has been placed in the hashmap.
> {code}
> HiveServer2-Background-Pool: Thread-6049 State: RUNNABLE CPU usage on sample: 
> 621ms
> java.util.HashMap$TreeNode.find(int, Object, Class) HashMap.java:1869  <7 
> recursive calls>
> java.util.HashMap$TreeNode.putTreeVal(HashMap, HashMap$Node[], int, Object, 
> Object) HashMap.java:1989
> java.util.HashMap.putVal(int, Object, Object, boolean, boolean) 
> HashMap.java:637
> java.util.HashMap.put(Object, Object) HashMap.java:611
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.addVectorPartitionDesc(PartitionDesc,
>  VectorPartitionDesc, Map) Vectorizer.java:1272
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.verifyAndSetVectorPartDesc(PartitionDesc,
>  boolean, List, Set, Map, Set, ArrayList, Set) Vectorizer.java:1323
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateInputFormatAndSchemaEvolution(MapWork,
>  String, TableScanOperator, Vectorizer$VectorTaskColumnInfo) 
> Vectorizer.java:1654
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(MapWork,
>  Vectorizer$VectorTaskColumnInfo, boolean) Vectorizer.java:1865
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(MapWork,
>  boolean) Vectorizer.java:1109
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Node,
>  Stack, Object[]) Vectorizer.java:961
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(Node, Stack, 
> TaskGraphWalker$TaskGraphWalkerContext) TaskGraphWalker.java:111
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(Node) 
> TaskGraphWalker.java:180
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(Collection, 
> HashMap) TaskGraphWalker.java:125
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(PhysicalContext)
>  Vectorizer.java:2442
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(List, 
> ParseContext, Context) TezCompiler.java:717
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(ParseContext, List, 
> HashSet, HashSet) TaskCompiler.java:258
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(ASTNode, 
> SemanticAnalyzer$PlannerContextFactory) SemanticAnalyzer.java:12443
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(ASTNode) 
> CalcitePlanner.java:358
> {code}
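The failure mode above can be sketched outside Hive; MutableKey and MutatedKeyDemo below are illustrative names, not Hive classes, but the mechanics are the same as mutating VectorPartitionDesc after it has been inserted as a HashMap key:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// A key whose hashCode() depends on mutable state. Mutating it after
// insertion means lookups hash to a different bucket than the one the
// entry was stored in, so the map can no longer find the entry.
class MutableKey {
    String name;
    MutableKey(String name) { this.name = name; }
    @Override public int hashCode() { return Objects.hash(name); }
    @Override public boolean equals(Object o) {
        return o instanceof MutableKey
                && Objects.equals(name, ((MutableKey) o).name);
    }
}

public class MutatedKeyDemo {
    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey("a");
        map.put(key, "value");
        key.name = "b";                   // mutated after insertion
        System.out.println(map.get(key)); // prints "null"
    }
}
```

The fix direction named in the issue title follows from this: make the key effectively immutable once it has been used in the map, so hashCode() and equals() stay stable.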



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data

2020-06-16 Thread Sam An (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An reassigned HIVE-23467:
-

Assignee: Yu-Wen Lai  (was: Sam An)

> Add a skip.trash config for HMS to skip trash when deleting external table 
> data
> ---
>
> Key: HIVE-23467
> URL: https://issues.apache.org/jira/browse/HIVE-23467
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Sam An
>Assignee: Yu-Wen Lai
>Priority: Trivial
>
> We have an auto.purge flag, which means skip trash. It can be confusing, as 
> we also have 'external.table.purge'='true' to indicate that table data should 
> be deleted when this tblproperty is set. 
> We should make the meaning clearer by introducing a skip-trash alias/option. 
> Additionally, we shall add an alias for external.table.purge, name it 
> external.table.autodelete, and document it more prominently, so as to 
> maintain backward compatibility and make the meaning of auto deletion of 
> data more obvious. 
> The net effect of these two changes: if the user sets 
> 'external.table.autodelete'='true', the table data will be removed when the 
> table is dropped, and if 'skip.trash'='true' 
> is set, HMS will not move the table data to the trash folder when removing 
> the files. This will result in faster removal, especially when the underlying 
> FS is S3. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21709) Count with expression does not work in Parquet

2020-06-16 Thread Mainak Ghosh (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17137837#comment-17137837
 ] 

Mainak Ghosh commented on HIVE-21709:
-

Oh wow, this just fell through the cracks. Yes, I would love to have this 
pushed, but unfortunately I have not worked on Hive for some time. I have 
created the PR against master: [https://github.com/apache/hive/pull/1130].

> Count with expression does not work in Parquet
> --
>
> Key: HIVE-21709
> URL: https://issues.apache.org/jira/browse/HIVE-21709
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.2
>Reporter: Mainak Ghosh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For a Parquet file with a nested schema, count with an expression as the 
> column name does not work when filtering on another column in the same 
> struct. Here are the steps to reproduce:
> {code:java}
> CREATE TABLE `test_table`( `rtb_win` struct<`impression_id`:string, 
> `pub_id`:string>) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS 
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> INSERT INTO TABLE test_table SELECT named_struct('impression_id', 'cat', 
> 'pub_id', '2');
> select count(rtb_win.impression_id) from test_table where rtb_win.pub_id ='2';
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. spark, 
> tez) or using Hive 1.X releases.
> +--+ 
> | _c0  |
> +--+ 
> | 0    | 
> +--+
> select count(*) from test_table where rtb_win.pub_id ='2';
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. spark, 
> tez) or using Hive 1.X releases. 
> +--+ 
> | _c0  | 
> +--+ 
> | 1    | 
> +--+{code}
> As you can see, the first query returns the wrong result while the second 
> one returns the correct result.
> The issue is a column order mismatch between the actual Parquet file 
> (impression_id first and pub_id second) and the Hive prunedCols data 
> structure (the reverse). As a result, the filter compares against the wrong 
> value and the count returns 0. I have been able to identify the cause of 
> this mismatch.
> I would love to get the code reviewed and merged. Some of the code changes 
> modify earlier commits from Ferdinand Xu and Chao Sun.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23705) Add tests for 'external.table.purge' and 'auto.purge'

2020-06-16 Thread Yu-Wen Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu-Wen Lai reassigned HIVE-23705:
-


> Add tests for 'external.table.purge' and 'auto.purge'
> -
>
> Key: HIVE-23705
> URL: https://issues.apache.org/jira/browse/HIVE-23705
> Project: Hive
>  Issue Type: Test
>  Components: Standalone Metastore
>Reporter: Yu-Wen Lai
>Assignee: Yu-Wen Lai
>Priority: Major
>
> The current unit tests do not cover an external table with 
> 'external.table.purge' or 'auto.purge' set. Such tests should be added to 
> TestTableCreateDropAlterTruncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-12698:
--
Labels: pull-request-available  (was: )

> Remove exposure to internal privilege and principal classes in HiveAuthorizer
> -
>
> Key: HIVE-12698
> URL: https://issues.apache.org/jira/browse/HIVE-12698
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12698.1.patch, HIVE-12698.2.patch, 
> HIVE-12698.3.patch, HIVE-12698.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in HIVE-11179 expose several internal classes to 
> HiveAuthorization implementations. These include PrivilegeObjectDesc, 
> PrivilegeDesc, PrincipalDesc and AuthorizationUtils.
> We should avoid exposing these to all Authorization implementations, while 
> still making it possible for Apache Sentry (incubating) to customize the 
> mapping of internal classes to the public API classes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12698?focusedWorklogId=446804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446804
 ]

ASF GitHub Bot logged work on HIVE-12698:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 18:01
Start Date: 16/Jun/20 18:01
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #58:
URL: https://github.com/apache/hive/pull/58#issuecomment-644920721


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446804)
Remaining Estimate: 0h
Time Spent: 10m

> Remove exposure to internal privilege and principal classes in HiveAuthorizer
> -
>
> Key: HIVE-12698
> URL: https://issues.apache.org/jira/browse/HIVE-12698
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12698.1.patch, HIVE-12698.2.patch, 
> HIVE-12698.3.patch, HIVE-12698.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The changes in HIVE-11179 expose several internal classes to 
> HiveAuthorization implementations. These include PrivilegeObjectDesc, 
> PrivilegeDesc, PrincipalDesc and AuthorizationUtils.
> We should avoid exposing these to all Authorization implementations, while 
> still making it possible for Apache Sentry (incubating) to customize the 
> mapping of internal classes to the public API classes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23603) transformDatabase() should work with changes from HIVE-22995

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23603?focusedWorklogId=446800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446800
 ]

ASF GitHub Bot logged work on HIVE-23603:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 18:00
Start Date: 16/Jun/20 18:00
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1115:
URL: https://github.com/apache/hive/pull/1115#issuecomment-644919899


   recheck



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446800)
Time Spent: 20m  (was: 10m)

> transformDatabase() should work with changes from HIVE-22995
> 
>
> Key: HIVE-23603
> URL: https://issues.apache.org/jira/browse/HIVE-23603
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23603.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The translation layer alters the locationUri on Database based on the 
> capabilities of the client. Now that we have separate managed and external 
> locations for a database, the implementation should be adjusted to work with 
> both locations; locationUri could already be the external location.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-16 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23668:
---
Attachment: HIVE-23668.02.patch
Status: Patch Available  (was: In Progress)

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23668.01.patch, HIVE-23668.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-16 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23668:
---
Status: In Progress  (was: Patch Available)

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23668.01.patch, HIVE-23668.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23668:
--
Labels: pull-request-available  (was: )

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23668.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-16 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23668 started by Aasha Medhi.
--
> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23668.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-16 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23668:
---
Attachment: HIVE-23668.01.patch
Status: Patch Available  (was: In Progress)

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23668.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?focusedWorklogId=446786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446786
 ]

ASF GitHub Bot logged work on HIVE-23668:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 17:40
Start Date: 16/Jun/20 17:40
Worklog Time Spent: 10m 
  Work Description: aasha opened a new pull request #1129:
URL: https://github.com/apache/hive/pull/1129


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446786)
Remaining Estimate: 0h
Time Spent: 10m

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23668.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17258) Incorrect log messages in the Hive.java

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17258?focusedWorklogId=446779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446779
 ]

ASF GitHub Bot logged work on HIVE-17258:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 17:30
Start Date: 16/Jun/20 17:30
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #222:
URL: https://github.com/apache/hive/pull/222


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446779)
Time Spent: 20m  (was: 10m)

> Incorrect log messages in the Hive.java
> ---
>
> Key: HIVE-17258
> URL: https://issues.apache.org/jira/browse/HIVE-17258
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17258.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There are couple of typos when using LOG methods in the 
> org.apache.hadoop.hive.ql.metadata.Hive class.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13745?focusedWorklogId=446748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446748
 ]

ASF GitHub Bot logged work on HIVE-13745:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:58
Start Date: 16/Jun/20 16:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #76:
URL: https://github.com/apache/hive/pull/76#issuecomment-644888491


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446748)
Remaining Estimate: 0h
Time Spent: 10m

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Biao Wu
>Assignee: Biao Wu
>Priority: Major
> Attachments: HIVE-13745.1.patch, HIVE-13745.2-branch-2.patch, 
> HIVE-13745.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> NullPointerException when current_date is used in mapreduce



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14759) GenericUDF.getFuncName breaks with UDF Classnames less than 10 characters

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-14759:
--
Labels: pull-request-available  (was: )

> GenericUDF.getFuncName breaks with UDF Classnames less than 10 characters
> -
>
> Key: HIVE-14759
> URL: https://issues.apache.org/jira/browse/HIVE-14759
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.0
>Reporter: Clemens Valiente
>Assignee: Clemens Valiente
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-14759.1.patch, HIVE-14759.2.patch, HIVE-14759.patch
>
>   Original Estimate: 1h
>  Time Spent: 10m
>  Remaining Estimate: 50m
>
> {code}
> return getClass().getSimpleName().substring(10).toLowerCase();
> {code}
> causes
> {code}
> java.lang.StringIndexOutOfBoundsException: String index out of range: -2
> at java.lang.String.substring(String.java:1875)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.getFuncName(GenericUDF.java:258)
> {code}
> if the class name of my UDF is less than 10 characters.
> This was probably intended to strip the "GenericUDF" prefix from the class 
> name, but it causes issues if the class name doesn't start with that prefix.
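A minimal sketch of the failure and one possible fix; FuncNameDemo, unsafeFuncName, and safeFuncName are illustrative names, not the actual Hive patch:

```java
public class FuncNameDemo {
    // Mirrors the failing pattern: substring(10) throws when the simple
    // class name is shorter than 10 characters.
    static String unsafeFuncName(Class<?> clazz) {
        return clazz.getSimpleName().substring(10).toLowerCase();
    }

    // Hypothetical fix: only strip the "GenericUDF" prefix when present,
    // so short or unprefixed class names are handled safely.
    static String safeFuncName(Class<?> clazz) {
        String name = clazz.getSimpleName();
        if (name.startsWith("GenericUDF")) {
            name = name.substring("GenericUDF".length());
        }
        return name.toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(safeFuncName(FuncNameDemo.class)); // funcnamedemo
        try {
            unsafeFuncName(Integer.class); // "Integer" is only 7 chars
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring(10) failed as described");
        }
    }
}
```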



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-13745) UDF current_date、current_timestamp、unix_timestamp NPE

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-13745:
--
Labels: pull-request-available  (was: )

> UDF current_date、current_timestamp、unix_timestamp NPE
> -
>
> Key: HIVE-13745
> URL: https://issues.apache.org/jira/browse/HIVE-13745
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Biao Wu
>Assignee: Biao Wu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13745.1.patch, HIVE-13745.2-branch-2.patch, 
> HIVE-13745.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> NullPointerException when current_date is used in mapreduce



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-13545) Add GLOBAL Type to Entity

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-13545:
--
Labels: pull-request-available  (was: )

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13545.001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should be matched with 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14759) GenericUDF.getFuncName breaks with UDF Classnames less than 10 characters

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14759?focusedWorklogId=446747&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446747
 ]

ASF GitHub Bot logged work on HIVE-14759:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:58
Start Date: 16/Jun/20 16:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #101:
URL: https://github.com/apache/hive/pull/101#issuecomment-644888350


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446747)
Remaining Estimate: 50m  (was: 1h)
Time Spent: 10m

> GenericUDF.getFuncName breaks with UDF Classnames less than 10 characters
> -
>
> Key: HIVE-14759
> URL: https://issues.apache.org/jira/browse/HIVE-14759
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.0
>Reporter: Clemens Valiente
>Assignee: Clemens Valiente
>Priority: Trivial
> Attachments: HIVE-14759.1.patch, HIVE-14759.2.patch, HIVE-14759.patch
>
>   Original Estimate: 1h
>  Time Spent: 10m
>  Remaining Estimate: 50m
>
> {code}
> return getClass().getSimpleName().substring(10).toLowerCase();
> {code}
> causes
> {code}
> java.lang.StringIndexOutOfBoundsException: String index out of range: -2
> at java.lang.String.substring(String.java:1875)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.getFuncName(GenericUDF.java:258)
> {code}
> if the class name of my UDF is less than 10 characters.
> This was probably intended to strip the "GenericUDF" prefix from the class 
> name, but it causes issues if the class name doesn't start with that prefix.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13545) Add GLOBAL Type to Entity

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13545?focusedWorklogId=446746&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446746
 ]

ASF GitHub Bot logged work on HIVE-13545:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:58
Start Date: 16/Jun/20 16:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #73:
URL: https://github.com/apache/hive/pull/73#issuecomment-644888517


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446746)
Remaining Estimate: 0h
Time Spent: 10m

> Add GLOBAL Type to Entity
> -
>
> Key: HIVE-13545
> URL: https://issues.apache.org/jira/browse/HIVE-13545
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>Priority: Major
> Attachments: HIVE-13545.001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java}} doesn't have the 
> {{GLOBAL}} type; it should be matched with 
> {{org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject.HivePrivilegeObjectType}}.
> At the same time, we should enable custom conversion from Entity to 
> HivePrivilegeObject.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-9660?focusedWorklogId=446741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446741
 ]

ASF GitHub Bot logged work on HIVE-9660:


Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #77:
URL: https://github.com/apache/hive/pull/77#issuecomment-644888481


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446741)
Remaining Estimate: 0h
Time Spent: 10m

> store end offset of compressed data for RG in RowIndex in ORC
> -
>
> Key: HIVE-9660
> URL: https://issues.apache.org/jira/browse/HIVE-9660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, 
> HIVE-9660.03.patch, HIVE-9660.04.patch, HIVE-9660.05.patch, 
> HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.07.patch, 
> HIVE-9660.08.patch, HIVE-9660.09.patch, HIVE-9660.10.patch, 
> HIVE-9660.10.patch, HIVE-9660.11.patch, HIVE-9660.patch, HIVE-9660.patch, 
> HIVE-9660.patch, owen-hive-9660.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores number of 
> compressed buffers for each RG, or end offset, or something, to remove this 
> estimation magic



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15705) Event replication for constraints

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15705?focusedWorklogId=446734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446734
 ]

ASF GitHub Bot logged work on HIVE-15705:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #219:
URL: https://github.com/apache/hive/pull/219#issuecomment-644888073


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446734)
Remaining Estimate: 0h
Time Spent: 10m

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, 
> HIVE-15705.6.patch, HIVE-15705.7.patch, HIVE-15705.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13879) add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13879?focusedWorklogId=446742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446742
 ]

ASF GitHub Bot logged work on HIVE-13879:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #88:
URL: https://github.com/apache/hive/pull/88#issuecomment-644888425


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446742)
Remaining Estimate: 0h
Time Spent: 10m

> add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api
> --
>
> Key: HIVE-13879
> URL: https://issues.apache.org/jira/browse/HIVE-13879
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
> Attachments: HIVE-13879.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HiveAuthzContext provides useful information about the context of the 
> commands, such as the command string and ip address information. However, 
> this is available to only checkPrivileges and filterListCmdObjects api calls.
> This should be made available for other api calls such as grant/revoke 
> methods and role management methods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13170) HiveAccumuloTableOutputFormat should implement HiveOutputFormat to ensure compatibility

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13170?focusedWorklogId=446739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446739
 ]

ASF GitHub Bot logged work on HIVE-13170:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #66:
URL: https://github.com/apache/hive/pull/66#issuecomment-644888550


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446739)
Remaining Estimate: 0h
Time Spent: 10m

> HiveAccumuloTableOutputFormat should implement HiveOutputFormat to ensure 
> compatibility
> ---
>
> Key: HIVE-13170
> URL: https://issues.apache.org/jira/browse/HIVE-13170
> Project: Hive
>  Issue Type: Bug
>  Components: Accumulo Storage Handler
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Teng Qiu
>Assignee: Teng Qiu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue has the same root cause as the one described in 
> https://issues.apache.org/jira/browse/HIVE-11166:
> neither HiveAccumuloTableOutputFormat nor HiveHBaseTableOutputFormat 
> implements HiveOutputFormat, which may break compatibility with other 
> APIs that use Hive, such as Spark's.
> Spark expects the OutputFormat invoked by a Hive storage handler to be some 
> kind of HiveOutputFormat, which is entirely reasonable.
> Since these classes are OutputFormats for Hive storage handlers, they should 
> not only extend the third-party OutputFormat (AccumuloOutputFormat or 
> hbase.TableOutputFormat) but also implement the HiveOutputFormat interface.
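The type mismatch can be mimicked in a few lines of illustrative Python (the class names mirror the Java ones but are hypothetical stand-ins, not Hive's real classes): a handler output format that only extends the third-party base fails an instance check against HiveOutputFormat, while one that also implements the interface passes.

```python
class HiveOutputFormat:
    """Stand-in for Hive's HiveOutputFormat interface."""

class AccumuloOutputFormat:
    """Stand-in for the third-party Accumulo OutputFormat."""

# Current shape: extends only the third-party format.
class BrokenHiveAccumuloTableOutputFormat(AccumuloOutputFormat):
    pass

# Proposed shape: also implements the Hive interface.
class FixedHiveAccumuloTableOutputFormat(AccumuloOutputFormat, HiveOutputFormat):
    pass

# A caller like Spark that expects a HiveOutputFormat:
broken_ok = isinstance(BrokenHiveAccumuloTableOutputFormat(), HiveOutputFormat)
fixed_ok = isinstance(FixedHiveAccumuloTableOutputFormat(), HiveOutputFormat)
print(broken_ok, fixed_ok)  # False True
```

The same reasoning applies in Java: a class can extend the vendor OutputFormat and additionally declare `implements HiveOutputFormat`, so callers that type-check against the Hive interface keep working.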



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15690) Speed up WebHCat DDL Response Time by Using JDBC

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15690?focusedWorklogId=446736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446736
 ]

ASF GitHub Bot logged work on HIVE-15690:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #133:
URL: https://github.com/apache/hive/pull/133#issuecomment-644888263


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446736)
Remaining Estimate: 23h 50m  (was: 24h)
Time Spent: 10m

> Speed up WebHCat DDL Response Time by Using JDBC
> 
>
> Key: HIVE-15690
> URL: https://issues.apache.org/jira/browse/HIVE-15690
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Reporter: Amin Abbaspour
>Assignee: Amin Abbaspour
>Priority: Minor
>  Labels: easyfix, patch, performance, security
>   Original Estimate: 24h
>  Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> WebHCat launches a new hcat script for each DDL call, which makes it 
> unsuitable for interactive REST environments.
> This change speeds up /ddl query calls by running them over a JDBC connection 
> to the Hive Thrift server.
> Being a JDBC connection, it is also secure and compatible with all access 
> policies defined in HiveServer2: users do not get metadata visibility over 
> other databases (which is the case in hcat mode).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15848) count or sum distinct incorrect when hive.optimize.reducededuplication set to true

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15848?focusedWorklogId=446732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446732
 ]

ASF GitHub Bot logged work on HIVE-15848:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #150:
URL: https://github.com/apache/hive/pull/150#issuecomment-644888225


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446732)
Remaining Estimate: 0h
Time Spent: 10m

> count or sum distinct incorrect when hive.optimize.reducededuplication set to 
> true
> --
>
> Key: HIVE-15848
> URL: https://issues.apache.org/jira/browse/HIVE-15848
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Biao Wu
>Assignee: Zoltan Haindrich
>Priority: Critical
> Fix For: 2.3.0
>
> Attachments: HIVE-15848.1.patch, HIVE-15848.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Test Table:
> {code:sql}
> create table count_distinct_test(id int, key int, name int);
> {code}
> Data:
> ||id||key||name||
> |1|1|2|
> |1|2|3|
> |1|3|2|
> |1|4|2|
> |1|5|3|
> Test SQL1:
> {code:sql}
> select id, count(distinct key), count(distinct name)
> from (select id, key, name from count_distinct_test group by id, key, name) m
> group by id;
> {code}
> result:
> |1|5|4|
> expect:
> |1|5|2|
> Test SQL2:
> {code:sql}
> select id, count(distinct name), count(distinct key)
> from (select id, key, name from count_distinct_test group by id, name, key) m
> group by id;
> {code}
> result:
> |1|2|5|
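The expected results can be checked against the sample data with a few lines of plain Python (a sketch of the SQL semantics only, not of Hive's execution):

```python
# Sample rows (id, key, name) from the report.
rows = [(1, 1, 2), (1, 2, 3), (1, 3, 2), (1, 4, 2), (1, 5, 3)]

# Inner query: group by id, key, name -- deduplicates whole rows.
deduped = set(rows)

# Outer query: per id, count distinct key and distinct name.
ids = {r[0] for r in deduped}
result = {
    i: (len({k for (i2, k, n) in deduped if i2 == i}),
        len({n for (i2, k, n) in deduped if i2 == i}))
    for i in ids
}
print(result)  # {1: (5, 2)} -- 5 distinct keys, 2 distinct names
```

So the correct answer for SQL1 is (5, 2) and for SQL2 is (2, 5); the results (5, 4) and (2, 5) reported with reduce-deduplication enabled show the second distinct aggregate being computed over non-deduplicated values.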



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-13170) HiveAccumuloTableOutputFormat should implement HiveOutputFormat to ensure compatibility

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-13170:
--
Labels: pull-request-available  (was: )

> HiveAccumuloTableOutputFormat should implement HiveOutputFormat to ensure 
> compatibility
> ---
>
> Key: HIVE-13170
> URL: https://issues.apache.org/jira/browse/HIVE-13170
> Project: Hive
>  Issue Type: Bug
>  Components: Accumulo Storage Handler
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Teng Qiu
>Assignee: Teng Qiu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue has the same root cause as the one described in 
> https://issues.apache.org/jira/browse/HIVE-11166:
> neither HiveAccumuloTableOutputFormat nor HiveHBaseTableOutputFormat 
> implements HiveOutputFormat, which may break compatibility with other 
> APIs that use Hive, such as Spark's.
> Spark expects the OutputFormat invoked by a Hive storage handler to be some 
> kind of HiveOutputFormat, which is entirely reasonable.
> Since these classes are OutputFormats for Hive storage handlers, they should 
> not only extend the third-party OutputFormat (AccumuloOutputFormat or 
> hbase.TableOutputFormat) but also implement the HiveOutputFormat interface.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14585) Add travis.yml and update README to show build status

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14585?focusedWorklogId=446737=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446737
 ]

ASF GitHub Bot logged work on HIVE-14585:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #97:
URL: https://github.com/apache/hive/pull/97#issuecomment-644888362


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446737)
Remaining Estimate: 0h
Time Spent: 10m

> Add travis.yml and update README to show build status
> -
>
> Key: HIVE-14585
> URL: https://issues.apache.org/jira/browse/HIVE-14585
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: HIVE-14585.1.patch, HIVE-14585.2.patch, 
> HIVE-14585.3.patch, HIVE-14585.4.patch, HIVE-14585.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Travis CI is free to use for all open source projects. To start off with, we 
> can just run the builds and show the status on the GitHub page. In the 
> future, we can leverage the tests and explore parallel testing.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-14483:
--
Labels: pull-request-available  (was: )

>  java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
> Key: HIVE-14483
> URL: https://issues.apache.org/jira/browse/HIVE-14483
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Sergey Zadoroshnyak
>Assignee: Sergey Zadoroshnyak
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.3.0, 2.0.2, 2.1.1, 2.2.0
>
> Attachments: HIVE-14483.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> at 
> org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 22 more
> How to reproduce?
> Configure StringTreeReader (which contains StringDirectTreeReader as its 
> TreeReader, i.e. DIRECT or DIRECT_V2 column encoding) with batchSize = 1026 
> and invoke nextVector(ColumnVector previousVector, boolean[] isNull, final 
> int batchSize).
> scratchlcv is a LongColumnVector whose long[] vector has length 1024; it is 
> passed to BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, 
> scratchlcv, result, batchSize), and as a result 
> commonReadByteArrays(stream, lengths, scratchlcv, result, (int) batchSize) 
> throws ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader instead, there is no exception, because 
> there is a check scratchlcv.ensureSize((int) batchSize, false) before 
> reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by commit 
> https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467
>  for task https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley.
> How to fix?
> Add a single line:
> scratchlcv.ensureSize((int) batchSize, false);
> in method 
> org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream
>  stream, IntegerReader lengths, LongColumnVector scratchlcv, 
> BytesColumnVector result, final int batchSize) before the invocation of 
> lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
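The fix follows a general pattern: grow a scratch buffer to the requested batch size before filling it. A minimal Python analogue of the failure and of the one-line fix (illustrative only; the real fix is the Java `ensureSize` call named in the report, and the sizes 1024/1026 are the ones from the reproduction):

```python
def ensure_size(vec, n):
    """Grow vec in place so it can hold at least n elements (analogue of ensureSize)."""
    if len(vec) < n:
        vec.extend([0] * (n - len(vec)))

def fill(vec, batch_size):
    """Write batch_size values into vec, like nextVector filling scratchlcv."""
    for i in range(batch_size):
        vec[i] = i

scratch = [0] * 1024  # default scratch-vector size
batch_size = 1026     # larger than the scratch vector, as in the report

try:
    fill(scratch, batch_size)      # analogue of the ArrayIndexOutOfBoundsException
    crashed = False
except IndexError:
    crashed = True

ensure_size(scratch, batch_size)   # analogue of the one-line fix
fill(scratch, batch_size)          # now succeeds
print(crashed, len(scratch))  # True 1026
```

This is the same guard StringDictionaryTreeReader already performs, which is why only the DIRECT/DIRECT_V2 path crashes.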



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-17258) Incorrect log messages in the Hive.java

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17258:
--
Labels: pull-request-available  (was: )

> Incorrect log messages in the Hive.java
> ---
>
> Key: HIVE-17258
> URL: https://issues.apache.org/jira/browse/HIVE-17258
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17258.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are a couple of typos in the use of LOG methods in the 
> org.apache.hadoop.hive.ql.metadata.Hive class.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-13879) add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-13879:
--
Labels: pull-request-available  (was: )

> add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api
> --
>
> Key: HIVE-13879
> URL: https://issues.apache.org/jira/browse/HIVE-13879
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13879.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HiveAuthzContext provides useful information about the context of the 
> commands, such as the command string and ip address information. However, 
> this is available to only checkPrivileges and filterListCmdObjects api calls.
> This should be made available for other api calls such as grant/revoke 
> methods and role management methods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15690) Speed up WebHCat DDL Response Time by Using JDBC

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15690:
--
Labels: easyfix patch performance pull-request-available security  (was: 
easyfix patch performance security)

> Speed up WebHCat DDL Response Time by Using JDBC
> 
>
> Key: HIVE-15690
> URL: https://issues.apache.org/jira/browse/HIVE-15690
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Reporter: Amin Abbaspour
>Assignee: Amin Abbaspour
>Priority: Minor
>  Labels: easyfix, patch, performance, pull-request-available, 
> security
>   Original Estimate: 24h
>  Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> WebHCat launches a new hcat script for each DDL call, which makes it 
> unsuitable for interactive REST environments.
> This change speeds up /ddl query calls by running them over a JDBC connection 
> to the Hive Thrift server.
> Being a JDBC connection, it is also secure and compatible with all access 
> policies defined in HiveServer2: users do not get metadata visibility over 
> other databases (which is the case in hcat mode).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17258) Incorrect log messages in the Hive.java

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17258?focusedWorklogId=446733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446733
 ]

ASF GitHub Bot logged work on HIVE-17258:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #222:
URL: https://github.com/apache/hive/pull/222#issuecomment-644888054


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446733)
Remaining Estimate: 0h
Time Spent: 10m

> Incorrect log messages in the Hive.java
> ---
>
> Key: HIVE-17258
> URL: https://issues.apache.org/jira/browse/HIVE-17258
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Trivial
> Fix For: 3.0.0
>
> Attachments: HIVE-17258.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are a couple of typos in the use of LOG methods in the 
> org.apache.hadoop.hive.ql.metadata.Hive class.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15705) Event replication for constraints

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15705:
--
Labels: pull-request-available  (was: )

> Event replication for constraints
> -
>
> Key: HIVE-15705
> URL: https://issues.apache.org/jira/browse/HIVE-15705
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-15705.1.patch, HIVE-15705.2.patch, 
> HIVE-15705.3.patch, HIVE-15705.4.patch, HIVE-15705.5.patch, 
> HIVE-15705.6.patch, HIVE-15705.7.patch, HIVE-15705.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Make event replication for primary key and foreign key work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14483?focusedWorklogId=446744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446744
 ]

ASF GitHub Bot logged work on HIVE-14483:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #96:
URL: https://github.com/apache/hive/pull/96#issuecomment-644888376


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446744)
Remaining Estimate: 0h
Time Spent: 10m

>  java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
> Key: HIVE-14483
> URL: https://issues.apache.org/jira/browse/HIVE-14483
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Sergey Zadoroshnyak
>Assignee: Sergey Zadoroshnyak
>Priority: Critical
> Fix For: 1.3.0, 2.0.2, 2.1.1, 2.2.0
>
> Attachments: HIVE-14483.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at 
> org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> at 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> at 
> org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> at 
> org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 22 more
> How to reproduce?
> Configure StringTreeReader (which contains StringDirectTreeReader as its 
> TreeReader, i.e. DIRECT or DIRECT_V2 column encoding) with batchSize = 1026 
> and invoke nextVector(ColumnVector previousVector, boolean[] isNull, final 
> int batchSize).
> scratchlcv is a LongColumnVector whose long[] vector has length 1024; it is 
> passed to BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, 
> scratchlcv, result, batchSize), and as a result 
> commonReadByteArrays(stream, lengths, scratchlcv, result, (int) batchSize) 
> throws ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader instead, there is no exception, because 
> there is a check scratchlcv.ensureSize((int) batchSize, false) before 
> reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by commit 
> https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467
>  for task https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley.
> How to fix?
> Add a single line:
> scratchlcv.ensureSize((int) batchSize, false);
> in method 
> org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream
>  stream, IntegerReader lengths, LongColumnVector scratchlcv, 
> BytesColumnVector result, final int batchSize) before the invocation of 
> lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-13539:
--
Labels: pull-request-available  (was: )

> HiveHFileOutputFormat searching the wrong directory for HFiles
> --
>
> Key: HIVE-13539
> URL: https://issues.apache.org/jira/browse/HIVE-13539
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.1.0
> Environment: Built into CDH 5.4.7
>Reporter: Tim Robertson
>Assignee: Chaoyu Tang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-13539.1.patch, HIVE-13539.patch, 
> hive_hfile_output_format.q, hive_hfile_output_format.q.out
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When creating HFiles for a bulkload in HBase I believe it is looking in the 
> wrong directory to find the HFiles, resulting in the following exception:
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287)
>   ... 7 more
> Caused by: java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185)
>   ... 11 more
> {code}
> The issue is that it looks for the HFiles in 
> {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}}
>  when I believe it should be looking in the task-attempt subfolder, such as 
> {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}.
> This can be reproduced in any HFile creation such as:
> {code:sql}
> CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
>   'hbase.columns.mapping' = ':key,o:x,o:y',
>   'hbase.table.default.storage.type' = 'binary');
> SET hfile.family.path=/tmp/coords_hfiles/o; 
> SET hive.hbase.generatehfiles=true;
> INSERT OVERWRITE TABLE coords_hbase 
> SELECT id, decimalLongitude, decimalLatitude
> FROM source
> CLUSTER BY id; 
> {code}
> Any advice greatly appreciated
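The directory-level mistake can be sketched with a filesystem layout mirroring the stack trace (illustrative Python; the attempt-folder names are made up): listing at the `_temporary` level sees one entry per task attempt and wrongly concludes there are multiple family directories, while listing inside a single attempt's folder finds exactly the one `o` family directory.

```python
import os
import tempfile

root = tempfile.mkdtemp()
# Hypothetical layout mirroring the report:
#   .../coords_hbase/_temporary/2/_temporary/<attempt>/o
tmp = os.path.join(root, "_temporary", "2", "_temporary")
attempts = ["attempt_0001_r_000000_0", "attempt_0001_r_000001_0"]
for a in attempts:
    os.makedirs(os.path.join(tmp, a, "o"))  # one "o" family dir per attempt

# Wrong level: each attempt subfolder looks like a "family directory",
# triggering the "Multiple family directories found" error.
wrong = os.listdir(tmp)

# Right level: scope the search to one task attempt's own folder.
right = os.listdir(os.path.join(tmp, attempts[0]))

print(len(wrong), len(right))  # 2 1
```

Scoping the search to the current task attempt's output directory makes the family-directory count unambiguous regardless of how many attempts ran.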



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-11741) Add a new hook to run before query parse/compile

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-11741:
--
Labels: patch pull-request-available  (was: patch)

> Add a new hook to run before query parse/compile
> 
>
> Key: HIVE-11741
> URL: https://issues.apache.org/jira/browse/HIVE-11741
> Project: Hive
>  Issue Type: New Feature
>  Components: Parser, SQL
>Affects Versions: 1.2.1
>Reporter: Guilherme Braccialli
>Assignee: Guilherme Braccialli
>Priority: Minor
>  Labels: patch, pull-request-available
> Attachments: HIVE-11741.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would be nice to allow developers to extend the Hive query language, making 
> it possible to use custom wildcards in queries. 
> People use Python or R to iterate over vectors or lists to build SQL 
> commands; this could be supported directly in the SQL syntax.
> For example, this Python script:
> {code:python}
> >>> sql = "SELECT state, "
> >>> for i in range(10):
> ...   sql += "   sum(case when type = " + str(i) + " then value end) as sum_of_" + str(i) + " ,"
> ...
> >>> sql += " count(1) as total FROM table"
> >>> print(sql)
> {code}
> could be written directly in extended SQL like this:
> {code:sql}
> SELECT state,
> %for id = 1 to 10%
>    sum(case when type = %id% then value end) as sum_of_%id%,
> %end%
> , count(1) as total
> FROM table
> GROUP BY state
> {code}
> This kind of extensibility could easily be added via a new hook after the 
> VariableSubstitution call in the Driver.compile method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13539?focusedWorklogId=446738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446738
 ]

ASF GitHub Bot logged work on HIVE-13539:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #74:
URL: https://github.com/apache/hive/pull/74#issuecomment-644888500


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446738)
Remaining Estimate: 0h
Time Spent: 10m

> HiveHFileOutputFormat searching the wrong directory for HFiles
> --
>
> Key: HIVE-13539
> URL: https://issues.apache.org/jira/browse/HIVE-13539
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 1.1.0
> Environment: Built into CDH 5.4.7
>Reporter: Tim Robertson
>Assignee: Chaoyu Tang
>Priority: Blocker
> Fix For: 2.1.1, 2.2.0
>
> Attachments: HIVE-13539.1.patch, HIVE-13539.patch, 
> hive_hfile_output_format.q, hive_hfile_output_format.q.out
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When creating HFiles for a bulkload in HBase I believe it is looking in the 
> wrong directory to find the HFiles, resulting in the following exception:
> {code}
> Error: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287)
>   ... 7 more
> Caused by: java.io.IOException: Multiple family directories found in 
> hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
>   at 
> org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185)
>   ... 11 more
> {code}
> The issue is that it looks for the HFiles in 
> {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}}
>  when I believe it should be looking in the task attempt subfolder, such as 
> {{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}.
> This can be reproduced in any HFile creation such as:
> {code:sql}
> CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
>   'hbase.columns.mapping' = ':key,o:x,o:y',
>   'hbase.table.default.storage.type' = 'binary');
> SET hfile.family.path=/tmp/coords_hfiles/o; 
> SET hive.hbase.generatehfiles=true;
> INSERT OVERWRITE TABLE coords_hbase 
> SELECT id, decimalLongitude, decimalLatitude
> FROM source
> CLUSTER BY id; 
> {code}
> Any advice greatly appreciated

[jira] [Work logged] (HIVE-13877) Hplsql UDF doesn't work in Hive Cli

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13877?focusedWorklogId=446740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446740
 ]

ASF GitHub Bot logged work on HIVE-13877:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #78:
URL: https://github.com/apache/hive/pull/78#issuecomment-644888466


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446740)
Remaining Estimate: 0h
Time Spent: 10m

> Hplsql UDF doesn't work in Hive Cli
> ---
>
> Key: HIVE-13877
> URL: https://issues.apache.org/jira/browse/HIVE-13877
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 2.2.0
>Reporter: jiangxintong
>Assignee: Dmitry Tolpeko
>Priority: Major
> Attachments: HIVE-13877.2.patch, HIVE-13877.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Hive CLI throws an "Error evaluating hplsql" exception when I use the 
> hplsql UDF like "SELECT hplsql('hello[:1]', columnName) FROM tableName".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15848) count or sum distinct incorrect when hive.optimize.reducededuplication set to true

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15848:
--
Labels: pull-request-available  (was: )

> count or sum distinct incorrect when hive.optimize.reducededuplication set to 
> true
> --
>
> Key: HIVE-15848
> URL: https://issues.apache.org/jira/browse/HIVE-15848
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0
>Reporter: Biao Wu
>Assignee: Zoltan Haindrich
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.3.0
>
> Attachments: HIVE-15848.1.patch, HIVE-15848.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Test Table:
> {code:sql}
> create table test(id int,key int,name int);
> {code}
> Data:
> ||id||key||name||
> |1|1|2|
> |1|2|3|
> |1|3|2|
> |1|4|2|
> |1|5|3|
> Test SQL1:
> {code:sql}
> select id,count(Distinct key),count(Distinct name)
> from (select id,key,name from count_distinct_test group by id,key,name)m
> group by id;
> {code}
> result:
> |1|5|4|
> expect:
> |1|5|2|
> Test SQL2:
> {code:sql}
> select id,count(Distinct name),count(Distinct key)
> from (select id,key,name from count_distinct_test group by id,name,key)m
> group by id;
> {code}
> result:
> |1|2|5|
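Using the sample data above, the expected distinct counts can be verified directly (a Python sketch of the intended semantics, independent of Hive):

```python
# (id, key, name) rows from the data table in the description.
rows = [(1, 1, 2), (1, 2, 3), (1, 3, 2), (1, 4, 2), (1, 5, 3)]

# count(Distinct key) and count(Distinct name) for id = 1.
distinct_keys = len({key for _, key, _ in rows})
distinct_names = len({name for _, _, name in rows})
print(distinct_keys, distinct_names)  # 5 distinct keys, 2 distinct names
```

This confirms the expected result row `|1|5|2|`; the reported `|1|5|4|` and `|1|2|5|` results show the deduplication optimization producing wrong counts.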



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14585) Add travis.yml and update README to show build status

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-14585:
--
Labels: pull-request-available  (was: )

> Add travis.yml and update README to show build status
> -
>
> Key: HIVE-14585
> URL: https://issues.apache.org/jira/browse/HIVE-14585
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.3.0
>
> Attachments: HIVE-14585.1.patch, HIVE-14585.2.patch, 
> HIVE-14585.3.patch, HIVE-14585.4.patch, HIVE-14585.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Travis CI is free to use for all open source projects. To start off with, we 
> can just run the builds and show the status on the GitHub page. In the future, 
> we can leverage the tests and explore parallel testing.
> NO PRECOMMIT TESTS



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-9660:
-
Labels: pull-request-available  (was: )

> store end offset of compressed data for RG in RowIndex in ORC
> -
>
> Key: HIVE-9660
> URL: https://issues.apache.org/jira/browse/HIVE-9660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-9660.01.patch, HIVE-9660.02.patch, 
> HIVE-9660.03.patch, HIVE-9660.04.patch, HIVE-9660.05.patch, 
> HIVE-9660.06.patch, HIVE-9660.07.patch, HIVE-9660.07.patch, 
> HIVE-9660.08.patch, HIVE-9660.09.patch, HIVE-9660.10.patch, 
> HIVE-9660.10.patch, HIVE-9660.11.patch, HIVE-9660.patch, HIVE-9660.patch, 
> HIVE-9660.patch, owen-hive-9660.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores the number 
> of compressed buffers for each RG, or the end offset, or something, to remove 
> this estimation magic.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-11741) Add a new hook to run before query parse/compile

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-11741?focusedWorklogId=446743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446743
 ]

ASF GitHub Bot logged work on HIVE-11741:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:57
Start Date: 16/Jun/20 16:57
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #63:
URL: https://github.com/apache/hive/pull/63#issuecomment-644888564


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446743)
Remaining Estimate: 0h
Time Spent: 10m

> Add a new hook to run before query parse/compile
> 
>
> Key: HIVE-11741
> URL: https://issues.apache.org/jira/browse/HIVE-11741
> Project: Hive
>  Issue Type: New Feature
>  Components: Parser, SQL
>Affects Versions: 1.2.1
>Reporter: Guilherme Braccialli
>Assignee: Guilherme Braccialli
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-11741.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would be nice to allow developers to extend the Hive query language, making 
> it possible to use custom wildcards in queries. 
> People use Python or R to iterate over vectors or lists to build SQL 
> commands; this could be supported directly in the SQL syntax.
> For example, this Python script:
> {code:python}
> >>> sql = "SELECT state, "
> >>> for i in range(10):
> ...   sql += "   sum(case when type = " + str(i) + " then value end) as sum_of_" + str(i) + " ,"
> ...
> >>> sql += " count(1) as total FROM table"
> >>> print(sql)
> {code}
> could be written directly in extended SQL like this:
> {code:sql}
> SELECT state,
> %for id = 1 to 10%
>    sum(case when type = %id% then value end) as sum_of_%id%,
> %end%
> , count(1) as total
> FROM table
> GROUP BY state
> {code}
> This kind of extensibility could easily be added via a new hook after the 
> VariableSubstitution call in the Driver.compile method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-13877) Hplsql UDF doesn't work in Hive Cli

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-13877:
--
Labels: pull-request-available  (was: )

> Hplsql UDF doesn't work in Hive Cli
> ---
>
> Key: HIVE-13877
> URL: https://issues.apache.org/jira/browse/HIVE-13877
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 2.2.0
>Reporter: jiangxintong
>Assignee: Dmitry Tolpeko
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-13877.2.patch, HIVE-13877.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Hive CLI throws an "Error evaluating hplsql" exception when I use the 
> hplsql UDF like "SELECT hplsql('hello[:1]', columnName) FROM tableName".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17260) Typo: exception has been created and lost in the ThriftJDBCBinarySerDe

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17260?focusedWorklogId=446729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446729
 ]

ASF GitHub Bot logged work on HIVE-17260:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #224:
URL: https://github.com/apache/hive/pull/224#issuecomment-644888035


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446729)
Remaining Estimate: 0h
Time Spent: 10m

> Typo: exception has been created and lost in the ThriftJDBCBinarySerDe
> --
>
> Key: HIVE-17260
> URL: https://issues.apache.org/jira/browse/HIVE-17260
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17260.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Line 100:
> {code:java}
> } catch (Exception e) {
>   new SerDeException(e);
> }
> {code}
> Seems like it should be thrown there :-)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15442) Driver.java has a redundancy code

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15442:
--
Labels: pull-request-available  (was: )

> Driver.java has a redundancy  code
> --
>
> Key: HIVE-15442
> URL: https://issues.apache.org/jira/browse/HIVE-15442
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-15442.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Driver.java has redundant code around "explain output"; I think the if 
> statement "if (conf.getBoolVar(ConfVars.HIVE_LOG_EXPLAIN_OUTPUT))" repeats 
> the check made by the statement above it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20911) External Table Replication for Hive

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20911?focusedWorklogId=446727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446727
 ]

ASF GitHub Bot logged work on HIVE-20911:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #506:
URL: https://github.com/apache/hive/pull/506


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446727)
Time Spent: 20m  (was: 10m)

> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, 
> HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, 
> HIVE-20911.06.patch, HIVE-20911.07.patch, HIVE-20911.07.patch, 
> HIVE-20911.08.patch, HIVE-20911.08.patch, HIVE-20911.09.patch, 
> HIVE-20911.10.patch, HIVE-20911.11.patch, HIVE-20911.12.patch, 
> HIVE-20911.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> External tables are currently not replicated as part of Hive replication. As 
> part of this jira we want to enable that.
> Approach:
> * Target cluster will have a top level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. This base path will be prefixed 
> to the path of the same external table on source cluster. This can be 
> provided using the following configuration:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since changes to external table directories can happen without Hive 
> knowing, we cannot capture the relevant events whenever new data is 
> added or removed; we will have to copy the data from the source path to 
> the target path for external tables every time we run incremental replication.
> ** this will require incremental *repl dump*  to now create an additional 
> file *\_external\_tables\_info* with data in the following form 
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
> If different partitions in the table point to different locations, there 
> will be multiple entries in the file for the same table name, each pointing 
> to a different partition location. Partitions created in a table without the 
> _set location_ command live within the table's data location, so they do not 
> get separate entries in the file above.
> ** *repl load* will read the  *\_external\_tables\_info* to identify what 
> locations are to be copied from source to target and create corresponding 
> tasks for them.
> * New external tables will be created metadata-only, with no data copied, as 
> part of the regular tasks during incremental or bootstrap load.
> * Bootstrap dump will also create  *\_external\_tables\_info* which will be 
> used to copy data from source to target  as part of boostrap load.
> * Bootstrap load will create a DAG that can use parallelism in the execution 
> phase; the HDFS copy tasks are created once the bootstrap phase is 
> complete.
> * Since incremental load results in a DAG with only sequential execution 
> (events applied in sequence), to effectively use the parallelism of the 
> execution mode we create the HDFS copy tasks along with the incremental DAG. 
> This requires a few basic calculations to approximately meet the configured 
> value of "hive.repl.approx.max.load.tasks".
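The `_external_tables_info` entry format described above can be sketched in a few lines (Python; the helper names and sample paths are illustrative, not Hive's implementation):

```python
import base64

def encode_entry(table_name, data_location):
    # One line per (table, location) pair, in the documented form:
    # tableName,base64Encoded(tableDataLocation)
    encoded = base64.b64encode(data_location.encode("utf-8")).decode("ascii")
    return "%s,%s" % (table_name, encoded)

def decode_entry(line):
    # Split on the first comma only; the base64 alphabet never
    # contains a comma, so the table name and payload separate cleanly.
    table_name, encoded = line.split(",", 1)
    return table_name, base64.b64decode(encoded).decode("utf-8")

line = encode_entry("web_logs", "hdfs://src-nn:8020/warehouse/web_logs")
print(line)
print(decode_entry(line))
```

Base64-encoding the location keeps the file one-entry-per-line even when paths contain commas or other awkward characters, which is presumably why the format uses it.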



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15746) Fix default delimiter2 in str_to_map UDF or in method description

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15746:
--
Labels: pull-request-available  (was: )

> Fix default delimiter2 in str_to_map UDF or in method description
> -
>
> Key: HIVE-15746
> URL: https://issues.apache.org/jira/browse/HIVE-15746
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 2.3.0
>
> Attachments: HIVE-15746.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> According to the UDF wiki and the GenericUDFStringToMap.java class comments, 
> the default delimiter 2 should be '='.
> But in the code, default_del2 = ":"
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStringToMap.java#L53
> We need to fix either the code or the method description and UDF wiki.
> Let me know what you think.
> {code}
> str_to_map("a=1,b=2")
> vs
> str_to_map("a:1,b:2")
> {code}
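The discrepancy is easy to see in a sketch of the function's behaviour (Python, not the Java UDF itself): with the code's actual default of ':' for delimiter 2, the second literal in the description parses out of the box, while the wiki's documented '=' default would require passing the delimiter explicitly.

```python
def str_to_map(text, delim1=",", delim2=":"):
    # Mirrors the code's actual default_del2 = ":" from
    # GenericUDFStringToMap; the wiki and class comments claim '='.
    return dict(pair.split(delim2, 1) for pair in text.split(delim1))

print(str_to_map("a:1,b:2"))              # parses with the code's ':' default
print(str_to_map("a=1,b=2", delim2="="))  # '=' must be passed explicitly
```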



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21126) Allow session level queries in LlapBaseInputFormat#getSplits() before actual get_splits() call

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21126?focusedWorklogId=446721&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446721
 ]

ASF GitHub Bot logged work on HIVE-21126:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #515:
URL: https://github.com/apache/hive/pull/515


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446721)
Time Spent: 20m  (was: 10m)

> Allow session level queries in LlapBaseInputFormat#getSplits() before actual 
> get_splits() call
> --
>
> Key: HIVE-21126
> URL: https://issues.apache.org/jira/browse/HIVE-21126
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Affects Versions: 3.1.1
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21126.1.patch, HIVE-21126.2.patch, 
> HIVE-21126.3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Facilitate execution of session-level queries before the actual {{select 
> get_splits()}} call. This will allow us to set params like 
> {{tez.grouping.split-count}}, which can be taken into account during split 
> calculation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-16645) Commands.java has missed the catch statement and has some code format errors

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-16645:
--
Labels: pull-request-available  (was: )

> Commands.java has missed the catch statement and has some code format errors
> 
>
> Key: HIVE-16645
> URL: https://issues.apache.org/jira/browse/HIVE-16645
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-16645.1.patch, HIVE-16645.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In Commands.java, the catch statement is missing and the ResultSet 
> is not closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-16925) isSlowStart lost during refactoring

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-16925:
--
Labels: pull-request-available  (was: )

> isSlowStart lost during refactoring
> ---
>
> Key: HIVE-16925
> URL: https://issues.apache.org/jira/browse/HIVE-16925
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-16925.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TezEdgeProperty.setAutoReduce() should have isSlowStart as a parameter



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15900?focusedWorklogId=446720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446720
 ]

ASF GitHub Bot logged work on HIVE-15900:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #148:
URL: https://github.com/apache/hive/pull/148#issuecomment-644888238


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446720)
Remaining Estimate: 0h
Time Spent: 10m

> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.2.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Thejas Nair
>Priority: Major
> Fix For: 2.2.0
>
> Attachments: HIVE-15900.1.patch, HIVE-15900.2.patch, 
> HIVE-15900.3.patch, HIVE-15900.3.patch, std_out
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Tez job progress messages are written to stdout instead of stderr.
> Attaching the output file for the command below, with the Tez job status printed:
> $HIVE_HOME/bin/beeline -n  -p  -u " --outputformat=tsv -e "analyze table studenttab10k compute statistics;" > 
> stdout
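The fix amounts to routing progress chatter to stderr so that redirecting stdout (as with `> stdout` above) captures only query results; a minimal sketch of the principle (Python, not Beeline's actual code; the status-line format is made up):

```python
import sys

def format_progress(pct_map, pct_reducer):
    # Build a status line; the caller sends it to stderr so that stdout
    # stays clean for query results and stays safe to redirect to a file.
    return "Map 1: %d%%  Reducer 2: %d%%" % (pct_map, pct_reducer)

print(format_progress(50, 0), file=sys.stderr)   # progress -> stderr
print("studenttab10k\t10000")                    # results  -> stdout
```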



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-20546) Upgrade to Apache Druid 0.13.0-incubating

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20546?focusedWorklogId=446731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446731
 ]

ASF GitHub Bot logged work on HIVE-20546:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #516:
URL: https://github.com/apache/hive/pull/516


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446731)
Time Spent: 20m  (was: 10m)

> Upgrade to Apache Druid 0.13.0-incubating
> -
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20546.1.patch, HIVE-20546.2.patch, 
> HIVE-20546.3.patch, HIVE-20546.4.patch, HIVE-20546.5.patch, 
> HIVE-20546.6.patch, HIVE-20546.7.patch, HIVE-20546.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it 
> will hopefully be the first Apache release for Druid.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-17038) invalid result when CAST-ing to DATE

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17038:
--
Labels: pull-request-available  (was: )

> invalid result when CAST-ing to DATE
> 
>
> Key: HIVE-17038
> URL: https://issues.apache.org/jira/browse/HIVE-17038
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Hive
>Affects Versions: 1.2.1
>Reporter: Jim Hopper
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When casting invalid date literals to the DATE data type, Hive returns wrong 
> values instead of NULL.
> {code}
> SELECT CAST('2017-02-31' AS DATE);
> SELECT CAST('2017-04-31' AS DATE);
> {code}
> Some examples below where it really can produce weird results:
> {code}
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = '2017-06-31';
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = cast('2017-06-31' as date);
> {code}
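The expected NULL-on-invalid behaviour can be sketched outside Hive (Python; `safe_cast_date` is an illustrative helper, not a Hive function). The key point is strict validation: an impossible calendar date should map to NULL rather than rolling over to a nearby valid date, which is what makes the predicates above match surprisingly.

```python
from datetime import date

def safe_cast_date(text):
    # A strict cast: impossible calendar dates such as 2017-02-31 or
    # 2017-06-31 yield None (SQL NULL) instead of rolling over to a
    # valid date like 2017-03-03 or 2017-07-01.
    try:
        year, month, day = (int(part) for part in text.split("-"))
        return date(year, month, day)
    except ValueError:
        return None

print(safe_cast_date("2017-02-31"))  # None
print(safe_cast_date("2017-07-01"))  # 2017-07-01
```

Under these semantics, `t.dt = cast('2017-06-31' as date)` compares against NULL and matches no rows, instead of silently matching 2017-07-01.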



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15423) Allowing Hive to reverse map IP from hostname for partition info

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15423?focusedWorklogId=446722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446722
 ]

ASF GitHub Bot logged work on HIVE-15423:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #122:
URL: https://github.com/apache/hive/pull/122#issuecomment-644888309


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 446722)
Remaining Estimate: 0h
Time Spent: 10m

> Allowing Hive to reverse map IP from hostname for partition info
> 
>
> Key: HIVE-15423
> URL: https://issues.apache.org/jira/browse/HIVE-15423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Suresh Bahuguna
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive - Namenode hostname mismatch when running queries with 2 MR jobs.
> Hive tries to find Partition info using hdfs://<hostname>:<port>, 
> whereas the info has been hashed using hdfs://<IP>:<port>.
> Exception raised in HiveFileFormatUtils.java:
> -
> java.io.IOException: cannot find dir = 
> hdfs://hd-nn-24:9000/tmp/hive-admin/hive_2013-08-30_06-11-52_007_1545561832334194535/-mr-10002/00_0
>  in pathToPartitionInfo: 
> [hdfs://192.168.156.24:9000/tmp/hive-admin/hive_2013-08-30_06-11-52_007_1545561832334194535/-mr-10002]
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java
> -
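One possible direction, sketched here with invented names and not taken from an actual patch: normalize the authority of the path key before the `pathToPartitionInfo` lookup, so that the hostname and IP forms of the same path produce the same key.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

public class PathKeySketch {
    // Rewrites the authority of a path URI to its resolved IP so that
    // "hdfs://hd-nn-24:9000/p" and "hdfs://192.168.156.24:9000/p"
    // produce the same lookup key. Illustrative only.
    static String normalizedKey(String uriString) throws UnknownHostException {
        URI uri = URI.create(uriString);
        String ip = InetAddress.getByName(uri.getHost()).getHostAddress();
        return uri.getScheme() + "://" + ip + ":" + uri.getPort() + uri.getPath();
    }
}
```

Applying this normalization to both the stored keys and the lookup key would make the exception above impossible, at the cost of a reverse DNS lookup per distinct authority.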



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-17038) invalid result when CAST-ing to DATE

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17038?focusedWorklogId=446726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446726
 ]

ASF GitHub Bot logged work on HIVE-17038:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #204:
URL: https://github.com/apache/hive/pull/204#issuecomment-644888107


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446726)
Remaining Estimate: 0h
Time Spent: 10m

> invalid result when CAST-ing to DATE
> 
>
> Key: HIVE-17038
> URL: https://issues.apache.org/jira/browse/HIVE-17038
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Hive
>Affects Versions: 1.2.1
>Reporter: Jim Hopper
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When casting invalid date literals to the DATE data type, Hive returns wrong 
> values instead of NULL.
> {code}
> SELECT CAST('2017-02-31' AS DATE);
> SELECT CAST('2017-04-31' AS DATE);
> {code}
> Some examples below show where it can produce weird results:
> {code}
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = '2017-06-31';
> select *
>   from (
> select cast('2017-07-01' as date) as dt
> ) as t
> where t.dt = cast('2017-06-31' as date);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16497?focusedWorklogId=446719&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446719
 ]

ASF GitHub Bot logged work on HIVE-16497:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #171:
URL: https://github.com/apache/hive/pull/171#issuecomment-644888176


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446719)
Remaining Estimate: 0h
Time Spent: 10m

> FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file 
> system operations should be impersonated
> --
>
> Key: HIVE-16497
> URL: https://issues.apache.org/jira/browse/HIVE-16497
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> FileUtils.isActionPermittedForFileHierarchy checks whether the user has 
> permissions for a given action. The checks are made by impersonating the user.
> However, the listing of child dirs is done as the HiveServer2 user. If the 
> hive user doesn't have permissions on the filesystem, it gives an incorrect 
> error that the user doesn't have permissions to perform the action.
> Impersonating the end user for all file operations in that function is also 
> the logically correct thing to do.
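As a rough model of the walk in question (plain `java.nio`, not Hive's Hadoop-based code; the class and method names are invented): the recursive check below visits every child entry, and the point of the fix is that in Hive the entire walk, directory listings included, must run inside `ugi.doAs(...)` as the impersonated end user, not just the per-file permission checks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class HierarchyCheckSketch {
    // Checks that every file and directory under root is readable.
    // In Hive, the analogous walk (isActionPermittedForFileHierarchy)
    // should perform BOTH the permission checks and the child-directory
    // listings as the impersonated end user; listing as the service
    // user is what produces the misleading permission error.
    static boolean allReadable(Path root) {
        try (Stream<Path> s = Files.walk(root)) {
            return s.allMatch(Files::isReadable);
        } catch (IOException e) {
            return false; // treat an unreadable listing as a failed check
        }
    }
}
```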



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15746) Fix default delimiter2 in str_to_map UDF or in method description

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15746?focusedWorklogId=446725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446725
 ]

ASF GitHub Bot logged work on HIVE-15746:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #140:
URL: https://github.com/apache/hive/pull/140#issuecomment-644888248


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446725)
Remaining Estimate: 0h
Time Spent: 10m

> Fix default delimiter2 in str_to_map UDF or in method description
> -
>
> Key: HIVE-15746
> URL: https://issues.apache.org/jira/browse/HIVE-15746
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.1.1
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
>Priority: Trivial
> Fix For: 2.3.0
>
> Attachments: HIVE-15746.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> According to the UDF wiki and the GenericUDFStringToMap.java class comments, 
> the default delimiter 2 should be '='.
> But in the code, default_del2 = ":"
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStringToMap.java#L53
> We need to fix the code, or fix the method description and the UDF wiki.
> Let me know what you think.
> {code}
> str_to_map("a=1,b=2")
> vs
> str_to_map("a:1,b:2")
> {code}
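A minimal sketch of the documented semantics (an invented helper, not the real GenericUDFStringToMap): splitting entries on delimiter1 and key/value on delimiter2 makes the `'='` vs `':'` default discrepancy easy to see.

```java
import java.util.HashMap;
import java.util.Map;

public class StrToMapSketch {
    // Mirrors str_to_map's documented behaviour: delimiter1 separates
    // entries, delimiter2 separates key from value. Whether delimiter2
    // defaults to "=" (wiki) or ":" (code) is exactly the discrepancy
    // this issue is about, so both are passed explicitly here.
    static Map<String, String> strToMap(String text, String del1, String del2) {
        Map<String, String> m = new HashMap<>();
        for (String entry : text.split(del1)) {
            int i = entry.indexOf(del2);
            if (i < 0) {
                m.put(entry, null); // no value part present
            } else {
                m.put(entry.substring(0, i), entry.substring(i + del2.length()));
            }
        }
        return m;
    }
}
```

With the ":" default, `str_to_map("a=1,b=2")` yields the single key `"a=1,b=2"`-style misparse users report, while `str_to_map("a:1,b:2")` parses as expected.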



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-1010:
-
Labels: TODOC3.0 pull-request-available  (was: TODOC3.0)

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
>Priority: Major
>  Labels: TODOC3.0, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-1010.10.patch, HIVE-1010.11.patch, 
> HIVE-1010.12.patch, HIVE-1010.13.patch, HIVE-1010.14.patch, 
> HIVE-1010.15.patch, HIVE-1010.16.patch, HIVE-1010.7.patch, HIVE-1010.8.patch, 
> HIVE-1010.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15424) Hive dropped table during table creation if table already exists

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15424:
--
Labels: pull-request-available  (was: )

> Hive dropped table during table creation if table already exists
> 
>
> Key: HIVE-15424
> URL: https://issues.apache.org/jira/browse/HIVE-15424
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Suresh Bahuguna
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While creating a table, rollbackCreateTable() shouldn't be called if table 
> already exists.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-16925) isSlowStart lost during refactoring

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16925?focusedWorklogId=446728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446728
 ]

ASF GitHub Bot logged work on HIVE-16925:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #195:
URL: https://github.com/apache/hive/pull/195#issuecomment-644888141


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446728)
Remaining Estimate: 0h
Time Spent: 10m

> isSlowStart lost during refactoring
> ---
>
> Key: HIVE-16925
> URL: https://issues.apache.org/jira/browse/HIVE-16925
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Priority: Minor
> Attachments: HIVE-16925.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> TezEdgeProperty.setAutoReduce() should have isSlowStart as a parameter



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15424) Hive dropped table during table creation if table already exists

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15424?focusedWorklogId=446717&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446717
 ]

ASF GitHub Bot logged work on HIVE-15424:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #123:
URL: https://github.com/apache/hive/pull/123#issuecomment-644888295


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446717)
Remaining Estimate: 0h
Time Spent: 10m

> Hive dropped table during table creation if table already exists
> 
>
> Key: HIVE-15424
> URL: https://issues.apache.org/jira/browse/HIVE-15424
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Suresh Bahuguna
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While creating a table, rollbackCreateTable() shouldn't be called if table 
> already exists.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15442) Driver.java has a redundancy code

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15442?focusedWorklogId=446724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446724
 ]

ASF GitHub Bot logged work on HIVE-15442:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #169:
URL: https://github.com/apache/hive/pull/169#issuecomment-644888190


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446724)
Remaining Estimate: 0h
Time Spent: 10m

> Driver.java has a redundancy  code
> --
>
> Key: HIVE-15442
> URL: https://issues.apache.org/jira/browse/HIVE-15442
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15442.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Driver.java has redundant code around "explain output". I think the if 
> statement "if (conf.getBoolVar(ConfVars.HIVE_LOG_EXPLAIN_OUTPUT))" repeats 
> the judgment made by the statement above.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-17260) Typo: exception has been created and lost in the ThriftJDBCBinarySerDe

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17260:
--
Labels: pull-request-available  (was: )

> Typo: exception has been created and lost in the ThriftJDBCBinarySerDe
> --
>
> Key: HIVE-17260
> URL: https://issues.apache.org/jira/browse/HIVE-17260
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleg Danilov
>Assignee: Oleg Danilov
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-17260.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Line 100:
> {code:java}
> } catch (Exception e) {
>   new SerDeException(e);
> }
> {code}
> Seems like it should be thrown there :-)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21063) Support statistics in cachedStore for transactional table

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21063?focusedWorklogId=446730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446730
 ]

ASF GitHub Bot logged work on HIVE-21063:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #514:
URL: https://github.com/apache/hive/pull/514


   





Issue Time Tracking
---

Worklog Id: (was: 446730)
Time Spent: 20m  (was: 10m)

> Support statistics in cachedStore for transactional table
> -
>
> Key: HIVE-21063
> URL: https://issues.apache.org/jira/browse/HIVE-21063
> Project: Hive
>  Issue Type: Task
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21063.01.patch, HIVE-21063.02.patch, 
> HIVE-21063.03.patch, HIVE-21063.04.patch, HIVE-21063.05.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, statistics for transactional tables are not stored in CachedStore 
> due to consistency issues. We need to add validation for valid write IDs and 
> generation of aggregate stats based on valid partitions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15497) Unthrown SerDeException in ThriftJDBCBinarySerDe.java

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15497:
--
Labels: pull-request-available  (was: )

> Unthrown SerDeException in ThriftJDBCBinarySerDe.java
> -
>
> Key: HIVE-15497
> URL: https://issues.apache.org/jira/browse/HIVE-15497
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: JC
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-15497.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is an unthrown SerDeException in 
> serde/src/java/org/apache/hadoop/hive/serde2/thrift/ThriftJDBCBinarySerDe.java
>  (found in the current github snapshot, 
> 4ba713ccd85c3706d195aeef9476e6e6363f1c21)
> {code}
>  91 initializeRowAndColumns();
>  92 try {
>  93   thriftFormatter.initialize(conf, tbl);
>  94 } catch (Exception e) {
>  95   new SerDeException(e);
>  96 }
> {code}
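The bug pattern is easy to demonstrate with a stand-in exception class (the real SerDeException lives in org.apache.hadoop.hive.serde2; everything below is an invented sketch): without the `throw`, the wrapper is constructed and discarded, so the failure is silently swallowed and initialization appears to succeed.

```java
public class SwallowedExceptionSketch {
    // Stand-in for org.apache.hadoop.hive.serde2.SerDeException.
    static class SerDeException extends Exception {
        SerDeException(Throwable cause) { super(cause); }
    }

    // Buggy version, as in the report: the wrapper exception is
    // constructed but never thrown, so callers see a normal return.
    static boolean initBuggy(Runnable init) {
        try {
            init.run();
        } catch (Exception e) {
            new SerDeException(e); // BUG: missing "throw"
        }
        return true; // reached even when init failed
    }

    // Fixed version: rethrow the wrapped exception.
    static void initFixed(Runnable init) throws SerDeException {
        try {
            init.run();
        } catch (Exception e) {
            throw new SerDeException(e);
        }
    }
}
```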



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15423) Allowing Hive to reverse map IP from hostname for partition info

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15423:
--
Labels: pull-request-available  (was: )

> Allowing Hive to reverse map IP from hostname for partition info
> 
>
> Key: HIVE-15423
> URL: https://issues.apache.org/jira/browse/HIVE-15423
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Suresh Bahuguna
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive - Namenode hostname mismatch when running queries with 2 MR jobs.
> Hive tries to find Partition info using hdfs://<hostname>:<port>, 
> whereas the info has been hashed using hdfs://<IP>:<port>.
> Exception raised in HiveFileFormatUtils.java:
> -
> java.io.IOException: cannot find dir = 
> hdfs://hd-nn-24:9000/tmp/hive-admin/hive_2013-08-30_06-11-52_007_1545561832334194535/-mr-10002/00_0
>  in pathToPartitionInfo: 
> [hdfs://192.168.156.24:9000/tmp/hive-admin/hive_2013-08-30_06-11-52_007_1545561832334194535/-mr-10002]
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java
> -



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-16497:
--
Labels: pull-request-available  (was: )

> FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file 
> system operations should be impersonated
> --
>
> Key: HIVE-16497
> URL: https://issues.apache.org/jira/browse/HIVE-16497
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas Nair
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> FileUtils.isActionPermittedForFileHierarchy checks whether the user has 
> permissions for a given action. The checks are made by impersonating the user.
> However, the listing of child dirs is done as the HiveServer2 user. If the 
> hive user doesn't have permissions on the filesystem, it gives an incorrect 
> error that the user doesn't have permissions to perform the action.
> Impersonating the end user for all file operations in that function is also 
> the logically correct thing to do.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-16645) Commands.java has missed the catch statement and has some code format errors

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16645?focusedWorklogId=446718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446718
 ]

ASF GitHub Bot logged work on HIVE-16645:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #183:
URL: https://github.com/apache/hive/pull/183#issuecomment-644888155


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446718)
Remaining Estimate: 0h
Time Spent: 10m

> Commands.java has missed the catch statement and has some code format errors
> 
>
> Key: HIVE-16645
> URL: https://issues.apache.org/jira/browse/HIVE-16645
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16645.1.patch, HIVE-16645.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In Commands.java, the catch statement is missing and the ResultSet 
> is not closed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21041) NPE, ParseException in getting schema from logical plan

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21041?focusedWorklogId=446723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446723
 ]

ASF GitHub Bot logged work on HIVE-21041:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #507:
URL: https://github.com/apache/hive/pull/507


   





Issue Time Tracking
---

Worklog Id: (was: 446723)
Time Spent: 20m  (was: 10m)

> NPE, ParseException in getting schema from logical plan
> ---
>
> Key: HIVE-21041
> URL: https://issues.apache.org/jira/browse/HIVE-21041
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21041.2.patch, HIVE-21041.3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-20552 makes getting the schema from the logical plan faster. But it throws 
> ParseException when the query has a column alias, and NullPointerException when 
> it has subqueries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-15497) Unthrown SerDeException in ThriftJDBCBinarySerDe.java

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15497?focusedWorklogId=446716&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446716
 ]

ASF GitHub Bot logged work on HIVE-15497:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #126:
URL: https://github.com/apache/hive/pull/126#issuecomment-644888284


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446716)
Remaining Estimate: 0h
Time Spent: 10m

> Unthrown SerDeException in ThriftJDBCBinarySerDe.java
> -
>
> Key: HIVE-15497
> URL: https://issues.apache.org/jira/browse/HIVE-15497
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: JC
>Priority: Trivial
> Attachments: HIVE-15497.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is an unthrown SerDeException in 
> serde/src/java/org/apache/hadoop/hive/serde2/thrift/ThriftJDBCBinarySerDe.java
>  (found in the current github snapshot, 
> 4ba713ccd85c3706d195aeef9476e6e6363f1c21)
> {code}
>  91 initializeRowAndColumns();
>  92 try {
>  93   thriftFormatter.initialize(conf, tbl);
>  94 } catch (Exception e) {
>  95   new SerDeException(e);
>  96 }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-15900) Beeline prints tez job progress in stdout instead of stderr

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-15900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-15900:
--
Labels: pull-request-available  (was: )

> Beeline prints tez job progress in stdout instead of stderr
> ---
>
> Key: HIVE-15900
> URL: https://issues.apache.org/jira/browse/HIVE-15900
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.2.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Thejas Nair
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.2.0
>
> Attachments: HIVE-15900.1.patch, HIVE-15900.2.patch, 
> HIVE-15900.3.patch, HIVE-15900.3.patch, std_out
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Tez job progress messages are written to stdout instead of stderr.
> Attaching the output file for the command below, with the Tez job status printed:
> $HIVE_HOME/bin/beeline -n <user> -p <password> -u "<jdbc-url>" --outputformat=tsv 
> -e "analyze table studenttab10k compute statistics;" > stdout



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-17077) Hive should raise StringIndexOutOfBoundsException when LPAD/RPAD len character's value is negative number

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17077:
--
Labels: pull-request-available  (was: )

> Hive should raise StringIndexOutOfBoundsException when LPAD/RPAD len 
> character's value is negative number
> -
>
> Key: HIVE-17077
> URL: https://issues.apache.org/jira/browse/HIVE-17077
> Project: Hive
>  Issue Type: Bug
>Reporter: Lingang Deng
>Assignee: Lingang Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> lpad (and rpad) throw an exception when the second argument is a negative 
> number, as follows:
> {code:java}
> hive> select lpad("hello", -1 ,"h");
> FAILED: StringIndexOutOfBoundsException String index out of range: -1
> hive> select rpad("hello", -1 ,"h");
> FAILED: StringIndexOutOfBoundsException String index out of range: -1
> {code}
> Maybe we should return a friendly result, as MySQL does.
> {code:java}
> mysql> select lpad("hello", -1 ,"h");
> +------------------------+
> | lpad("hello", -1 ,"h") |
> +------------------------+
> | NULL                   |
> +------------------------+
> 1 row in set (0.00 sec)
> mysql> select rpad("hello", -1 ,"h");
> +------------------------+
> | rpad("hello", -1 ,"h") |
> +------------------------+
> | NULL                   |
> +------------------------+
> 1 row in set (0.00 sec)
> {code}
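A hedged sketch of the MySQL-style behaviour suggested above (an invented helper, not Hive's actual UDF implementation): return null for a negative target length instead of letting the substring call throw. It assumes a non-empty pad string.

```java
public class LpadSketch {
    // MySQL-style LPAD: null for a negative length instead of
    // StringIndexOutOfBoundsException; truncates when len is shorter
    // than the input, pads on the left otherwise.
    static String lpad(String s, int len, String pad) {
        if (len < 0 || pad.isEmpty()) {
            return null; // MySQL returns NULL for a negative length
        }
        if (len <= s.length()) {
            return s.substring(0, len); // truncate, as LPAD does
        }
        StringBuilder b = new StringBuilder();
        while (b.length() < len - s.length()) {
            b.append(pad); // repeat the pad string as needed
        }
        b.setLength(len - s.length()); // trim any overshoot
        return b.append(s).toString();
    }
}
```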



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-1010) Implement INFORMATION_SCHEMA in Hive

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-1010?focusedWorklogId=446715&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446715
 ]

ASF GitHub Bot logged work on HIVE-1010:


Author: ASF GitHub Bot
Created on: 16/Jun/20 16:56
Start Date: 16/Jun/20 16:56
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #181:
URL: https://github.com/apache/hive/pull/181#issuecomment-644888167


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 446715)
Remaining Estimate: 0h
Time Spent: 10m

> Implement INFORMATION_SCHEMA in Hive
> 
>
> Key: HIVE-1010
> URL: https://issues.apache.org/jira/browse/HIVE-1010
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore, Query Processor, Server Infrastructure
>Reporter: Jeff Hammerbacher
>Assignee: Gunther Hagleitner
>Priority: Major
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-1010.10.patch, HIVE-1010.11.patch, 
> HIVE-1010.12.patch, HIVE-1010.13.patch, HIVE-1010.14.patch, 
> HIVE-1010.15.patch, HIVE-1010.16.patch, HIVE-1010.7.patch, HIVE-1010.8.patch, 
> HIVE-1010.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> INFORMATION_SCHEMA is part of the SQL92 standard and would be useful to 
> implement using our metastore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21091) Arrow serializer sets null at wrong index

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21091?focusedWorklogId=446714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446714
 ]

ASF GitHub Bot logged work on HIVE-21091:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #517:
URL: https://github.com/apache/hive/pull/517


   





Issue Time Tracking
---

Worklog Id: (was: 446714)
Time Spent: 20m  (was: 10m)

> Arrow serializer sets null at wrong index
> -
>
> Key: HIVE-21091
> URL: https://issues.apache.org/jira/browse/HIVE-21091
> Project: Hive
>  Issue Type: Bug
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21091.1.patch, HIVE-21091.2.patch, 
> HIVE-21091.3.patch, HIVE-21091.3.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Arrow serializer sets null at wrong index





[jira] [Work logged] (HIVE-17077) Hive should raise StringIndexOutOfBoundsException when LPAD/RPAD len character's value is negative number

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17077?focusedWorklogId=446713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446713
 ]

ASF GitHub Bot logged work on HIVE-17077:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #203:
URL: https://github.com/apache/hive/pull/203#issuecomment-644888117




Issue Time Tracking
---

Worklog Id: (was: 446713)
Remaining Estimate: 0h
Time Spent: 10m

> Hive should raise StringIndexOutOfBoundsException when LPAD/RPAD len 
> character's value is negative number
> -
>
> Key: HIVE-17077
> URL: https://issues.apache.org/jira/browse/HIVE-17077
> Project: Hive
>  Issue Type: Bug
>Reporter: Lingang Deng
>Assignee: Lingang Deng
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> lpad (and rpad) throw an exception when the second argument is a negative 
> number, as follows:
> {code:java}
> hive> select lpad("hello", -1 ,"h");
> FAILED: StringIndexOutOfBoundsException String index out of range: -1
> hive> select rpad("hello", -1 ,"h");
> FAILED: StringIndexOutOfBoundsException String index out of range: -1
> {code}
> Maybe we should return a friendly result, such as NULL, as MySQL does.
> {code:java}
> mysql> select lpad("hello", -1 ,"h");
> +------------------------+
> | lpad("hello", -1 ,"h") |
> +------------------------+
> | NULL                   |
> +------------------------+
> 1 row in set (0.00 sec)
> mysql> select rpad("hello", -1 ,"h");
> +------------------------+
> | rpad("hello", -1 ,"h") |
> +------------------------+
> | NULL                   |
> +------------------------+
> 1 row in set (0.00 sec)
> {code}
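> Until the behavior changes, one possible workaround is to guard the length argument explicitly. The following is an illustrative sketch in Hive SQL, assuming only the standard {{if}} and {{lpad}} built-ins:
> {code:sql}
> -- Return NULL for a negative length instead of failing, mimicking MySQL
> SELECT if(len < 0, NULL, lpad(s, len, 'h'))
> FROM (SELECT 'hello' AS s, -1 AS len) t;
> {code}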





[jira] [Work logged] (HIVE-19661) switch Hive UDFs to use Re2J regex engine

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19661?focusedWorklogId=446712&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446712
 ]

ASF GitHub Bot logged work on HIVE-19661:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #362:
URL: https://github.com/apache/hive/pull/362#issuecomment-644886505




Issue Time Tracking
---

Worklog Id: (was: 446712)
Remaining Estimate: 0h
Time Spent: 10m

> switch Hive UDFs to use Re2J regex engine
> -
>
> Key: HIVE-19661
> URL: https://issues.apache.org/jira/browse/HIVE-19661
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19661.01.patch, HIVE-19661.02.patch, 
> HIVE-19661.03.patch, HIVE-19661.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Java regex engine can be very slow in some cases, e.g. 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458
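> As an illustration (not taken from the issue itself): a pattern with nested quantifiers can trigger exponential backtracking in java.util.regex, whereas a linear-time engine such as RE2J processes it in time proportional to the input length. The query below is a hypothetical sketch of such a pathological case:
> {code:sql}
> -- Under a backtracking engine, this RLIKE can take exponential time
> -- because the input almost-but-not-quite matches the nested pattern
> SELECT repeat('a', 40) RLIKE '(a+)+b';
> {code}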





[jira] [Work logged] (HIVE-19488) Enable CM root based on db parameter, identifying a db as source of replication.

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19488?focusedWorklogId=446703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446703
 ]

ASF GitHub Bot logged work on HIVE-19488:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #345:
URL: https://github.com/apache/hive/pull/345#issuecomment-644886610




Issue Time Tracking
---

Worklog Id: (was: 446703)
Remaining Estimate: 0h
Time Spent: 10m

> Enable CM root based on db parameter, identifying a db as source of 
> replication.
> 
>
> Key: HIVE-19488
> URL: https://issues.apache.org/jira/browse/HIVE-19488
> Project: Hive
>  Issue Type: Task
>  Components: Hive, HiveServer2, repl
>Affects Versions: 3.1.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19488.01.patch, HIVE-19488.02.patch, 
> HIVE-19488.03.patch, HIVE-19488.04.patch, HIVE-19488.05.patch, 
> HIVE-19488.06.patch, HIVE-19488.07.patch, HIVE-19488.08-branch-3.patch, 
> HIVE-19488.08.patch, HIVE-19488.09-branch-3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Add a parameter at the db level to identify whether it is a source of 
> replication. The user should set this.
>  * Enable CM root only for databases that are a source of a replication 
> policy; for other dbs, skip the CM root functionality.
>  * Prevent dropping a database if the parameter indicating that it is a 
> replication source is set.
>  * As an upgrade to this version, the user should set the property on all 
> existing database policies in effect.
>  * The parameter should be of the form: repl.source.for : List<policy ids>
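> For example, a source database could be marked with something like the following (illustrative sketch; the database name and policy id are made up, and the exact property value format is as per the description above):
> {code:sql}
> -- Mark database 'sales' as a replication source for policy 'policy1'
> ALTER DATABASE sales SET DBPROPERTIES ('repl.source.for' = 'policy1');
> {code}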





[jira] [Work logged] (HIVE-19829) Incremental replication load should create tasks in execution phase rather than semantic phase

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19829?focusedWorklogId=446707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446707
 ]

ASF GitHub Bot logged work on HIVE-19829:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #370:
URL: https://github.com/apache/hive/pull/370#issuecomment-644886436




Issue Time Tracking
---

Worklog Id: (was: 446707)
Remaining Estimate: 0h
Time Spent: 10m

> Incremental replication load should create tasks in execution phase rather 
> than semantic phase
> --
>
> Key: HIVE-19829
> URL: https://issues.apache.org/jira/browse/HIVE-19829
> Project: Hive
>  Issue Type: Task
>  Components: repl
>Affects Versions: 3.1.0, 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-19829.01.patch, HIVE-19829.02.patch, 
> HIVE-19829.03.patch, HIVE-19829.04.patch, HIVE-19829.06.patch, 
> HIVE-19829.07.patch, HIVE-19829.07.patch, HIVE-19829.08-branch-3.patch, 
> HIVE-19829.08.patch, HIVE-19829.09.patch, HIVE-19829.10-branch-3.patch, 
> HIVE-19829.10.patch, HIVE-19829.11-branch-3.patch, 
> HIVE-19829.12-branch-3.patch, HIVE-19829.13-branch-3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Split the incremental load into multiple iterations. In each iteration, create 
> a number of tasks equal to the configured value.





[jira] [Work logged] (HIVE-20283) Logs may be directed to 2 files if --hiveconf hive.log.file is used (metastore)

2020-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20283?focusedWorklogId=446709&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-446709
 ]

ASF GitHub Bot logged work on HIVE-20283:
-

Author: ASF GitHub Bot
Created on: 16/Jun/20 16:55
Start Date: 16/Jun/20 16:55
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #409:
URL: https://github.com/apache/hive/pull/409


   





Issue Time Tracking
---

Worklog Id: (was: 446709)
Time Spent: 20m  (was: 10m)

> Logs may be directed to 2 files if --hiveconf hive.log.file is used 
> (metastore)
> ---
>
> Key: HIVE-20283
> URL: https://issues.apache.org/jira/browse/HIVE-20283
> Project: Hive
>  Issue Type: Bug
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20283.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Unfortunately, when doing https://issues.apache.org/jira/browse/HIVE-19886 I 
> forgot to do the same for the metastore.




