[jira] [Created] (HIVE-26236) count(1) with subquery count(distinct) gives wrong results with hive.optimize.distinct.rewrite=true and cbo on

2022-05-17 Thread honghui.Liu (Jira)
honghui.Liu created HIVE-26236:
--

 Summary: count(1) with subquery count(distinct) gives wrong 
results with hive.optimize.distinct.rewrite=true and cbo on
 Key: HIVE-26236
 URL: https://issues.apache.org/jira/browse/HIVE-26236
 Project: Hive
  Issue Type: Bug
  Components: CBO, Logical Optimizer
Affects Versions: All Versions
Reporter: honghui.Liu
Assignee: honghui.Liu


{code:java}
create table count_distinct(a int, b int);
insert into table count_distinct values (1,2),(2,3);
set hive.execution.engine=tez;
set hive.cbo.enable=true;
set hive.optimize.distinct.rewrite=true;
select count(1) from ( 
      select count(distinct a) from count_distinct
) tmp; {code}
This query gives the wrong result when hive.optimize.distinct.rewrite is true, 
which is the default in all 3.x versions. The query returns 2; the expected 
result is 1.

Before CBO optimization, the RelNode tree is as follows:

 
{code:java}
HiveProject(_o__c0=[$0])
  HiveAggregate(group=[{}], agg#0=[count($0)])
    HiveProject($f0=[1])
      HiveProject(_o__c0=[$0])
        HiveAggregate(group=[{}], agg#0=[count(DISTINCT $0)])
          HiveProject($f0=[$0])
            HiveTableScan(table=[[default.count_distinct]], table:alias=[count_distinct])
{code}
 

After HiveExpandDistinctAggregatesRule is applied, the RelNode tree is as follows:

 
{code:java}
HiveProject(_o__c0=[$0])
  HiveAggregate(group=[{}], agg#0=[count($0)])
    HiveProject($f0=[1])
      HiveProject(_o__c0=[$0])
        HiveAggregate(group=[{}], agg#0=[count($0)])
          HiveAggregate(group=[{0}])
            HiveProject($f0=[$0])
              HiveProject($f0=[$0])
                HiveTableScan(table=[[default.count_distinct]], table:alias=[count_distinct])
{code}
count(distinct xx) is converted to count(xx) from (select xx from table_name 
group by xx).
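
As a rough illustration of why this rewrite should be safe, here is a plain-Java model of the two query shapes over column a of the sample table. It is only a sketch, not Hive code: the class and method names are hypothetical, and it assumes the column contains no NULLs. Both shapes yield a single row holding 2, so the outer count(1) over that row should be 1.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model of the rewrite; assumes column a has no NULL values.
public class DistinctRewriteSketch {

    // Models: select count(distinct a) from count_distinct
    // An aggregate without GROUP BY always produces exactly one row.
    static List<Long> countDistinct(List<Integer> a) {
        return List.of(a.stream().distinct().count());
    }

    // Models: select count(a) from (select a from count_distinct group by a)
    static List<Long> countOverGroupBy(List<Integer> a) {
        // Inner query: one row per group key.
        Set<Integer> groups = a.stream()
                .collect(Collectors.groupingBy(x -> x))
                .keySet();
        // Outer aggregate: again exactly one row.
        return List.of((long) groups.size());
    }

    // Models: select count(1) from (<subquery>) tmp
    static long outerCount(List<Long> subqueryRows) {
        return subqueryRows.size();
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2); // column a of rows (1,2),(2,3)
        System.out.println(countDistinct(a));                // [2]
        System.out.println(countOverGroupBy(a));             // [2]
        System.out.println(outerCount(countDistinct(a)));    // 1
        System.out.println(outerCount(countOverGroupBy(a))); // 1
    }
}
```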

 

After projection pruning, the RelNode tree is as follows:
{code:java}
HiveAggregate(group=[{}], agg#0=[count()])
  HiveProject(DUMMY=[0])
    HiveAggregate(group=[{}])
      HiveAggregate(group=[{0}])
        HiveProject(a=[$0])
          HiveTableScan(table=[[default.count_distinct]], table:alias=[count_distinct])
{code}
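The plan produced at this step is wrong: the inner count has been pruned to 
HiveAggregate(group=[{}]) with no aggregate call, and the query returns 2 
instead of 1.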



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HIVE-26235) OR Condition on binary column is returning empty result

2022-05-17 Thread Naresh P R (Jira)
Naresh P R created HIVE-26235:
-

 Summary: OR Condition on binary column is returning empty result
 Key: HIVE-26235
 URL: https://issues.apache.org/jira/browse/HIVE-26235
 Project: Hive
  Issue Type: Bug
Reporter: Naresh P R


Repro steps
{code:java}
create table test_binary(data_col timestamp, binary_col binary) partitioned by (ts string);
insert into test_binary partition(ts='20220420') values
  ('2022-04-20 00:00:00.0', 'a'),
  ('2022-04-20 00:00:00.0', 'b'),
  ('2022-04-20 00:00:00.0', 'c');
-- Works
select * from test_binary where ts='20220420' and binary_col = unhex('61');
select * from test_binary where ts='20220420' and binary_col between unhex('61') and unhex('62');
-- Returns empty result
select * from test_binary where binary_col = unhex('61') or binary_col = unhex('62');
select * from test_binary where ts='20220420' and (binary_col = unhex('61') or binary_col = unhex('62'));
{code}





[jira] [Created] (HIVE-26234) Update guava to 30.1.1-jre

2022-05-17 Thread Hemanth Boyina (Jira)
Hemanth Boyina created HIVE-26234:
-

 Summary: Update guava to 30.1.1-jre
 Key: HIVE-26234
 URL: https://issues.apache.org/jira/browse/HIVE-26234
 Project: Hive
  Issue Type: Improvement
Reporter: Hemanth Boyina
Assignee: Hemanth Boyina


Update guava to 30.1.1-jre





[jira] [Created] (HIVE-26233) Problems reading back PARQUET timestamps above 10000 years

2022-05-17 Thread Peter Vary (Jira)
Peter Vary created HIVE-26233:
-

 Summary: Problems reading back PARQUET timestamps above 10000 years
 Key: HIVE-26233
 URL: https://issues.apache.org/jira/browse/HIVE-26233
 Project: Hive
  Issue Type: Bug
Reporter: Peter Vary
Assignee: Peter Vary


Timestamp values above year 10000 are not supported, but during the migration 
from Hive2 to Hive3 some might appear because of TZ issues. We should be able 
to at least read these tables before rewriting the data.

For this we need to change Timestamp.PRINT_FORMATTER so that no {{+}} sign is 
prefixed to the timestamp when the year exceeds 4 digits.
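
The {{+}} sign can be reproduced with plain java.time, independent of Hive. The sketch below is not Hive's actual Timestamp.PRINT_FORMATTER: the class name and both formatters are hypothetical, and it assumes the sign comes from the standard behaviour of the "yyyy" pattern, which prints with SignStyle.EXCEEDS_PAD. Building the year field with SignStyle.NORMAL is one possible way to suppress the sign for positive years.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.SignStyle;
import java.time.temporal.ChronoField;

// Hypothetical demo class; not part of Hive.
public class YearSignDemo {

    // "yyyy" uses SignStyle.EXCEEDS_PAD: a '+' is printed once the year needs more than 4 digits.
    static final DateTimeFormatter WITH_SIGN = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // SignStyle.NORMAL prints a sign only for negative values.
    static final DateTimeFormatter NO_SIGN = new DateTimeFormatterBuilder()
            .appendValue(ChronoField.YEAR, 4, 10, SignStyle.NORMAL)
            .appendLiteral('-')
            .appendValue(ChronoField.MONTH_OF_YEAR, 2)
            .appendLiteral('-')
            .appendValue(ChronoField.DAY_OF_MONTH, 2)
            .toFormatter();

    static String withSign(LocalDate date) {
        return WITH_SIGN.format(date);
    }

    static String withoutSign(LocalDate date) {
        return NO_SIGN.format(date);
    }

    public static void main(String[] args) {
        LocalDate farFuture = LocalDate.of(10000, 1, 1);
        System.out.println(withSign(farFuture));    // +10000-01-01
        System.out.println(withoutSign(farFuture)); // 10000-01-01
    }
}
```

Both formatters print ordinary four-digit years identically, so only values beyond year 9999 are affected by the choice of sign style.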





[jira] [Created] (HIVE-26232) AcidUtils getLogicalLength shouldn't be called for external tables

2022-05-17 Thread Nikhil Gupta (Jira)
Nikhil Gupta created HIVE-26232:
---

 Summary: AcidUtils getLogicalLength shouldn't be called for 
external tables
 Key: HIVE-26232
 URL: https://issues.apache.org/jira/browse/HIVE-26232
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.2
Reporter: Nikhil Gupta








[jira] [Created] (HIVE-26231) Generate insert notification events when dynamic partition insert is done on existing partitions

2022-05-17 Thread Sourabh Badhya (Jira)
Sourabh Badhya created HIVE-26231:
-

 Summary: Generate insert notification events when dynamic 
partition insert is done on existing partitions
 Key: HIVE-26231
 URL: https://issues.apache.org/jira/browse/HIVE-26231
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Sourabh Badhya
Assignee: Sourabh Badhya





