[
https://issues.apache.org/jira/browse/HIVE-16924?focusedWorklogId=205068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-205068
]
ASF GitHub Bot logged work on HIVE-16924:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Feb/19 09:21
Start Date: 27/Feb/19 09:21
Worklog Time Spent: 10m
Work Description: miklosgergely commented on pull request #544:
HIVE-16924 Support distinct in presence of Group By
URL: https://github.com/apache/hive/pull/544#discussion_r260656405
##########
File path: ql/src/test/results/clientpositive/distinct_groupby_without_cbo.q.out
##########
@@ -0,0 +1,2018 @@
+PREHOOK: query: explain select distinct count(a.value) from src a group by a.key
+PREHOOK: type: QUERY
+PREHOOK: Input: default@src
+#### A masked pattern was here ####
+POSTHOOK: query: explain select distinct count(a.value) from src a group by a.key
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@src
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+ Stage-1 is a root stage
+ Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+ Stage: Stage-1
+ Map Reduce
+ Map Operator Tree:
+ TableScan
+ alias: a
+ Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: key (type: string), value (type: string)
+ outputColumnNames: key, value
+ Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
+ Group By Operator
+ aggregations: count(value)
+ keys: key (type: string)
+ mode: hash
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 250 Data size: 23750 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 250 Data size: 23750 Basic stats: COMPLETE Column stats: COMPLETE
+ value expressions: _col1 (type: bigint)
+ Execution mode: vectorized
+ Reduce Operator Tree:
+ Group By Operator
+ aggregations: count(VALUE._col0)
+ keys: KEY._col0 (type: string)
+ mode: mergepartial
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 250 Data size: 23750 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: _col1 (type: bigint)
+ outputColumnNames: _col0
+ Statistics: Num rows: 250 Data size: 2000 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 250 Data size: 2000 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+ Stage: Stage-0
+ Fetch Operator
+ limit: -1
+ Processor Tree:
+ ListSink
+
+PREHOOK: query: select distinct count(a.value) from src a group by a.key
+PREHOOK: type: QUERY
+PREHOOK: Input: default@src
+#### A masked pattern was here ####
+POSTHOOK: query: select distinct count(a.value) from src a group by a.key
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@src
+#### A masked pattern was here ####
+3
+1
+2
+2
+2
+1
Review comment:
This file has been removed because of the error. For now, we don't support
DISTINCT together with GROUP BY and an aggregate function if CBO is not enabled.
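
While this combination is unsupported without CBO, one possible workaround is to move the grouping into a derived table and apply DISTINCT in the outer query. This is only a sketch: the aliases `cnt` and `t` are made up for illustration, and it has not been checked against this patch. It relies on DISTINCT being applied after grouping and aggregation, so both forms should describe the same result set.

```sql
-- Original form from the test above: DISTINCT over an aggregate with GROUP BY.
-- select distinct count(a.value) from src a group by a.key;

-- Sketch of an equivalent form: aggregate per key in a derived table,
-- then deduplicate the per-key counts in the outer query.
-- The aliases cnt and t are illustrative only.
select distinct t.cnt
from (
  select count(a.value) as cnt
  from src a
  group by a.key
) t;
```

The derived table keeps the GROUP BY and the DISTINCT in separate query blocks, so neither block mixes the two clauses.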
Issue Time Tracking
-------------------
Worklog Id: (was: 205068)
Time Spent: 3h 50m (was: 3h 40m)
> Support distinct in presence of Group By
> -----------------------------------------
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
> Issue Type: New Feature
> Components: Query Planning
> Reporter: Carter Shanklin
> Assignee: Miklos Gergely
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch,
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch,
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch,
> HIVE-16924.09.patch, HIVE-16924.10.patch
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get:
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the same query. Error encountered near token 'c1'