[ 
https://issues.apache.org/jira/browse/HIVE-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198864#comment-16198864
 ] 

Zoltan Haindrich commented on HIVE-17617:
-----------------------------------------

About how it worked earlier:

* in case of a simple {{select count(1) from x}} there is the implicit {{()}} 
grouping, in which case only 1 reducer is spawned; I don't think it would 
make sense to spawn any more than one.
    ** the summary row was served by the Reducer: it was closed without 
having seen any input rows, and there were no grouping keys.
* in case of grouping sets: earlier, when at least one input row made it 
through the Mapper, at its output it emitted 1 row for each grouping set
     ** if the {{()}} set was present, there was a grouping which collected 
those rows, and it just worked 
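
To illustrate the two cases above, a minimal sketch (the empty table {{x}} and its columns {{a}} and {{b}} are made up for illustration only; the second query uses Hive's {{GROUPING SETS}} syntax):

{code}
-- hypothetical empty table, used only for illustration
create table x (a integer, b integer);

-- implicit () grouping: the single reducer serves the summary row
-- even though no input rows arrive
select count(1) from x;

-- explicit grouping sets including (): rows for the () set are only
-- emitted by the Mapper per input row, so with no input rows nothing
-- reaches the () grouping
select a, count(1) from x group by a grouping sets ((a), ());
{code}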

However, in case of grouping sets it is possible for multiple reducers to 
effectively split up the work, even in a simple case where there is only one 
grouping field.

I'm afraid setting {{numReducers=1}} could add a performance penalty; I will 
look into the code and try to set it only if the empty grouping set is 
present.


> Rollup of an empty resultset should contain the grouping of the empty 
> grouping set
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-17617
>                 URL: https://issues.apache.org/jira/browse/HIVE-17617
>             Project: Hive
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>         Attachments: HIVE-17617.01.patch, HIVE-17617.03.patch, 
> HIVE-17617.04.patch
>
>
> running
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer,c integer);
> select  sum(c),
>         grouping(b)
> from    tx1
> group by rollup (b);
> {code}
> returns 0 rows; however 
> according to the standard:
> The <empty grouping set> is regarded as the shortest such initial sublist. 
> For example, “ROLLUP ( (A, B), (C, D) )”
> is equivalent to “GROUPING SETS ( (A, B, C, D), (A, B), () )”.
> so I think the totals row (the grouping for {{()}}) should be present; psql 
> returns it.
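>
> As a sketch of that equivalence for the query above, {{rollup (b)}} can be 
> spelled out as the grouping sets {{(b)}} and {{()}} (written here in Hive's 
> {{GROUPING SETS}} syntax):
> {code}
> select  sum(c),
>         grouping(b)
> from    tx1
> group by b grouping sets ((b), ());
> {code}
> On the empty table, the {{()}} set is the one that should yield the single 
> totals row, with {{sum(c)}} being NULL since no rows were aggregated.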



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
