[
https://issues.apache.org/jira/browse/HIVE-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198864#comment-16198864
]
Zoltan Haindrich commented on HIVE-17617:
-----------------------------------------
About how it worked earlier:
* In the case of a simple {{select count(1) from x}} there is the implicit {{()}}
grouping, in which case only 1 reducer is spawned. I don't think it would
make sense to spawn more than one.
** The summary row was emitted by the Reducer when it was closed without
having received any input rows and there were no grouping keys.
* In the case of grouping sets: earlier, when at least one input row
made it through the Mapper, the Mapper emitted 1 output row for each grouping set.
** If the {{()}} set was present, there was a group which collected those rows,
and it just worked.
However, in the case of grouping sets, it is possible that multiple reducers
effectively split up the work, even in a simple case when there is one
grouping field.
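
To make the behaviour above concrete, here is a toy model in Python (not Hive
code; {{rollup_sum}} and the {{emit_empty_set_on_empty_input}} flag are names
made up for this sketch). It assumes non-null values in the summed column. The
"mapper" emits each input row once per grouping set, so with zero input rows
nothing reaches the {{()}} group and no totals row comes out unless it is
added explicitly:

```python
def rollup_sum(rows, keys, value_col, emit_empty_set_on_empty_input=False):
    """Toy model of SUM(value_col) ... GROUP BY ROLLUP(keys).

    Returns a list of (sum, key_tuple, grouping_id) tuples.
    Hypothetical helper for illustration only; not part of Hive.
    """
    # ROLLUP(a, b) expands to the grouping sets (a, b), (a), ().
    grouping_sets = [tuple(keys[:n]) for n in range(len(keys), -1, -1)]

    def grouping_id(gset):
        # One bit per key column, set when the column is NOT in the grouping
        # set; the first key is the most significant bit.
        return sum(1 << i for i, k in enumerate(reversed(keys)) if k not in gset)

    groups = {}
    # "Mapper": each input row is emitted once per grouping set, with the
    # columns outside the set replaced by None.
    for row in rows:
        for gset in grouping_sets:
            key = tuple(row[k] if k in gset else None for k in keys)
            groups.setdefault((key, grouping_id(gset)), []).append(row[value_col])

    # "Reducer": aggregate whatever arrived for each group.
    result = [(sum(vals), key, gid) for (key, gid), vals in groups.items()]

    # With no input rows the mapper emitted nothing, so the () group never
    # formed. The flag models the requested behaviour: still return the
    # totals row, with a NULL sum and every grouping bit set.
    if not rows and emit_empty_set_on_empty_input:
        result.append((None, (None,) * len(keys), grouping_id(())))
    return result
```

With an empty input, {{rollup_sum([], ["b"], "c")}} returns an empty list,
which mirrors the reported 0 rows; with the flag set it returns the single
totals row {{(None, (None,), 1)}}.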
I'm afraid setting {{numReducers=1}} could add a performance penalty;
I will look into the code and try to set it only if the empty
grouping set is present.
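
A minimal sketch of that conditional choice, again in Python rather than Hive
code ({{choose_num_reducers}} and {{planned_reducers}} are hypothetical names,
and where such a check would live in the planner is not decided here):

```python
def choose_num_reducers(grouping_sets, planned_reducers):
    """Force a single reducer only when the empty grouping set () is present.

    grouping_sets: iterable of tuples of column names, e.g. [("b",), ()].
    planned_reducers: the reducer count the planner would otherwise use.
    Hypothetical helper for illustration only; not part of Hive.
    """
    has_empty_set = any(len(gset) == 0 for gset in grouping_sets)
    return 1 if has_empty_set else planned_reducers
```

Note that ROLLUP and CUBE always include {{()}}, so under this rule they would
still run with one reducer; only explicit GROUPING SETS lists without {{()}}
would keep the planned parallelism.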
> Rollup of an empty resultset should contain the grouping of the empty
> grouping set
> ----------------------------------------------------------------------------------
>
> Key: HIVE-17617
> URL: https://issues.apache.org/jira/browse/HIVE-17617
> Project: Hive
> Issue Type: Sub-task
> Components: SQL
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Attachments: HIVE-17617.01.patch, HIVE-17617.03.patch,
> HIVE-17617.04.patch
>
>
> running
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer,c integer);
> select sum(c),
> grouping(b)
> from tx1
> group by rollup (b);
> {code}
> returns 0 rows; however
> according to the standard:
> The <empty grouping set> is regarded as the shortest such initial sublist.
> For example, “ROLLUP ( (A, B), (C, D) )”
> is equivalent to “GROUPING SETS ( (A, B, C, D), (A, B), () )”.
> so I think the totals row (the grouping for {{()}}) should be present; psql
> returns it.