[ 
https://issues.apache.org/jira/browse/HIVE-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321959#comment-16321959
 ] 

Zoltan Haindrich commented on HIVE-18359:
-----------------------------------------

I ran into a similar situation while working on HIVE-17617; IIRC 
{{hasOutput}} also somehow "detects" whether the mappers have been optimized away.
It would be better to make the condition for emitting the summary row explicit, like: 
{{firstReducer() && mapperIsAbsent()}}
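
A minimal sketch of what that condition is meant to guarantee, assuming reducers are numbered from 0 and that "mapper absent" is available as a plain boolean. The class and method names here are hypothetical and are not Hive or Tez APIs; the sketch only models the condition as written, not how Hive produces the row when the mapper does run.

```java
public class SummaryRowGuard {
    // Hypothetical stand-in for the proposed firstReducer() && mapperIsAbsent() check.
    static boolean shouldEmitSummaryRow(int reducerIndex, boolean mapperAbsent) {
        boolean firstReducer = reducerIndex == 0; // assumption: reducers are indexed from 0
        return firstReducer && mapperAbsent;
    }

    // Counts how many reducers out of reducerCount would emit the summary row.
    static int summaryRowsEmitted(int reducerCount, boolean mapperAbsent) {
        int rows = 0;
        for (int i = 0; i < reducerCount; i++) {
            if (shouldEmitSummaryRow(i, mapperAbsent)) {
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // With the mapper optimized away, only one of the four reducers emits the row.
        System.out.println(summaryRowsEmitted(4, true));
        // With the mapper present, no reducer emits it under this condition.
        System.out.println(summaryRowsEmitted(4, false));
    }
}
```

The point of the {{firstReducer()}} half is that in the multi-reducer case the summary row is emitted at most once, rather than once per reducer.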

I would like to note a few more things that might be relevant:

* only Tez tends to silently remove a mapper completely, without any notice
** an alternative solution would be to somehow convince Tez not to remove the 
mapper completely when an empty grouping set is present, because such a grouping 
set might create a record out of thin air (this is "basically" what causes all 
these problems)... I've seen no way to configure this, so I've looked for other 
options...
* under the hood it is possible to detect when the mapper has been optimized 
away by checking whether the messagecount is 0; IIRC that part of Tez was not 
publicly accessible

I will take a look and probably try to come up with a test case that covers the 
multi-reducer case.

> Extend grouping set limits from int to long
> -------------------------------------------
>
>                 Key: HIVE-18359
>                 URL: https://issues.apache.org/jira/browse/HIVE-18359
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-18359.1.patch, HIVE-18359.2.patch, 
> HIVE-18359.3.patch, HIVE-18359.4.patch, HIVE-18359.5.patch
>
>
> Grouping sets are broken for >32 columns because an Int is used for the bitmap 
> (and for the GROUPING__ID virtual column). This assumption breaks grouping 
> sets/rollups/cubes when the number of participating aggregation columns is >32. 
> The easier fix would be to extend it to Long for now. The correct fix would be 
> to use BitSets everywhere, but that would require changing the GROUPING__ID 
> column type to binary, which would make predicates on GROUPING__ID difficult 
> to deal with. 
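
To illustrate why a 32-bit bitmap stops working past 32 columns, here is a small sketch of the underlying Java shift behavior. It is not Hive's code; the class and method names are made up for illustration, and it only assumes that each grouping column is assigned one bit by its position.

```java
public class GroupingIdWidth {
    // Bit for the column at `position` in a 32-bit bitmap.
    // Java masks an int shift distance to its low 5 bits, so position 32 wraps to bit 0.
    static int intBit(int position) {
        return 1 << position;
    }

    // Same bit in a 64-bit bitmap; long shift distances are masked to 6 bits,
    // so positions 0..63 each map to a distinct bit.
    static long longBit(int position) {
        return 1L << position;
    }

    public static void main(String[] args) {
        System.out.println(intBit(32));  // prints 1, colliding with position 0
        System.out.println(longBit(32)); // prints 4294967296
    }
}
```

Widening to long moves the same limit out to 64 columns, which is why the description calls it the easier fix "for now".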



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
