[jira] [Commented] (HIVE-12744) GROUPING__ID failed to be recognized in multiple insert

Pengcheng Xiong (JIRA) Mon, 28 Dec 2015 15:06:06 -0800

    [ https://issues.apache.org/jira/browse/HIVE-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073242#comment-15073242 ]


Pengcheng Xiong commented on HIVE-12744:
----------------------------------------

[~busyjay], I have attached a patch here. A simple workaround is to set 
hive.multigroupby.singlereducer=false. The problem is as follows. Your query 
is a multi-insert in which each insert contains a group by, and the group by 
clauses use grouping sets (one of them through "with cube"). By default, Hive 
turns on the "HIVEMULTIGROUPBYSINGLEREDUCER" flag, i.e., 
hive.multigroupby.singlereducer. When this flag is on, Hive tries to optimize 
a multi group by query into a single M/R job plan. In that single M/R plan 
there is no map-side aggregation. However, grouping sets depend on map-side 
aggregation, as the error message says: "Grouping sets aggregations (with 
rollups or cubes) are not allowed if map-side aggregation is turned off. Set 
hive.map.aggr=true if you want to use grouping sets". So you have to choose 
between grouping sets and the multi group by optimization. I would recommend 
turning the optimization off, i.e., set hive.multigroupby.singlereducer=false. 
Thanks.
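
For reference, here is a sketch of the workaround applied to the failing 
statement from the report. It assumes the same three temporary tables and the 
same session. The explicit hive.map.aggr line is an addition for clarity and 
is not part of the original report. I have not re-run this here, so treat it 
as an illustration of the setting rather than a verified result.

```sql
-- Workaround from the comment above: turn off the multi group by
-- single-reducer optimization for this session.
set hive.multigroupby.singlereducer=false;

-- Assumption: map-side aggregation is kept on, since grouping sets need it.
-- Set explicitly here only to make that dependency visible.
set hive.map.aggr=true;

-- The multi-insert statement from the report, unchanged.
from testtable1
insert into table testtable2 select
    id, GROUPING__ID
group by id, name with cube
insert into table testtable3 select
    id, name
group by id, name grouping sets ((id), (id, name));
```

The trade-off is that, with the optimization off, Hive no longer tries to 
compile the inserts into one M/R job.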

> GROUPING__ID failed to be recognized in multiple insert
> -------------------------------------------------------
>
>                 Key: HIVE-12744
>                 URL: https://issues.apache.org/jira/browse/HIVE-12744
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.2.1
>         Environment: apache hive 1.2.1
> apache hadoop 2.6.2
>            Reporter: Jay Lee
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-12744.01.patch
>
>
> When using multiple insert with multiple group by, GROUPING__ID fails to 
> be parsed.
> hive> create temporary table testtable3 (id string, name string);
> OK
> Time taken: 1.019 seconds
> hive> create temporary table testtable2 (id string, name string);
> OK
> Time taken: 0.069 seconds
> hive> create temporary table testtable1 (id string, name string);
> OK
> Time taken: 0.066 seconds
> hive> insert into table testtable1 values ("id", "2333");
> ...
> OK
> Time taken: 32.515 seconds
> hive> from testtable1
>     > insert into table testtable2 select
>     >     id, GROUPING__ID
>     > group by id, name with cube;
> ...
> OK
> Time taken: 42.032 seconds
> hive> from testtable1
>     > insert into table testtable2 select
>     >     id, GROUPING__ID
>     > group by id, name with cube
>     > insert into table testtable3 select
>     >     id, name
>     > group by id, name grouping sets ((id), (id, name));
> FAILED: SemanticException [Error 10025]: Line 3:8 Expression not in GROUP BY 
> key 'GROUPING__ID'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
