[jira] [Commented] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

Irwin (JIRA) Wed, 26 Jun 2013 18:50:36 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694419#comment-13694419
 ]


Irwin commented on HIVE-3552:
-----------------------------

I have tested cubes and rollups, but the queries fail.
My table is t1, defined as follows:
CREATE TABLE T1(a STRING, b STRING, c STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' STORED AS TEXTFILE;
Data:
a1 b1 1
a2 b2 2
a3 b3 3
a4 b4 4
a5 b5 5
a6 b6 6
a7 b7 7
a1 b1 2
a1 b1 3
a2 b2 1
a2 b2 5
The SQL is:
hive> SELECT a,b,count(1) from t1 GROUP BY a,b WITH CUBE;
The error message is:
FAILED: Parse Error: line 1:37 cannot recognize input near 'b' 'WITH' 'CUBE' in expression specification
I have tried both hive-0.10.0 and hive-0.11.0, and the error message is the same.
Why can't I use Enhanced Aggregation, Cube, Grouping and Rollup?
Can anyone help? Thanks!
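
For reference, here is a minimal Python sketch of the result that a GROUP BY a,b WITH CUBE query is expected to compute on the sample rows above, with rolled-up columns represented as None (Hive reports them as NULL). The names `rows` and `cube_counts` are hypothetical and exist only for this illustration; the sketch models the semantics of the query, not Hive's implementation, and it does not explain the parse error.

```python
from collections import Counter
from itertools import product

# The sample rows from table t1 (columns a, b, c).
rows = [
    ("a1", "b1", "1"), ("a2", "b2", "2"), ("a3", "b3", "3"),
    ("a4", "b4", "4"), ("a5", "b5", "5"), ("a6", "b6", "6"),
    ("a7", "b7", "7"), ("a1", "b1", "2"), ("a1", "b1", "3"),
    ("a2", "b2", "1"), ("a2", "b2", "5"),
]

def cube_counts(rows):
    """Count rows per (a, b) for every grouping set of a cube on a and b."""
    counts = Counter()
    for a, b, _c in rows:
        # A cube on 2 columns has 2^2 = 4 grouping sets:
        # (a, b), (a, NULL), (NULL, b), (NULL, NULL).
        for keep_a, keep_b in product((True, False), repeat=2):
            key = (a if keep_a else None, b if keep_b else None)
            counts[key] += 1
    return counts

counts = cube_counts(rows)
print(counts[(None, None)])   # grand total: 11
print(counts[("a1", "b1")])   # 3
print(counts[("a1", None)])   # 3
print(len(counts))            # 22 result rows in total
```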

                
> HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a 
> high number of grouping set keys
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3552
>                 URL: https://issues.apache.org/jira/browse/HIVE-3552
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.11.0
>
>         Attachments: hive.3552.10.patch, hive.3552.11.patch, 
> hive.3552.12.patch, hive.3552.1.patch, hive.3552.2.patch, hive.3552.3.patch, 
> hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, 
> hive.3552.8.patch, hive.3552.9.patch
>
>
> This is a follow-up to HIVE-3433.
> Had an offline discussion with Sambavi. She pointed out a scenario where the
> implementation in HIVE-3433 will not scale. Assume that the user is performing
> a cube on many columns, say 8 columns. Each input row would then generate
> 2^8 = 256 rows for the hash table, which may kill the current group by
> implementation.
> A better implementation would be to add an additional MR job. In the first
> MR job, perform the group by as if there were no cube. In the second MR job,
> perform the cube. The assumption is that the group by would have decreased
> the output data significantly, and that the rows would appear in the order of
> the grouping keys, which gives a higher probability of hitting the hash table.
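
The scaling argument in the description can be illustrated with a small Python sketch. It is an in-memory model under stated assumptions, not the code in the attached patches: the helper names (`expand`, `cube_single_stage`, `cube_two_stage`) are hypothetical, the aggregate is a plain count, and the effect of row ordering on hash table hits is not modelled. It only compares how many cube-expanded rows each approach pushes into the hash table.

```python
from collections import Counter
from itertools import product

def expand(key):
    """Yield all 2^n cube variants of a key tuple (rolled-up columns become None)."""
    for mask in product((True, False), repeat=len(key)):
        yield tuple(v if keep else None for v, keep in zip(key, mask))

def cube_single_stage(rows):
    """Expand every input row directly, as in the single-job approach."""
    counts, updates = Counter(), 0
    for key in rows:
        for variant in expand(key):
            counts[variant] += 1
            updates += 1
    return counts, updates

def cube_two_stage(rows):
    """Group by the full key first, then expand only the grouped rows."""
    grouped = Counter(rows)            # stage 1: plain group by, no cube
    counts, updates = Counter(), 0
    for key, n in grouped.items():     # stage 2: cube over the reduced data
        for variant in expand(key):
            counts[variant] += n
            updates += 1               # counts cube expansions only, not stage 1
    return counts, updates

# Hypothetical input: 1000 rows, 8 grouping columns, 10 distinct keys.
rows = [tuple(f"k{i % 10}" for _ in range(8)) for i in range(1000)]
single_counts, single_updates = cube_single_stage(rows)
two_counts, two_updates = cube_two_stage(rows)
print(single_updates)                  # 1000 * 256 = 256000
print(two_updates)                     # 10 * 256 = 2560
print(single_counts == two_counts)     # True
```

Both functions return the same counts. The benefit of the second job depends entirely on how much the initial group by reduces the data: if almost every row has a distinct key, the two-stage variant expands nearly as many rows as the single-stage one.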

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
