[
https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878542#comment-13878542
]
Lefty Leverenz commented on HIVE-3552:
--------------------------------------
This adds hive.new.job.grouping.set.cardinality to HiveConf.java and
hive-default.xml.template.
Also documented in the wiki, with a link to this JIRA ticket:
[Query Execution|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryExecution]
(search for "grouping").
> HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a
> high number of grouping set keys
> -------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-3552
> URL: https://issues.apache.org/jira/browse/HIVE-3552
> Project: Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Namit Jain
> Fix For: 0.11.0
>
> Attachments: hive.3552.1.patch, hive.3552.10.patch,
> hive.3552.11.patch, hive.3552.12.patch, hive.3552.2.patch, hive.3552.3.patch,
> hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch,
> hive.3552.8.patch, hive.3552.9.patch
>
>
> This is a follow-up to HIVE-3433.
> Had an offline discussion with Sambavi - she pointed out a scenario where the
> implementation in HIVE-3433 will not scale. Assume that the user is performing
> a cube on many columns, say 8 columns. Each input row would then generate
> 2^8 = 256 rows for the hash table, which may kill the current group by
> implementation.
> A better implementation would be to add an additional MR job: in the first
> MR job, perform the group by as if there were no cube; in the second MR job,
> perform the cube. The assumption is that the group by would have decreased
> the output data significantly, and that the rows would arrive in the order
> of the grouping keys, which gives a higher probability of hitting the hash
> table.
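
The difference between the two plans described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Hive's actual operator code: the function names are hypothetical, the aggregate is assumed to be a simple SUM (so that partial results can be re-aggregated), and the two "jobs" are plain in-memory loops.

```python
from collections import Counter
from itertools import combinations


def grouping_sets(n_keys):
    """Return all 2**n_keys subsets of key positions (the grouping sets of a cube)."""
    return [frozenset(c)
            for r in range(n_keys + 1)
            for c in combinations(range(n_keys), r)]


def expand(key, sets):
    """Yield one masked key per grouping set; columns outside the set become None."""
    for s in sets:
        yield tuple(v if i in s else None for i, v in enumerate(key))


def cube_single_stage(rows, n_keys):
    """Expand every input row into all grouping sets, then aggregate (SUM)."""
    sets = grouping_sets(n_keys)
    result, emitted = Counter(), 0
    for key, value in rows:
        for masked in expand(key, sets):
            result[masked] += value
            emitted += 1
    return result, emitted


def cube_two_stage(rows, n_keys):
    """Stage 1: plain group by on all keys. Stage 2: expand only the grouped rows."""
    grouped = Counter()
    for key, value in rows:
        grouped[key] += value

    sets = grouping_sets(n_keys)
    result, emitted = Counter(), 0
    # Iterate in key order, mimicking the sorted output of the first job.
    for key in sorted(grouped):
        for masked in expand(key, sets):
            result[masked] += grouped[key]
            emitted += 1
    return result, emitted
```

Both functions return the same aggregates, but the second expands only the distinct key combinations left after the first group by, so the number of expanded rows drops whenever the input contains repeated keys. With 8 keys, `grouping_sets(8)` has 256 entries, which is the per-row blow-up mentioned in the description.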