[ https://issues.apache.org/jira/browse/HIVE-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006995#comment-13006995 ]
Amareshwari Sriramadasu commented on HIVE-2056:
-----------------------------------------------

Here is a request from one of our customers:

Here is a real example of the need for multi group by with 1 M/R. If you look at the query below, we have two aggregates being generated out of a single fact table. The first aggregate generates a unique count by date, and the second generates a unique count by date and gender. We have a lot of these aggregates to build. We would like this to be done in 1 M/R job instead of the three shown below. Is it possible to do this in Hive?

// created two intermediate tables
hive> create table test_1 (dt string, bc_cnt bigint);
OK
Time taken: 9.004 seconds
hive> create table test_2 (dt string, gender string, bc_cnt bigint);
OK

// multi group by in insert statement
hive> from fact_table f
    > insert overwrite table test_1 select dt, count(distinct id) group by dt
    > insert overwrite table test_2 select dt,gender,count(distinct id) group by dt,gender;
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks not specified. Estimated from input data size: 999
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>

Thanks
Sudhish

> Generate single MR job for multi groupby query.
> -----------------------------------------------
>
>                 Key: HIVE-2056
>                 URL: https://issues.apache.org/jira/browse/HIVE-2056
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira