Eugene Koifman created HIVE-7817:
------------------------------------

             Summary: distinct/group by don't work on partition columns
                 Key: HIVE-7817
                 URL: https://issues.apache.org/jira/browse/HIVE-7817
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.14.0
            Reporter: Eugene Koifman

Suppose you have a table like this:
{code:sql}
CREATE TABLE page_view(
    viewTime INT,
    userid BIGINT,
    page_url STRING,
    referrer_url STRING,
    ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) INTO 4 BUCKETS;
{code}

Then
{code:sql}
select distinct dt from page_view;
select distinct dt, country from page_view;
select dt, country from page_view group by dt, country;
{code}

all fail with
{noformat}
Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-08-20 17:26:13,018 Stage-1 map = 0%, reduce = 0%
Ended Job = job_local165359429_0013 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
{noformat}

but
{code:sql}
select dt, country, count(*) from page_view group by dt, country;
{code}

works fine.

--
This message was sent by Atlassian JIRA
(v6.2#6252)