We tried to run the test case that you gave in your email and it seems to work fine. I am running this on r793646. Can you try with that?
Thanks,
Ashish

________________________________
From: Eva Tse [mailto:[email protected]]
Sent: Friday, July 17, 2009 3:33 PM
To: [email protected]
Subject: Re: Error in running group-by and join hive query...

Ashish, it is in the attached file.

Thanks,
Eva.

On 7/17/09 1:27 PM, "Ashish Thusoo" <[email protected]> wrote:

Looks like the pathToPartitionInfo array did not get populated in your case. Can you also send the output of explain extended <query>? That will tell us the value of pathToPartitionInfo.

Ashish

________________________________
From: Eva Tse [mailto:[email protected]]
Sent: Friday, July 17, 2009 12:24 PM
To: [email protected]
Subject: Re: Error in running group-by and join hive query...

I believe this is the relevant section. Please let me know if we need add'l info.

Thanks,
Eva.

2009-07-17 13:59:30,953 ERROR ql.Driver (SessionState.java:printError(279)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ExecDriver
2009-07-17 15:16:00,718 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2009-07-17 15:16:00,718 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2009-07-17 15:16:00,718 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
2009-07-17 15:16:00,722 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2009-07-17 15:16:00,722 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2009-07-17 15:16:00,722 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
2009-07-17 15:16:00,723 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2009-07-17 15:16:00,723 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2009-07-17 15:16:00,723 ERROR JPOX.Plugin (Log4JLogger.java:error(117)) - Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
2009-07-17 15:16:05,605 WARN mapred.JobClient (JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2009-07-17 15:16:05,814 ERROR exec.ExecDriver (SessionState.java:printError(279)) - Job Submission failed with exception 'java.io.IOException(cannot find dir = hdfs://ip-10-251-49-188.ec2.internal:9000/tmp/hive-dataeng/1 in partToPartitionInfo!)'
java.io.IOException: cannot find dir = hdfs://ip-10-251-49-188.ec2.internal:9000/tmp/hive-dataeng/1 in partToPartitionInfo!
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getTableDescFromPath(HiveInputFormat.java:256)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:208)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:387)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:307)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:213)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:176)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:234)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:278)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
2009-07-17 15:16:05,821 ERROR ql.Driver (SessionState.java:printError(279)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ExecDriver

On 7/17/09 12:02 PM, "Ashish Thusoo" <[email protected]> wrote:

What does /tmp/<username>/hive.log contain?

Ashish

________________________________
From: Eva Tse [mailto:[email protected]]
Sent: Friday, July 17, 2009 11:07 AM
To: [email protected]
Subject: Error in running group-by and join hive query...

Hive version: r786648 w/ HIVE-487 2nd patch. However, it is working on Hive 0.3.

Thanks,
Eva.
Running the script in this email gives the following errors:

Hive history file=/tmp/dataeng/hive_job_log_dataeng_200907171359_1511035858.txt
OK
Time taken: 3.419 seconds
OK
Time taken: 0.211 seconds
OK
Time taken: 0.364 seconds
OK
Time taken: 0.104 seconds
Total MapReduce jobs = 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Job Submission failed with exception 'java.io.IOException(cannot find dir = hdfs://ip-10-251-49-188.ec2.internal:9000/tmp/hive-dataeng/1 in partToPartitionInfo!)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ExecDriver

Script:

drop table facts_details;
drop table facts;

CREATE TABLE facts (xid string, devtype_id int)
PARTITIONED BY (dateint int, hour int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'
  COLLECTION ITEMS TERMINATED BY '\004'
  MAP KEYS TERMINATED BY '\002'
STORED AS SEQUENCEFILE;

CREATE TABLE facts_details (xid string, cdn_name string, utc_ms array<bigint>, moff array<int>)
PARTITIONED BY (dateint int, hour int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001'
  COLLECTION ITEMS TERMINATED BY '\004'
  MAP KEYS TERMINATED BY '\002'
STORED AS SEQUENCEFILE;

select f.devtype_id
from facts f join facts_details c on (f.xid = c.xid)
where c.dateint = 20090710 and f.dateint = 20090710
group by f.devtype_id;
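[Editor's note] Ashish asks earlier in the thread for the output of explain extended <query> to see how pathToPartitionInfo is populated. For anyone reproducing this, a minimal sketch of that invocation (run in the same Hive session, against the tables defined in the script above; EXPLAIN EXTENDED only prints the plan and does not submit a job):

```sql
-- Prefix the failing statement with EXPLAIN EXTENDED to dump the full
-- query plan, including the input path / partition metadata Ashish
-- asked about, without actually running the MapReduce jobs.
EXPLAIN EXTENDED
select f.devtype_id
from facts f join facts_details c on (f.xid = c.xid)
where c.dateint = 20090710 and f.dateint = 20090710
group by f.devtype_id;
```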
