That error has been thrown in cases where a preceding phase of the Hive plan produced no results even though it did find data to scan. In your case that phase would be the filter conditions applied in the first of the 2 MR jobs. Are you certain there are records that match your day and game_id conditions? I can't verify right now whether that behavior has since been made more user-friendly.
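One quick way to check is to count the rows that pass the same filters, without the group by or the distinct. This is only a sketch: it reuses the table name and predicates from your query, and I have not run it against your data or your Hive version (drop the comment line if your CLI does not accept it).

```sql
-- Sanity check: how many rows match the day and game_id conditions?
select count(1)
from game_start a
where a.day >= '2009-06-01' and a.day < '2009-06-03'
  and (a.game_id = 501 or a.game_id = 502 or a.game_id = 504
       or a.game_id = 505 or a.game_id = 563);
```

If this comes back as 0, the first job of your original query most likely wrote nothing for the second job to read, which would be consistent with the missing input path in the error.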
On Jul 8, 2009 5:10 PM, "RSD" <[email protected]> wrote:

hive> describe game_start;
recordtime string
user_id int
session_id string
host string
release string
source string
ip string
ip_country string
game_id int
challenge int
suggestion_rank int
language string
day string

hive> select a.game_id, count(distinct(a.user_id)), count(1) from game_start a where a.day >= '2009-06-01' and a.day < '2009-06-03' and (a.game_id = 501 or a.game_id = 502 or a.game_id = 504 or a.game_id = 505 or a.game_id = 563) group by a.game_id;
Total MapReduce jobs = 2
Starting Job = job_200907071612_0568, Tracking URL = http://somehost:50030/jobdetails.jsp?jobid=job_200907071612_0568
Kill Command = /home/analytics/hadoop/dist/current/bin/hadoop job -Dmapred.job.tracker=somehost:9001 -kill job_200907071612_0568
...
Ended Job = job_200907071612_0568
Job Submission failed with exception 'Input path doesnt exist : hdfs://somehost:9000/tmp/hive-someuser/183603784.10002'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ExecDriver

there is a local directory /tmp/hive-someuser (and in /tmp/someuser/hive.log is where i am logging)... is there something wrong with the syntax of the query?
