Sorry for the delay. Here is the relevant output from my /tmp/root/hive.log file. Are there any other files I should be looking into?
2010-02-18 00:29:56,082 WARN  mapred.JobClient (JobClient.java:configureCommandLineOptions(580)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-02-18 00:30:39,506 ERROR exec.ExecDriver (SessionState.java:printError(279)) - Ended Job = job_201002171050_0011 with errors
2010-02-18 00:30:39,514 ERROR ql.Driver (SessionState.java:printError(279)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver

On Wed, Feb 17, 2010 at 6:36 PM, Sonal Goyal <[email protected]> wrote:
> Hi,
>
> What do your Hive logs say? You can also check the Hadoop mapper and
> reducer task logs.
>
> Thanks and Regards,
> Sonal
>
>
> On Wed, Feb 17, 2010 at 4:18 PM, prasenjit mukherjee <[email protected]> wrote:
>>
>> Here is my stderr:
>>
>> hive> insert overwrite local directory '/tmp/mystuff' select transform(*)
>> using 'my.py' FROM myhivetable;
>> Total MapReduce jobs = 1
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> Starting Job = job_201002160457_0033, Tracking URL =
>> http://ec2-204-236-205-98.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201002160457_0033
>> Kill Command = /usr/lib/hadoop/bin/hadoop job
>> -Dmapred.job.tracker=ec2-204-236-205-98.compute-1.amazonaws.com:8021
>> -kill job_201002160457_0033
>> 2010-02-17 05:40:28,380 map = 0%, reduce = 0%
>> 2010-02-17 05:41:12,469 map = 100%, reduce = 100%
>> Ended Job = job_201002160457_0033 with errors
>> FAILED: Execution Error, return code 2 from
>> org.apache.hadoop.hive.ql.exec.ExecDriver
>>
>> I am running the following Hive QL:
>>
>> add file /root/my.py;
>> insert overwrite local directory '/tmp/mystuff' select transform(*)
>> using 'my.py' FROM myhivetable;
>>
>> and the following is my my.py:
>>
>> #!/usr/bin/python
>> import sys
>> for line in sys.stdin:
>>     line = line.strip()
>>     flds = line.split('\t')
>>     (cl_id, cook_id) = flds[:2]
>>     sub_id = cl_id
>>     if cl_id.startswith('foo'): sub_id = cook_id
>>     print ','.join([sub_id, flds[2], flds[3]])
>>
>> This works fine when I test it on the command line with:
>>
>> echo -e 'aa\tbb\tcc\tdd' | /root/my.py
>>
>> Any pointers?
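One thing the echo test above would not catch: Hive hands a transform script its raw row serialization, so a row with fewer than four tab-separated fields, or a NULL column (which Hive emits as the literal \N), makes flds[2] or flds[3] raise an IndexError. The task then dies, the client reports only "return code 2", and the actual Python traceback lands in the failed task's stderr, viewable from the Tracking URL above. Below is a minimal defensive sketch of my.py under that assumption; the short-row skip and the stderr logging are illustrative additions, not part of the original script:

#!/usr/bin/python
import sys

for line in sys.stdin:
    # rstrip('\n') instead of strip(): a plain strip() would also eat
    # tabs around empty leading/trailing fields and shift the columns.
    flds = line.rstrip('\n').split('\t')
    if len(flds) < 4:
        # Log and skip short rows instead of crashing the whole task
        # with an IndexError (hypothetical handling, for illustration).
        sys.stderr.write('skipping malformed row: %r\n' % line)
        continue
    cl_id, cook_id = flds[0], flds[1]
    sub_id = cook_id if cl_id.startswith('foo') else cl_id
    print ','.join([sub_id, flds[2], flds[3]])

Re-running the same command-line test, plus a deliberately short row such as echo -e 'aa\tbb', should confirm the script no longer aborts on malformed input.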
