You have to add the file to the query, as in the example here: http://wiki.apache.org/hadoop/Hive/GettingStarted
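
Applied to the query from the original message quoted below, a minimal sketch might look like the following. It assumes that /home/pc/mypython.py exists on the machine where the Hive client is run (node A in this thread) and that a python interpreter is on the PATH of every node; neither is confirmed in the thread.

```sql
-- Ship the script to the task nodes through the distributed cache.
-- The path is resolved on the machine running the Hive client.
add FILE /home/pc/mypython.py;

-- Refer to the script by its base name: added files are made available
-- in each task's working directory.
-- Assumption: python is on the PATH of every node.
SELECT TRANSFORM (a.col)
  USING 'python mypython.py'
  AS (col STRING)
FROM tmp_table a
WHERE a.col2 = '01';
```

With this, the script should not need to be copied by hand to nodes B and C.
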
In the wiki example, look at the part in red (the "add FILE" line):

CREATE TABLE u_data_new (
  userid INT,
  movieid INT,
  rating INT,
  weekday INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';

add FILE weekday_mapper.py;

INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'python weekday_mapper.py'
  AS (userid, movieid, rating, weekday)
FROM u_data;

SELECT weekday, COUNT(*)
FROM u_data_new
GROUP BY weekday;

2011/2/28 Jianhua Wang <[email protected]>

> Hi all,
>
> Recently I ran into a problem that I have not been able to solve after
> some effort, so I am looking for help here. Any help will be appreciated.
> Thanks!
>
> My case is as follows:
>
> I want to execute the HiveQL command:
>
> select transform(a.col) using '/home/pc/mypython.py' as (col string) from
> tmp_table a where a.col2='01';
>
> where 'mypython.py' is a Python script of mine.
>
> I have set up a Hadoop environment in a VMware machine on my single-node
> home PC, and the command works well in that single-node environment.
>
> I also have a cluster of three PC servers: nodes A, B, and C.
>
> I stored '/home/pc/mypython.py' on node A.
>
> However, every time I issue the command to the cluster, I get an error
> like this:
>
> ----------------------------------------------------------------------
> Caused by: java.io.IOException: Cannot run program "/home/pc/mypython.py":
> java.io.IOException: error=2, No such file or directory
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>     at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:279)
>     ... 20 more
> Caused by: java.io.IOException: java.io.IOException: error=2, No such
> file or directory
>     at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>     at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>     ... 21 more
> ----------------------------------------------------------------------
>
> According to the job logs, these errors were reported by node B and node
> C. It seems that the tasktrackers on B and C cannot find the script.
> On the Hive wiki, I didn't find any instructions on where to place the
> user script.
> Where should I put my script?
> Thanks in advance for any reply!
>
> 2011-03-01
>
> Jianhua Wang

--
Roberto Congiu - Data Engineer - OpenX
20 E Del Mar blvd, Pasadena, CA
