Actually

add file

is the correct command.

Ashish

________________________________
From: Min Zhou [mailto:[email protected]]
Sent: Monday, May 25, 2009 1:11 AM
To: [email protected]
Subject: Re: Hive and Hadoop streaming

Hey Zheng,

I don't think Hive supports the 'add jar' command right now, because the code for this issue has not been committed yet. Check it out at:
https://issues.apache.org/jira/browse/HIVE-338

On Mon, May 25, 2009 at 3:59 PM, Manhee Jo <[email protected]> wrote:

Thank you so much!!!

----- Original Message -----
From: Zheng Shao
To: [email protected]
Sent: Monday, May 25, 2009 4:33 PM
Subject: Re: Hive and Hadoop streaming

In this case, you just need to compile your .java into a jar file and do:

add jar fullpath/to/myprogram.jar;

SELECT TRANSFORM(col1, col2, col3, col4)
USING 'java -cp myprogram.jar WeekdayMapper'
AS (outcol1, outcol2, outcol3, outcol4)

Let us know if it works out or not.

Zheng

On Sun, May 24, 2009 at 10:50 PM, Manhee Jo <[email protected]> wrote:

Thank you Zheng,

Here is my WeekdayMapper.java, which is just a test that does almost the same thing as the "weekday_mapper.py" does. As you see below, it takes neither a WritableComparable nor a Writable class; it receives the 4 columns as plain string arguments. Any advice would be much appreciated.
/**
 * WeekdayMapper.java
 */
import java.io.*;
import java.util.*;

class WeekdayMapper {
    public static void main(String[] args) throws IOException {
        Scanner stdIn = new Scanner(System.in);
        String line = null;
        String[] column;
        long unixTime;
        Date d;
        GregorianCalendar cal1 = new GregorianCalendar();
        while (stdIn.hasNext()) {
            line = stdIn.nextLine();
            column = line.split("\t");
            unixTime = Long.parseLong(column[3]);   // unixtime is the 4th input column
            d = new Date(unixTime * 1000);          // Date expects milliseconds
            cal1.setTime(d);
            int dow = cal1.get(Calendar.DAY_OF_WEEK);
            System.out.println(column[0] + "\t" + column[1] + "\t" + column[2] + "\t" + dow);
        }
    }
}

Thanks,
Manhee

----- Original Message -----
From: Zheng Shao
To: [email protected]
Sent: Monday, May 25, 2009 10:28 AM
Subject: Re: Hive and Hadoop streaming

How does your Java map function receive the 4 columns? I assume your Java map function takes a WritableComparable key and a Writable value.

Zheng

2009/5/24 Manhee Jo <[email protected]>

I have some mappers already coded in Java, so I want to reuse them as much as possible in the Hive environment. How can I call a Java mapper from "select transform" in Hive? For example, what is wrong with the query below, and why?

INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'java WeekdayMapper'
  AS (userid, movieid, rating, weekday)
FROM u_data;

Thank you.

Regards,
Manhee

--
Yours,
Zheng

--
My research interests are distributed systems, parallel computing, and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
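[Editor's note: the step the whole thread hinges on is the unix-time-to-weekday conversion inside the mapper. A minimal, self-contained sketch of just that step is below; it uses java.util.Calendar as the original does, but pins the timezone to UTC and uses a fixed timestamp (both assumptions not in the original) so the result is deterministic.]

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class WeekdayDemo {
    public static void main(String[] args) {
        // 1243209600 is 2009-05-25 00:00:00 UTC, a Monday (unix time in seconds).
        long unixTime = 1243209600L;
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.setTime(new Date(unixTime * 1000));   // Date takes milliseconds
        int dow = cal.get(Calendar.DAY_OF_WEEK);  // Sunday = 1 ... Saturday = 7
        System.out.println(dow);                  // prints 2 (Monday)
    }
}
```

Note that without an explicit TimeZone the mapper's output depends on the JVM's default zone, which can differ across cluster nodes.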
