Thank you Zheng,
Here is my WeekdayMapper.java, which is just a test that does almost the same
thing as "weekday_mapper.py" does.
As you can see below, it takes neither a WritableComparable nor a Writable; it
simply receives the 4 columns as tab-separated strings on standard input.
Any advice would be much appreciated.
/**
 * WeekdayMapper.java
 */
import java.io.*;
import java.util.*;

class WeekdayMapper {
    public static void main(String[] args) throws IOException {
        Scanner stdIn = new Scanner(System.in);
        GregorianCalendar cal1 = new GregorianCalendar();
        while (stdIn.hasNextLine()) {
            // Each input line is a tab-separated record:
            // userid, movieid, rating, unixtime
            String line = stdIn.nextLine();
            String[] column = line.split("\t");
            // Unix time is in seconds; java.util.Date expects milliseconds
            long unixTime = Long.parseLong(column[3]);
            Date d = new Date(unixTime * 1000);
            cal1.setTime(d);
            int dow = cal1.get(Calendar.DAY_OF_WEEK);
            // Emit the first three columns unchanged, replacing unixtime with the weekday
            System.out.println(column[0] + "\t" + column[1] + "\t"
                    + column[2] + "\t" + dow);
        }
    }
}
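For reference, this is roughly how I imagine wiring the class into the TRANSFORM
clause. The ADD FILE step and the 'java -cp .' command line are only my
assumptions about how the compiled class would reach the task nodes and be
invoked there (I have not verified this works):

-- Assumption: ADD FILE distributes the compiled class to each task's
-- working directory on the cluster.
ADD FILE WeekdayMapper.class;

INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'java -cp . WeekdayMapper'   -- assumes java is on the task nodes' PATH
  AS (userid, movieid, rating, weekday)
FROM u_data;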
Thanks,
Manhee
----- Original Message -----
From: Zheng Shao
To: [email protected]
Sent: Monday, May 25, 2009 10:28 AM
Subject: Re: Hive and Hadoop streaming
How does your Java map function receive the 4 columns?
I assume your Java map function takes a WritableComparable key and a
Writable value.
Zheng
2009/5/24 Manhee Jo <[email protected]>
I have some mappers already coded in Java, so I want to reuse them
as much as possible in the Hive environment.
How can I call a Java mapper from "SELECT TRANSFORM" in Hive?
For example, what is wrong with the query below, and why?
INSERT OVERWRITE TABLE u_data_new
SELECT
  TRANSFORM (userid, movieid, rating, unixtime)
  USING 'java WeekdayMapper'
  AS (userid, movieid, rating, weekday)
FROM u_data;
Thank you.
Regards,
Manhee
--
Yours,
Zheng