[ 
https://issues.apache.org/jira/browse/HADOOP-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HADOOP-3566:
------------------------------

    Attachment: hadoop-3566.patch

This patch implements the proposal. The unit test TestJavaSerialization is now 
Writable-free, so you can write mappers and reducers like this:

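{code}
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

  // Both classes are static nested classes of an enclosing class.

  // Emits (token, 1) for every whitespace-separated token in the line.
  static class WordCountMapper extends MapReduceBase implements
      Mapper<Long, String, String, Long> {

    public void map(Long key, String value,
        OutputCollector<String, Long> output, Reporter reporter)
        throws IOException {
      StringTokenizer st = new StringTokenizer(value);
      while (st.hasMoreTokens()) {
        output.collect(st.nextToken(), 1L);
      }
    }

  }

  // Sums the Long values collected for each key.
  static class SumReducer<K> extends MapReduceBase implements
      Reducer<K, Long, K, Long> {

    public void reduce(K key, Iterator<Long> values,
        OutputCollector<K, Long> output, Reporter reporter)
        throws IOException {
      long sum = 0;
      while (values.hasNext()) {
        sum += values.next();
      }
      output.collect(key, sum);
    }

  }
{code}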
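
To see what the mapper and reducer above compute without a cluster, here is a minimal stand-alone sketch of the same logic using only java.util. The class LocalWordCount and its methods are hypothetical and are not part of Hadoop or of the attached patch; the in-memory grouping merely stands in for the shuffle, and the byte offsets assume single-byte characters and "\n" line endings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local harness; it only mimics the map, group, and reduce steps.
class LocalWordCount {

  // Map step: record (token, 1L) for each token, as WordCountMapper does.
  static void map(Long offset, String line, Map<String, List<Long>> grouped) {
    StringTokenizer st = new StringTokenizer(line);
    while (st.hasMoreTokens()) {
      grouped.computeIfAbsent(st.nextToken(), k -> new ArrayList<>()).add(1L);
    }
  }

  // Reduce step: sum the values for one key, as SumReducer does.
  static long reduce(Iterator<Long> values) {
    long sum = 0;
    while (values.hasNext()) {
      sum += values.next();
    }
    return sum;
  }

  // Runs map over every line, groups by key, then reduces each group.
  static Map<String, Long> run(List<String> lines) {
    Map<String, List<Long>> grouped = new TreeMap<>();
    long offset = 0;
    for (String line : lines) {
      map(offset, line, grouped);
      offset += line.length() + 1; // assumption: 1 byte per char plus "\n"
    }
    Map<String, Long> totals = new TreeMap<>();
    for (Map.Entry<String, List<Long>> e : grouped.entrySet()) {
      totals.put(e.getKey(), reduce(e.getValue().iterator()));
    }
    return totals;
  }

  public static void main(String[] args) {
    // prints {a=2, b=2, c=1}
    System.out.println(run(Arrays.asList("a b a", "b c")));
  }
}
```

Every key and value here is a plain Long or String, which is the point of the proposal: no Writable wrappers appear anywhere in user code.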

> Create an InputFormat for reading lines of text as Java Strings
> ---------------------------------------------------------------
>
>                 Key: HADOOP-3566
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3566
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: hadoop-3566.patch
>
>
> Such a StringInputFormat would be like TextInputFormat but with input types 
> of Long and String, rather than LongWritable and Text. This would allow users 
> to write MapReduce programs that use only Java native types (i.e. no 
> Writables).
> Writing such an InputFormat is currently not possible without changes to 
> Hadoop, because of a limitation in the RecordReader interface explained here: 
> https://issues.apache.org/jira/browse/HADOOP-3413?focusedCommentId=12597935#action_12597935
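
The linked comment is not reproduced in this message. Assuming the limitation it refers to is that RecordReader.next(K key, V value) fills in objects supplied by the caller, the following stand-alone sketch shows why that pattern works for mutable types but not for Long and String. ReusingReader, MutableLineReader, ImmutableLineReader, and ReuseDemo are hypothetical names for illustration only; AtomicLong and StringBuilder stand in for mutable types such as LongWritable and Text.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for the next(key, value) shape.
interface ReusingReader<K, V> {
  boolean next(K key, V value);
}

// Works: both arguments are mutable, so the reader overwrites them in place.
class MutableLineReader implements ReusingReader<AtomicLong, StringBuilder> {
  private final Iterator<String> lines;
  private long pos = 0;

  MutableLineReader(List<String> lines) {
    this.lines = lines.iterator();
  }

  public boolean next(AtomicLong key, StringBuilder value) {
    if (!lines.hasNext()) {
      return false;
    }
    String line = lines.next();
    key.set(pos);
    value.setLength(0);
    value.append(line);
    pos += line.length() + 1;
    return true;
  }
}

// Does not work: Long and String are immutable, so the reader can only
// rebind its own parameters, which the caller never observes.
class ImmutableLineReader implements ReusingReader<Long, String> {
  private final Iterator<String> lines;
  private long pos = 0;

  ImmutableLineReader(List<String> lines) {
    this.lines = lines.iterator();
  }

  public boolean next(Long key, String value) {
    if (!lines.hasNext()) {
      return false;
    }
    String line = lines.next();
    key = pos;    // local rebinding only
    value = line; // local rebinding only
    pos += line.length() + 1;
    return true;
  }
}

class ReuseDemo {
  // Returns the first line as seen by the caller of the mutable reader.
  static String readFirstMutable(List<String> lines) {
    AtomicLong key = new AtomicLong();
    StringBuilder value = new StringBuilder();
    new MutableLineReader(lines).next(key, value);
    return value.toString();
  }

  // Returns the caller's value after next(); it is still the empty string.
  static String readFirstImmutable(List<String> lines) {
    Long key = 0L;
    String value = "";
    new ImmutableLineReader(lines).next(key, value);
    return value;
  }
}
```

Under this assumption, a reader for Long and String needs some way to hand back newly created objects rather than populate existing ones, which would explain why the description says changes to Hadoop are required.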

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
