[jira] Updated: (AVRO-581) java: add reducer that separates keys and values when map output is pairs

Doug Cutting (JIRA) Fri, 16 Jul 2010 13:40:47 -0700

    [ https://issues.apache.org/jira/browse/AVRO-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Doug Cutting updated AVRO-581:
------------------------------

    Attachment: AVRO-581.patch

This patch implements the proposal.  It also includes the patch for AVRO-580.

Map inputs and final outputs may be of any type.  For jobs that are not 
map-only, map outputs must be Pairs.  The default mapper casts its inputs to 
Pairs.  The default reducer pairs every value with its key.
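
As a rough illustration of the two defaults described above, here is a minimal 
sketch in plain Java.  It does not use the classes from the patch: 
java.util.Map.Entry stands in for Pair, and the names DefaultsSketch, 
defaultMap and defaultReduce are hypothetical, chosen only for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; Map.Entry is an assumed stand-in for the patch's Pair.
class DefaultsSketch {

    // Default-mapper behaviour: the input is assumed to already be a pair,
    // so it is cast and passed through unchanged.
    @SuppressWarnings("unchecked")
    static <K, V> Map.Entry<K, V> defaultMap(Object input) {
        return (Map.Entry<K, V>) input;
    }

    // Default-reducer behaviour: emit one (key, value) pair per value.
    static <K, V> List<Map.Entry<K, V>> defaultReduce(K key, Iterable<V> values) {
        List<Map.Entry<K, V>> out = new ArrayList<Map.Entry<K, V>>();
        for (V value : values) {
            out.add(new SimpleEntry<K, V>(key, value));
        }
        return out;
    }

    public static void main(String[] args) {
        Object input = new SimpleEntry<String, Integer>("avro", 1);
        Map.Entry<String, Integer> pair = defaultMap(input);
        // All values grouped under one key are re-paired with that key.
        System.out.println(defaultReduce(pair.getKey(), Arrays.asList(1, 1, 1)));
    }
}
```

With these defaults a job that sets neither a mapper nor a reducer would pass 
its pairs through unchanged, apart from the grouping by key done in between.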

TestWordCount illustrates this new API.  The prior mapreduce API in Avro is 
replaced.


> java: add reducer that separates keys and values when map output is pairs
> -------------------------------------------------------------------------
>
>                 Key: AVRO-581
>                 URL: https://issues.apache.org/jira/browse/AVRO-581
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>
>         Attachments: AVRO-581.patch
>
>
> We should add a Pair<K,V> class, implementing SpecificRecord, that combines 
> instances of two schemas (specific or generic).  Pairs would be compared by 
> key, ignoring value.  The template for its schema would be:
> {code}
> {"type": "record", "name": "org.apache.avro.mapred.Pair", "fields":[
>   {"name": "key", "type": <<insert key schema here>>},
>   {"name": "value", "order": "ignore", "type": <<insert value schema>>}
> ]}
> {code}
> When map outputs are instances of this class, a reducer may be used whose 
> reduce method is something like:
> public abstract void reduce(K key, Iterable<V> values);
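
The idea in the description quoted above can be sketched in plain Java as 
follows.  This is an illustration under stated assumptions, not the proposed 
implementation: PairSketch is a hypothetical stand-in that does not implement 
SpecificRecord, groupByKey imitates the grouping a MapReduce framework performs 
between map and reduce, and the reduce shown returns its result rather than 
emitting it to a collector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the proposed Pair; not the Avro class.
final class PairSketch<K extends Comparable<K>, V>
        implements Comparable<PairSketch<K, V>> {
    final K key;
    final V value;

    PairSketch(K key, V value) {
        this.key = key;
        this.value = value;
    }

    // Compare by key only; the value is ignored, mirroring "order": "ignore".
    @Override
    public int compareTo(PairSketch<K, V> other) {
        return key.compareTo(other.key);
    }
}

class KeyValueReduceSketch {

    // Imitates the shuffle: collects the values of all pairs sharing a key.
    static <K extends Comparable<K>, V> Map<K, List<V>> groupByKey(
            List<PairSketch<K, V>> pairs) {
        Map<K, List<V>> groups = new TreeMap<K, List<V>>();
        for (PairSketch<K, V> pair : pairs) {
            List<V> values = groups.get(pair.key);
            if (values == null) {
                values = new ArrayList<V>();
                groups.put(pair.key, values);
            }
            values.add(pair.value);
        }
        return groups;
    }

    // A reduce in the shape reduce(K key, Iterable<V> values): sums the counts.
    static long reduce(String key, Iterable<Long> values) {
        long sum = 0;
        for (long value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<PairSketch<String, Long>> mapOutput = Arrays.asList(
                new PairSketch<String, Long>("avro", 1L),
                new PairSketch<String, Long>("pair", 1L),
                new PairSketch<String, Long>("avro", 1L));
        for (Map.Entry<String, List<Long>> group : groupByKey(mapOutput).entrySet()) {
            System.out.println(group.getKey() + "\t" + reduce(group.getKey(), group.getValue()));
        }
    }
}
```

Because the comparison ignores the value, pairs that share a key sort together 
whatever their values are, which is what lets the reducer receive the key once 
along with all of its values.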

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
