[jira] [Commented] (AVRO-1130) MapReduce Jobs can output write SortedKeyValueFiles directly

Doug Cutting (JIRA) Fri, 06 Jun 2014 14:11:29 -0700

    [ https://issues.apache.org/jira/browse/AVRO-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020379#comment-14020379 ]


Doug Cutting commented on AVRO-1130:
------------------------------------

I'd expect this to work like Hadoop's MapFileOutputFormat, which is the latter 
of your two examples.  Note that MapFileOutputFormat#getReaders() can be used 
to open all of the output files, one reader per partition.  The resulting array 
can then be indexed using the same Partitioner that was used by the MapReduce 
job, e.g.:

{code}
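// One reader per output partition, in partition order.
SortedKeyValueFile.Reader<K,V>[] readers;
// Must be the same Partitioner the MapReduce job used.
Partitioner<K,V> partitioner;

public V getValue(K key) throws IOException {
  // Arrays expose length, not size; the value argument is unused here.
  int partition = partitioner.getPartition(key, null, readers.length);
  return readers[partition].get(key);
}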
{code}
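
For anyone who wants to try the lookup idea without a cluster, here is a minimal, self-contained sketch. It is only an illustration under assumptions: the in-memory TreeMaps stand in for the per-partition SortedKeyValueFiles, and PartitionedLookup, KeyPartitioner, HashKeyPartitioner, writeParts and getValue are hypothetical names, not Avro or Hadoop APIs. The hash formula mirrors what Hadoop's HashPartitioner does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PartitionedLookup {
    /** Hypothetical stand-in for Hadoop's Partitioner (value argument omitted). */
    interface KeyPartitioner<K> {
        int getPartition(K key, int numPartitions);
    }

    /** Hash partitioning, using the same formula as Hadoop's HashPartitioner. */
    static final class HashKeyPartitioner<K> implements KeyPartitioner<K> {
        @Override
        public int getPartition(K key, int numPartitions) {
            // Mask the sign bit so the result is never negative.
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    /** Simulates the job output: one sorted map per partition. */
    static <K extends Comparable<K>, V> List<TreeMap<K, V>> writeParts(
            Map<K, V> records, KeyPartitioner<K> partitioner, int numPartitions) {
        List<TreeMap<K, V>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new TreeMap<>());
        }
        for (Map.Entry<K, V> e : records.entrySet()) {
            int p = partitioner.getPartition(e.getKey(), numPartitions);
            parts.get(p).put(e.getKey(), e.getValue());
        }
        return parts;
    }

    /** Looks a key up by consulting only the partition the partitioner selects. */
    static <K, V> V getValue(
            List<TreeMap<K, V>> parts, KeyPartitioner<K> partitioner, K key) {
        int p = partitioner.getPartition(key, parts.size());
        return parts.get(p).get(key);
    }
}
```

The lookup only works if the reader side uses the same partitioner and the same partition count as the writer side; otherwise a key may be sought in a partition that never received it.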

> MapReduce Jobs can output write SortedKeyValueFiles directly
> ------------------------------------------------------------
>
>                 Key: AVRO-1130
>                 URL: https://issues.apache.org/jira/browse/AVRO-1130
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>    Affects Versions: 1.7.1
>            Reporter: Jeremy Lewi
>            Assignee: Harsh J
>            Priority: Minor
>
> It would be nice if MapReduce jobs could write directly to 
> SortedKeyValueFiles.
> See harsh@'s response on this thread http://goo.gl/OT1rN for some more 
> information on what needs to be done.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
