[
https://issues.apache.org/jira/browse/HADOOP-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576204#action_12576204
]
Enis Soztutar commented on HADOOP-2834:
---------------------------------------
We do not offer an Iterator for MapFiles; we use MapFile.Reader#next() instead.
Wouldn't it be better if we (a) added an Iterator to MapFile.Reader and applied
this patch, or (b) changed this patch to define MapFileOutputFormat.Reader
instead of Iterators, so that reading from MapFile and MapFileOutputFormat is
consistent?
One more minor issue: I think we should change the generics to:
{code}
private static final class IteratorEntry<K extends WritableComparable, V extends Writable>
    implements Entry<K, V> {
  ...
}
private static final class MapFileOutputFormatIterator<K extends WritableComparable, V extends Writable>
    implements Iterator<Entry<K, V>> {
  ...
}
public static <K extends WritableComparable, V extends Writable> Iterator<Entry<K, V>>
    getIterator(Path dir, Configuration conf) throws IOException {
  ...
}
{code}
so that we can use:
{code}
Iterator<Entry<Text, Text>> x = MapFileOutputFormat.getIterator(path, conf);
{code}
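To illustrate how bounded type parameters on a static method let the caller get a typed iterator without casts, here is a minimal, Hadoop-free sketch. It is not the code from the attached patches: the class name MergedParts, the Cursor helper, and the use of in-memory lists in place of part-NNNNN files are all assumptions made for illustration, and Comparable stands in for WritableComparable. It merges several individually sorted parts into one globally ascending sequence with a priority queue.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map.Entry;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical stand-in for the proposed iterator; not Hadoop code.
final class MergedParts {

  // One cursor per part: the entry currently at the head, plus the rest of that part.
  private static final class Cursor<K extends Comparable<K>, V>
      implements Comparable<Cursor<K, V>> {
    Entry<K, V> head;
    final Iterator<Entry<K, V>> rest;

    Cursor(Iterator<Entry<K, V>> rest) {
      this.rest = rest;
      this.head = rest.next(); // caller guarantees the part is non-empty
    }

    @Override
    public int compareTo(Cursor<K, V> other) {
      return head.getKey().compareTo(other.head.getKey());
    }
  }

  // Bounded type parameters on the static method; K and V are inferred at the call site.
  // Each element of 'parts' is assumed to be sorted by key already.
  public static <K extends Comparable<K>, V> Iterator<Entry<K, V>> getIterator(
      List<? extends Iterable<Entry<K, V>>> parts) {
    final PriorityQueue<Cursor<K, V>> heap = new PriorityQueue<Cursor<K, V>>();
    for (Iterable<Entry<K, V>> part : parts) {
      Iterator<Entry<K, V>> it = part.iterator();
      if (it.hasNext()) {
        heap.add(new Cursor<K, V>(it));
      }
    }
    return new Iterator<Entry<K, V>>() {
      @Override
      public boolean hasNext() {
        return !heap.isEmpty();
      }

      @Override
      public Entry<K, V> next() {
        Cursor<K, V> smallest = heap.poll();
        if (smallest == null) {
          throw new NoSuchElementException();
        }
        Entry<K, V> result = smallest.head;
        if (smallest.rest.hasNext()) {
          // Advance this part and put it back so the heap stays ordered by head key.
          smallest.head = smallest.rest.next();
          heap.add(smallest);
        }
        return result;
      }
    };
  }

  public static void main(String[] args) {
    List<Entry<String, String>> part0 = Arrays.asList(
        new SimpleImmutableEntry<String, String>("a", "1"),
        new SimpleImmutableEntry<String, String>("d", "4"));
    List<Entry<String, String>> part1 = Arrays.asList(
        new SimpleImmutableEntry<String, String>("b", "2"),
        new SimpleImmutableEntry<String, String>("c", "3"));
    // No casts needed: K and V are inferred as String.
    Iterator<Entry<String, String>> x = getIterator(Arrays.asList(part0, part1));
    while (x.hasNext()) {
      System.out.println(x.next().getKey());
    }
  }
}
```

In this sketch the type arguments are inferred from the method argument. With the getIterator(Path, Configuration) signature proposed above, they would be inferred from the assignment target alone, so the compiler cannot verify that the files really contain the requested key and value types.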
> Iterator for MapFileOutputFormat
> --------------------------------
>
> Key: HADOOP-2834
> URL: https://issues.apache.org/jira/browse/HADOOP-2834
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Andrzej Bialecki
> Fix For: 0.17.0
>
> Attachments: map-file-v2.patch, map-file-v3.patch
>
>
> MapFileOutputFormat produces output data that is sorted locally in each
> part-NNNNN file - however, there is no easy way to iterate over keys from all
> parts in a globally ascending order.