[
https://issues.apache.org/jira/browse/HADOOP-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605300#action_12605300
]
Tom White commented on HADOOP-3566:
-----------------------------------
I propose a subinterface of RecordReader that supports retrieval of the current
key and value:
{code}
public interface NewInstanceRecordReader<K, V> extends RecordReader<K, V> {
  K getCurrentKey();
  V getCurrentValue();
}
{code}
With this change, the framework would need changes in only two places: MapTask,
to have two types of TrackedRecordReader, and MapRunner, to check the type of
the RecordReader and retrieve the current key and value if necessary.
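To make the MapRunner side concrete, here is a rough sketch of the type check. It is not the actual MapRunner code: the RecordReader below is a cut-down stand-in for org.apache.hadoop.mapred.RecordReader (no IOException, no getPos/close/getProgress), and StringLineReader and MapRunnerSketch are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Cut-down stand-in for the existing RecordReader interface (assumption:
// only the methods needed for this sketch, checked exceptions omitted).
interface RecordReader<K, V> {
    K createKey();
    V createValue();
    boolean next(K key, V value);
}

// The interface proposed above.
interface NewInstanceRecordReader<K, V> extends RecordReader<K, V> {
    K getCurrentKey();
    V getCurrentValue();
}

// Hypothetical reader over immutable Long/String pairs. Because Long and
// String cannot be filled in place, next() ignores its arguments and the
// caller must fetch the new instances afterwards.
class StringLineReader implements NewInstanceRecordReader<Long, String> {
    private final Iterator<String> lines;
    private long offset = 0;
    private Long currentKey;
    private String currentValue;

    StringLineReader(List<String> lines) {
        this.lines = lines.iterator();
    }

    public Long createKey() { return null; }
    public String createValue() { return null; }

    public boolean next(Long key, String value) {
        if (!lines.hasNext()) {
            return false;
        }
        currentValue = lines.next();
        currentKey = offset;
        offset += currentValue.length() + 1; // +1 for an assumed newline
        return true;
    }

    public Long getCurrentKey() { return currentKey; }
    public String getCurrentValue() { return currentValue; }
}

// Hypothetical stand-in for the MapRunner loop.
final class MapRunnerSketch {
    @SuppressWarnings("unchecked")
    static <K, V> List<String> run(RecordReader<K, V> reader) {
        List<String> seen = new ArrayList<>();
        K key = reader.createKey();
        V value = reader.createValue();
        while (reader.next(key, value)) {
            // Only readers of the new type hand back fresh instances;
            // existing readers keep reusing the objects created above.
            if (reader instanceof NewInstanceRecordReader) {
                NewInstanceRecordReader<K, V> r =
                    (NewInstanceRecordReader<K, V>) reader;
                key = r.getCurrentKey();
                value = r.getCurrentValue();
            }
            seen.add(key + "=" + value); // stands in for mapper.map(...)
        }
        return seen;
    }
}
```

Existing Writable-based readers would go through the same loop untouched, since the instanceof check is false for them and the reused key and value objects are passed to the mapper as before.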
Longer term, in HADOOP-1230, we could simply add these new methods to the new
RecordReader interface (which will be in a new package,
org.apache.hadoop.mapreduce).
Thoughts?
> Create an InputFormat for reading lines of text as Java Strings
> ---------------------------------------------------------------
>
> Key: HADOOP-3566
> URL: https://issues.apache.org/jira/browse/HADOOP-3566
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Tom White
> Assignee: Tom White
>
> Such a StringInputFormat would be like TextInputFormat but with input types
> of Long and String, rather than LongWritable and Text. This would allow users
> to write MapReduce programs that used only Java native types (i.e. no
> Writables).
> Such an InputFormat cannot currently be written without changes to Hadoop,
> due to a limitation in the RecordReader interface explained here:
> https://issues.apache.org/jira/browse/HADOOP-3413?focusedCommentId=12597935#action_12597935