Why not provide a pointer to the real record reader? Seems like a
valid OO way to get access to all kinds of things.
On Aug 8, 2006, at 3:48 PM, Owen O'Malley (JIRA) wrote:
[ http://issues.apache.org/jira/browse/HADOOP-433?page=comments#action_12426763 ]
Owen O'Malley commented on HADOOP-433:
--------------------------------------
This is largely addressed by the extensions I put into the JobConf
task localization code. Look at MapTask.localizeConfiguration.
In particular, each Mapper has available to it:
map.input.file
map.input.start
map.input.length
For application writers who don't want to read the Hadoop code,
I've put the list of attributes at:
http://wiki.apache.org/lucene-hadoop/TaskExecutionEnvironment
With these you can construct an equivalent RecordReader, even though
it is not the same object. Will that address your problem?
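As a rough illustration of the suggestion above, a mapper could read the three attributes from its localized configuration and rebuild a description of its split. The sketch below uses a plain Map<String, String> as a stand-in for the task's JobConf so that it runs without Hadoop on the classpath. The SplitInfo class and its fromConf and require methods are hypothetical names, not part of Hadoop; only the three property keys come from the message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper describing the split a map task is working on.
final class SplitInfo {
    final String file;   // value of map.input.file
    final long start;    // value of map.input.start
    final long length;   // value of map.input.length

    private SplitInfo(String file, long start, long length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }

    // Looks up a key and fails loudly if the task did not set it.
    private static String require(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null) {
            throw new IllegalStateException(key + " is not set");
        }
        return value;
    }

    // conf stands in for the task's localized JobConf (an assumption of this sketch).
    static SplitInfo fromConf(Map<String, String> conf) {
        String file = require(conf, "map.input.file");
        long start = Long.parseLong(require(conf, "map.input.start"));
        long length = Long.parseLong(require(conf, "map.input.length"));
        return new SplitInfo(file, start, length);
    }

    public static void main(String[] args) {
        // Example values only; a real task would receive these from the framework.
        Map<String, String> conf = new HashMap<>();
        conf.put("map.input.file", "/data/logs/part-00000");
        conf.put("map.input.start", "0");
        conf.put("map.input.length", "1024");

        SplitInfo split = SplitInfo.fromConf(conf);
        System.out.println(split.file + " start=" + split.start + " length=" + split.length);
    }
}
```

In a real job the same lookups would be made against the JobConf handed to the mapper, and the mapper could branch on the file name without touching the RecordReader at all.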
Better access to the RecordReader
---------------------------------
Key: HADOOP-433
URL: http://issues.apache.org/jira/browse/HADOOP-433
Project: Hadoop
Issue Type: Improvement
Components: mapred
Affects Versions: 0.5.0
Reporter: Benjamin Reed
Priority: Minor
The record reader has access to the FileSplit, which in turn can
carry information that is useful to the Mapper. For example, Map
processing may vary according to the file name or to attributes
associated with a file. Unfortunately, even with a MapRunner you
only have access to the progress wrapper around the RecordReader.
To get access to the real record reader I had to use a thread-local
variable, which I set in RecordReader.getNext(). It would be much
nicer if you could get a reference to the real RecordReader from
the RecordReader passed to MapRunner.
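The request above amounts to letting the progress wrapper hand back the reader it wraps. A minimal sketch of that idea follows. SimpleReader, FileBackedReader, ProgressTrackingReader, and getDelegate are hypothetical names used for illustration; they do not reflect Hadoop's actual RecordReader interface or its wrapper class.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified reader interface (not Hadoop's RecordReader).
interface SimpleReader {
    String next(); // returns the next record, or null at end of input
}

// A "real" reader that also knows which file it is reading.
final class FileBackedReader implements SimpleReader {
    private final String fileName;
    private final Iterator<String> records;

    FileBackedReader(String fileName, List<String> records) {
        this.fileName = fileName;
        this.records = records.iterator();
    }

    String getFileName() {
        return fileName;
    }

    @Override
    public String next() {
        return records.hasNext() ? records.next() : null;
    }
}

// Wrapper that counts records for progress reporting and exposes its delegate.
final class ProgressTrackingReader implements SimpleReader {
    private final SimpleReader delegate;
    private long recordsRead;

    ProgressTrackingReader(SimpleReader delegate) {
        this.delegate = delegate;
    }

    @Override
    public String next() {
        String record = delegate.next();
        if (record != null) {
            recordsRead++;
        }
        return record;
    }

    long getRecordsRead() {
        return recordsRead;
    }

    // The accessor the issue asks for: a reference to the wrapped reader.
    SimpleReader getDelegate() {
        return delegate;
    }
}

final class DelegateDemo {
    public static void main(String[] args) {
        SimpleReader wrapped = new ProgressTrackingReader(
                new FileBackedReader("part-00000", Arrays.asList("a", "b")));

        // Code that only sees the wrapper can still reach file-level details.
        SimpleReader real = ((ProgressTrackingReader) wrapped).getDelegate();
        if (real instanceof FileBackedReader) {
            System.out.println(((FileBackedReader) real).getFileName());
        }
    }
}
```

Compared with the thread-local workaround, an accessor like this keeps the lookup explicit, at the cost of the caller needing to know the concrete reader type it expects.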