    [ http://issues.apache.org/jira/browse/HADOOP-403?page=comments#action_12425381 ]

Owen O'Malley commented on HADOOP-403:
--------------------------------------

I'm very uncomfortable with passing the Reporter and OutputCollector via the JobConf. It does two bad things:

  1. It passes "real" Java objects around in the JobConf, which breaks the assumption that the JobConf can be serialized successfully. (In this case it is OK because it won't cross the process boundary, but it breaks developers' expectations.)
  2. It hides the information that application writers need in a hard-to-find place. If I look at Mapper or Reducer, I won't see what I have available. Only if I scan through the HUGE JobConf API will I see that they are available.

I strongly suggest that we just take the hit and extend the Closeable interface. I'd propose:

  1. Deprecating the Closeable.close() method.
  2. Adding a new Closeable.close(OutputCollector, Reporter) method.
  3. Providing a default implementation in MapReduceBase that calls the close() method.

That should minimize the breakage in user code and still make the intended interface clear.

> close method in a Mapper should be provided with OutputCollector and a
> Reporter
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-403
>                 URL: http://issues.apache.org/jira/browse/HADOOP-403
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.5.0
>         Environment: all
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.5.0
>
>
> For mappers with side-effects, or mappers that work as aggregators (i.e. no
> output on individual key-value pairs, but an aggregate output at the end of
> all key-value pairs), output should be performed in the close method. For
> this purpose, we need to supply output collector and reporter to the close
> method of Mapper. This involves interface change, though. Thoughts?
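
The three-step proposal in the comment above could look roughly like the sketch below. This is only an illustration under stated assumptions: the OutputCollector, Reporter, Closeable and MapReduceBase types here are simplified stand-ins written for this example (String keys and values, a single setStatus method), not the real org.apache.hadoop.mapred signatures, and CountingMapper is a hypothetical aggregating mapper.

```java
import java.io.IOException;

// Simplified stand-ins for the mapred types named in the comment.
// Assumption: these are NOT the real org.apache.hadoop.mapred signatures.
interface OutputCollector {
    void collect(String key, String value) throws IOException;
}

interface Reporter {
    void setStatus(String status) throws IOException;
}

interface Closeable {
    /** Step 1: the old no-argument close() stays, but is deprecated. */
    @Deprecated
    void close() throws IOException;

    /** Step 2: the new close() receives the collector and the reporter. */
    void close(OutputCollector output, Reporter reporter) throws IOException;
}

/** Step 3: the base class forwards the new method to the old one. */
class MapReduceBase implements Closeable {
    @Deprecated
    public void close() throws IOException {
        // Default: nothing to clean up.
    }

    public void close(OutputCollector output, Reporter reporter) throws IOException {
        // Existing subclasses that only override close() keep working.
        close();
    }
}

/** Hypothetical aggregating mapper: emits a single total when it is closed. */
class CountingMapper extends MapReduceBase {
    private long records = 0;

    public void map(String key, String value) {
        records++; // no per-record output
    }

    @Override
    public void close(OutputCollector output, Reporter reporter) throws IOException {
        output.collect("records", Long.toString(records));
        reporter.setStatus("emitted aggregate");
    }
}
```

With this shape, user code that extends MapReduceBase and overrides only close() still compiles and is still called at shutdown, while new code can override the two-argument method to emit aggregate output. Classes that implement Closeable directly, without MapReduceBase, would still need to add the new method.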
