[ 
https://issues.apache.org/jira/browse/HADOOP-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574010#action_12574010
 ] 

Owen O'Malley commented on HADOOP-1230:
---------------------------------------

{quote}
  1. Closeable is passed a context - is this needed? Also, is it true that not 
all methods will work there - e.g. collect?
{quote}

For pipes and streaming to work, it has to be valid to call collect up until 
close finishes. Applications of a certain style trail the input, so they need 
to handle the last set of records in close. Currently, most of them keep a 
handle on the collector to use in close, but it is better to give it to them 
explicitly.
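
As a rough illustration of why close needs the context, here is a minimal sketch of a mapper whose output trails its input. The MapContext and Mapper interfaces below are stripped-down, hypothetical stand-ins (they are not the signatures from the patch), and BatchingMapper and BatchingDemo are invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, stripped-down context: only the calls this sketch needs.
interface MapContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  KEYIN getKey();
  VALUEIN getValue();
  void collect(KEYOUT key, VALUEOUT value) throws IOException;
}

// Hypothetical mapper shape: close() receives the same context as map().
interface Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  void map(MapContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> context) throws IOException;
  void close(MapContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> context) throws IOException;
}

// Buffers values and emits them in batches, so its output trails the input.
class BatchingMapper implements Mapper<Integer, String, Integer, String> {
  private final int batchSize;
  private final List<String> buffer = new ArrayList<>();
  private int batches = 0;

  BatchingMapper(int batchSize) {
    this.batchSize = batchSize;
  }

  public void map(MapContext<Integer, String, Integer, String> context) throws IOException {
    buffer.add(context.getValue());
    if (buffer.size() == batchSize) {
      flush(context);
    }
  }

  // The last, possibly partial, batch can only be emitted here.
  public void close(MapContext<Integer, String, Integer, String> context) throws IOException {
    if (!buffer.isEmpty()) {
      flush(context);
    }
  }

  private void flush(MapContext<Integer, String, Integer, String> context) throws IOException {
    context.collect(batches++, String.join(",", buffer));
    buffer.clear();
  }
}

// In-memory driver, for illustration only.
class BatchingDemo {
  static List<String> run(List<String> input, int batchSize) {
    final List<String> out = new ArrayList<>();
    final String[] current = new String[1];
    MapContext<Integer, String, Integer, String> ctx =
        new MapContext<Integer, String, Integer, String>() {
          public Integer getKey() { return 0; }
          public String getValue() { return current[0]; }
          public void collect(Integer key, String value) { out.add(key + ":" + value); }
        };
    BatchingMapper mapper = new BatchingMapper(batchSize);
    try {
      for (String value : input) {
        current[0] = value;
        mapper.map(ctx);
      }
      mapper.close(ctx); // collect is still called during close
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(run(Arrays.asList("a", "b", "c"), 2));
  }
}
```

With three inputs and a batch size of 2, the first batch is emitted from map and the leftover record is emitted from close. The mapper never has to stash a collector reference of its own.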

{quote}
   2. We've lost the Iterator interface in Reducers - it would be nice to keep 
this, if possible, as it's a standard Java idiom and people expect to be able 
to iterate using "foreach".
{quote}

Hmm. I forgot that we do that. I guess I'd prefer to add a new method rather 
than make ReduceContext implement Iterator.

{code}
   Iterator<VALUEIN> getValues() throws IOException;
{code}
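
To make the trade-off concrete, here is a small sketch of how a reducer might consume such a method. The ReduceContext below is a reduced, hypothetical version that has only getKey() and the proposed getValues(); SumReducer and SumReducerDemo are invented for the example. One caveat: Java's enhanced for loop requires an Iterable, not an Iterator, so the foreach variant wraps the iterator in a one-shot Iterable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, reduced ReduceContext: the proposed accessor plus a key getter.
interface ReduceContext<KEYIN, VALUEIN> {
  KEYIN getKey();
  Iterator<VALUEIN> getValues() throws IOException;
}

class SumReducer {
  // Plain Iterator usage: works directly with the proposed signature.
  static int sumWithIterator(ReduceContext<String, Integer> context) throws IOException {
    int sum = 0;
    Iterator<Integer> values = context.getValues();
    while (values.hasNext()) {
      sum += values.next();
    }
    return sum;
  }

  // foreach usage: the Iterator is wrapped in an Iterable that can be
  // traversed only once, because it always hands back the same iterator.
  static int sumWithForEach(ReduceContext<String, Integer> context) throws IOException {
    final Iterator<Integer> values = context.getValues();
    Iterable<Integer> once = () -> values;
    int sum = 0;
    for (int value : once) {
      sum += value;
    }
    return sum;
  }
}

// In-memory driver, for illustration only.
class SumReducerDemo {
  static ReduceContext<String, Integer> contextOf(final String key, final List<Integer> values) {
    return new ReduceContext<String, Integer>() {
      public String getKey() { return key; }
      public Iterator<Integer> getValues() { return values.iterator(); }
    };
  }

  // Returns { sum via Iterator, sum via foreach } for the given values.
  static int[] sums(List<Integer> values) {
    try {
      ReduceContext<String, Integer> context = contextOf("k", values);
      return new int[] {
        SumReducer.sumWithIterator(context),
        SumReducer.sumWithForEach(context)
      };
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(sums(Arrays.asList(1, 2, 3))));
  }
}
```

Both variants compute the same sum. If foreach support is the goal, returning an Iterable from the context would avoid the wrapper, at the cost of implying the values can be traversed more than once.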

{quote}
   3. If we're creating a new JobConf then it might be a good opportunity to 
reconsider its interface if there are things we want to change (not sure if 
this is true).
{quote}

Going through JobConf is another big job that should probably be done at some 
point. The width of that interface makes it somewhat problematic.

{quote}
   4. The formal type parameters have lowercase letters, leading to possible 
confusion with types. From the Java Generics Tutorial:
{quote}

I really find the types hard to read if they are K1, V1. I guess I could make 
them all uppercase KEYIN, VALUEIN, KEYOUT, VALUEOUT. 



> Replace parameters with context objects in Mapper, Reducer, Partitioner, 
> InputFormat, and OutputFormat classes
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1230
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1230
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: context-objs.patch
>
>
> This is a big change, but it will future-proof our APIs. To maintain 
> backwards compatibility, I'd suggest that we move over to a new package name 
> (org.apache.hadoop.mapreduce) and deprecate the old interfaces and package. 
> Basically, it will replace:
> package org.apache.hadoop.mapred;
> public interface Mapper extends JobConfigurable, Closeable {
>   void map(WritableComparable key, Writable value, OutputCollector output, 
> Reporter reporter) throws IOException;
> }
> with:
> package org.apache.hadoop.mapreduce;
> public interface Mapper extends Closeable {
>   void map(MapContext context) throws IOException;
> }
> where MapContext has methods like getKey(), getValue(), collect(Key, 
> Value), progress(), etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
