[ 
https://issues.apache.org/jira/browse/HADOOP-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673913#action_12673913
 ] 

Jothi Padmanabhan commented on HADOOP-5266:
-------------------------------------------

Users would call values.mark() and values.reset() to use this functionality. 
However, the current reduce API takes a plain Iterator as its values parameter.
{code}
public void reduce(K key, Iterator<V> values,
                          OutputCollector<K, V> output, 
                          Reporter reporter)
{code}

Since the generic Iterator does not declare mark() and reset(), how 
should this be handled? For example:
# Change the API to take a ResettableIterator that extends Iterator and 
supports these two extra methods. However, this would break existing 
applications, so it is not an option.
# Let the API keep taking Iterator, but have the framework pass in a 
ResettableIterator, which user code obtains by casting.
# Some other way?
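
To make the second option concrete, here is a rough sketch, not a proposed patch. ResettableIterator is the name used above, but its exact shape, the in-memory BufferingIterator, and the TwoPassExample helper are all hypothetical and only for illustration. The sketch also assumes that after reset() the next call to next() returns the first value that followed mark(). A real framework implementation would probably need to spill to disk rather than buffer everything in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical interface: an Iterator with the two extra methods.
interface ResettableIterator<V> extends Iterator<V> {
    void mark();   // remember the current position
    void reset();  // rewind to the position of the last mark()
}

// Hypothetical in-memory implementation: buffers every value read after mark().
class BufferingIterator<V> implements ResettableIterator<V> {
    private final Iterator<V> source;
    private final List<V> buffer = new ArrayList<>();
    private boolean marked = false;
    private int replayPos = 0; // index in buffer of the next value to replay

    BufferingIterator(Iterator<V> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return replayPos < buffer.size() || source.hasNext();
    }

    @Override
    public V next() {
        if (replayPos < buffer.size()) {
            return buffer.get(replayPos++); // replaying buffered values
        }
        V value = source.next();
        if (marked) {                       // keep it so reset() can replay it
            buffer.add(value);
            replayPos++;
        }
        return value;
    }

    @Override
    public void mark() {
        // Values already consumed before this point are no longer needed.
        buffer.subList(0, replayPos).clear();
        replayPos = 0;
        marked = true;
    }

    @Override
    public void reset() {
        if (!marked) {
            throw new IllegalStateException("mark() was not called");
        }
        replayPos = 0;
    }
}

// User code under option 2: the parameter stays a plain Iterator and is cast.
class TwoPassExample {
    static List<Integer> aboveAverage(Iterator<Integer> values) {
        // Throws ClassCastException if the framework did not pass a ResettableIterator.
        ResettableIterator<Integer> it = (ResettableIterator<Integer>) values;
        it.mark();
        long sum = 0;
        int count = 0;
        while (it.hasNext()) {              // first pass: compute the total
            sum += it.next();
            count++;
        }
        it.reset();
        List<Integer> result = new ArrayList<>();
        while (it.hasNext()) {              // second pass: filter against the average
            int v = it.next();
            if ((long) v * count > sum) {
                result.add(v);
            }
        }
        return result;
    }
}
```

The cast compiles against the unchanged Iterator-based signature, which is the appeal of this option. The cost is that a mismatch only shows up at run time as a ClassCastException.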

Thoughts?

> Values Iterator should support "mark" and "reset"
> -------------------------------------------------
>
>                 Key: HADOOP-5266
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5266
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.21.0
>
>
> Some users have expressed interest in having mark/reset functionality on 
> the values iterator. Users can call mark() at any point during the iteration 
> process, and a subsequent reset() should move the iterator back to the last 
> value emitted when mark() was called. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
