[ https://issues.apache.org/jira/browse/ACCUMULO-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552917#comment-13552917 ]
Keith Turner commented on ACCUMULO-625:
---------------------------------------
For the unique column case, I think it would be OK if the iterator considered
more than one row. The iterator could drop any key that contains a column it
has already seen. It would start with an empty set of seen columns each time
it is initialized, so the M/R job would still need to do the final unique
pass; the iterator would just filter out a lot of duplicates first. If we
supported stateful iterators, then maybe the 100 most recently seen columns
could be maintained in that state across iterator sessions.
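
A rough sketch of the filtering idea, in plain Java rather than against the real iterator API: strings stand in for columns, and the class name `SeenColumnFilter` and its methods are hypothetical, not part of Accumulo. It keeps a bounded set of the most recently seen columns (the 100 mentioned above would be the capacity) and drops any column still in that set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not an Accumulo class): drops columns it has recently seen. */
final class SeenColumnFilter {
    // Access-ordered LinkedHashMap used as a small LRU set of column names.
    private final Map<String, Boolean> seen;

    SeenColumnFilter(final int maxRemembered) {
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                // Forget the least recently seen column once the cap is exceeded.
                return size() > maxRemembered;
            }
        };
    }

    /** Returns true if the column is not currently remembered, then remembers it. */
    boolean accept(String column) {
        return seen.put(column, Boolean.TRUE) == null;
    }

    /** Applies accept() to each column in order and keeps the ones that pass. */
    List<String> filter(List<String> columns) {
        List<String> kept = new ArrayList<>();
        for (String column : columns) {
            if (accept(column)) {
                kept.add(column);
            }
        }
        return kept;
    }
}
```

Because the set is bounded and, without stateful iterators, would start empty on every initialization, a column can pass more than once. That is why the M/R job would still have to do the final unique step.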
> consider augmenting session state with "breadcrumbs"
> ----------------------------------------------------
>
> Key: ACCUMULO-625
> URL: https://issues.apache.org/jira/browse/ACCUMULO-625
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Reporter: Eric Newton
> Assignee: Keith Turner
>
> Presently, the iterator stack can be created and destroyed at the whim of the
> tserver and its buffering needs. In complex iterations, lower-level
> iterators can make significant progress which is not inherently obvious in
> any returned key. When the iterator stack is re-created to continue a query,
> the last key returned is used to {{seek()}} the iterators. Lower-level
> iterators must re-scan their data to move back to the old position.
> Consider a mechanism to save progress beyond the last key returned.
>