[jira] [Commented] (ACCUMULO-759) remove priority setting for scan-time iterators

Christopher Tubbs (JIRA) Tue, 11 Sep 2012 17:02:09 -0700

    [ 
https://issues.apache.org/jira/browse/ACCUMULO-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453566#comment-13453566
 ]


Christopher Tubbs commented on ACCUMULO-759:
--------------------------------------------

I see the value in treating the Scanner as an immutable view of a dataset 
within client code, without interference from per-table config. However, I think 
it would be a simple matter to subclass Scanner for this purpose. A Scanner 
scans over a data source; it is not strictly a dataset. I believe I spoke 
to Adam previously about creating such an API... one where you would manipulate 
a Query object representing a data source and then execute it. Perhaps 
that's still a reasonable option?

It would still be reasonable for Scanner to have built-in support for 
things like "after all per-table iterators". Perhaps priority isn't the best 
way to represent it, though? Keith and I talked about possibly creating an API 
where iterators are constructed more like:
{code:java}
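// Proposed API sketch only: IteratorChain and LAST do not exist yet.
IteratorSetting a, b, c; // assumed to be initialized elsewhere
IteratorChain chain = new IteratorChain();
chain.insertAfter(LAST, a);         // a goes after everything already in the chain
chain.insertBefore(a.getName(), b); // b is placed before a
chain.insertAfter(b.getName(), c);  // c is placed after b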
{code}
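
To make the ordering semantics of that idea concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: IteratorChain, LAST, insertAfter, insertBefore and names() are hypothetical and not part of the Accumulo API, and iterators are represented by plain name strings rather than IteratorSetting objects so that the example compiles without Accumulo on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the IteratorChain idea; not part of the Accumulo API. */
final class IteratorChain {
    /** Hypothetical sentinel anchor meaning "after everything currently in the chain". */
    static final String LAST = "__LAST__";

    // Iterator names in the order they would be applied.
    private final List<String> names = new ArrayList<>();

    /** Places name directly after anchor, or at the end when anchor is LAST. */
    IteratorChain insertAfter(String anchor, String name) {
        int index = LAST.equals(anchor) ? names.size() : indexOf(anchor) + 1;
        add(index, name);
        return this;
    }

    /** Places name directly before anchor. */
    IteratorChain insertBefore(String anchor, String name) {
        add(indexOf(anchor), name);
        return this;
    }

    /** Returns an immutable snapshot of the current order. */
    List<String> names() {
        return List.copyOf(names);
    }

    // Looks up an existing anchor and fails loudly if it is unknown.
    private int indexOf(String anchor) {
        int i = names.indexOf(anchor);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown iterator: " + anchor);
        }
        return i;
    }

    // Rejects duplicate names so that every anchor stays unambiguous.
    private void add(int index, String name) {
        if (names.contains(name)) {
            throw new IllegalArgumentException("Duplicate iterator: " + name);
        }
        names.add(index, name);
    }
}
```

Under these assumptions, the three calls in the snippet above would leave the chain ordered b, c, a: each iterator is positioned relative to a named neighbor instead of by an integer priority.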

One other thing to consider is that any change should probably be consistent 
across all APIs, including those pertaining to per-table configuration and 
things like the tableOperations.compact() method.
                
> remove priority setting for scan-time iterators
> -----------------------------------------------
>
>                 Key: ACCUMULO-759
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-759
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Adam Fuchs
>              Labels: newbie
>
> Iterators have a priority setting that allows a user to order iterators 
> arbitrarily. However, that priority is an integer that doesn't directly convey 
> the iterator's relationship to other iterators. I would postulate that nobody 
> has ever needed to sneak in a scan-time iterator underneath a configured 
> table iterator (please let me know if I'm wrong about this), and the effect 
> of doing so is not easy to calculate. Many people have chosen a bad iterator 
> priority and seen commutativity problems with previously configured iterators.
>
> I propose that we use more of an agglomerative approach to configuring 
> scan-time iterators, in which the order of the iterator tree is the same 
> order in which the addScanIterator method is called, and all scan-time 
> iterators apply after the configured iterators apply. The change to the API 
> should just be to remove the priority number, and the existing 
> IteratorSetting constructor and accessors should be deprecated.
>
> With this change, we can think of an iterator as more of a functional 
> modification to a data set, as in T' = f(T) or T'' = g(f(T)). This should 
> make it easier for developers to use iterators correctly.
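
The ordering proposed in the issue description can be read as plain function composition. The sketch below illustrates only that reading, under stated assumptions: OrderedScan is a hypothetical stand-in and not the Accumulo Scanner, each iterator is modeled as a UnaryOperator over the data set, and this addScanIterator takes no priority argument.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a scanner; not the Accumulo Scanner API. */
final class OrderedScan<T> {
    // Iterators configured on the table; these always apply first.
    private final List<UnaryOperator<T>> tableIterators;
    // Scan-time iterators, kept in the order addScanIterator was called.
    private final List<UnaryOperator<T>> scanIterators = new ArrayList<>();

    OrderedScan(List<UnaryOperator<T>> tableIterators) {
        this.tableIterators = List.copyOf(tableIterators);
    }

    /** Appends a scan-time iterator; call order is the only ordering input. */
    OrderedScan<T> addScanIterator(UnaryOperator<T> iterator) {
        scanIterators.add(iterator);
        return this;
    }

    /** Applies table iterators, then scan-time iterators, e.g. T'' = g(f(T)). */
    T apply(T data) {
        T result = data;
        for (UnaryOperator<T> f : tableIterators) {
            result = f.apply(result);
        }
        for (UnaryOperator<T> g : scanIterators) {
            result = g.apply(result);
        }
        return result;
    }
}
```

With a single table iterator f and scan-time iterators g and h added in that order, apply(T) computes h(g(f(T))), so a scan-time iterator can never end up underneath a configured table iterator.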

