[ 
https://issues.apache.org/jira/browse/ACCUMULO-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453309#comment-13453309
 ] 

Keith Turner edited comment on ACCUMULO-759 at 9/12/12 6:21 AM:
----------------------------------------------------------------

After reading all of the comments, now I am thinking of the following.

I would avoid restricting the priority for tablet iterators to < 1024.  From an 
API perspective I think that cleanest option may be to set a list.  With a 
method like appendScanIterator() that applies we need method like 
clearScanIterators().  I think using a boolean or negative number is a behind 
the scenes implementation detail.  When the API uses a list, it does not need 
to expose either.

{code:java}
//the purpose of this class is to allow user to configure scan time 
iterators... it does not allow the user to set the priority or name.... 
class ScanIteratorSetting extends IteratorSetting {

   //has all of IteratorSetting's constructors, but priority and name are never 
arguments e.g.
   public ScanIteratorSetting(String iteratorClass) {
     super(-42, "does not matter", iteratorClass, new HashMap<String,String>());
   }

   public void setPriority(int priority) {
     throw new UnsupportedOperationsException();
   }

   public void setName(String name) {
     throw new UnsupportedOperationsException();
   }
}
{code}

Maybe ScanIteratorSetting should not extend IteratorSetting.  Maybe 
IteratorSetting and ScanIteratorSetting should have a common parent?

The scanner would keep all of its current methods and add the following method.

{code:java}

interface ScannerBase {

   /**
    * This is a legacy method, you should probably use {@link 
setIterators(...)} or {@link setIterator(...)} which are much 
    * simpler.  This method allows you to insert scan iterators before 
iterators configured for the table which is not possible
    * with {@link setIterators(...)} or {@link setIterator(...)}.
    */
   public void addScanIterator(IteratorSetting cfg);
   public void removeScanIterator(String iteratorName);
   public void updateScanIteratorOption(String iteratorName, String key, String 
value);

   /**
    * Scan time iterators that will execute server side in the order given in 
the list after all iterators configured for the table.
    * 
    * This method will overwrite iterators previously set by {@link 
setIterators(...)} or {@link setIterator(...)}
    *
    * This method will have no effect on iterators set by {@link 
addScanIterator(...)}
    */
   void setIterators(List<ScanIteratorSetting> scanIterators);

   /**
    * A convenience method for setting a single scan time iterator that will 
execute after all iterators configured for the table.
    * 
    * This method will overwrite iterators previously set by {@link 
setIterators(...)} or {@link setIterator(...)}
    *
    * This method will have no effect on iterators set by {@link 
addScanIterator(...)}
    */
   void setIterator(ScanIteratorSetting scanIterator);

}

{code}

addScanIterator() should probably throw an expception if passed a 
ScanIteratorSetting
                
      was (Author: kturner):
    After reading all of the comments, now I am thinking of the following.

I would avoid restricting the priority for tablet iterators to < 1024.  From an 
API perspective I think that cleanest option may be to set a list.  With a 
method like appendScanIterator() that applies we need method like 
clearScanIterators().  I think using a boolean or negative number is a behind 
the scenes implementation detail.  When the API uses a list, it does not need 
to expose either.

{code:java}
//the purpose of this class is to allow user to configure scan time 
iterators... it does not allow the user to set the priority or name.... 
class ScanIteratorSetting extends IteratorSetting {

   //has all of IteratorSetting's constructors, but priority and name are never 
arguments e.g.
   public ScanIteratorSetting(String iteratorClass) {
     super(-42, "does not matter", iteratorClass, new HashMap<String,String>());
   }

   public void setPriority(int priority) {
     throw new UnsupportedOperationsException();
   }

   public void setName(String name) {
     throw new UnsupportedOperationsException();
   }
}
{code}

Maybe ScanIteratorSetting should not extend IteratorSetting.  Maybe 
IteratorSetting and ScanIteratorSetting should have a common parent?

The scanner would keep all of its current methods and add the following method.

{code:java}

interface ScannerBase {

   public void addScanIterator(IteratorSetting cfg);
   public void removeScanIterator(String iteratorName);
   public void updateScanIteratorOption(String iteratorName, String key, String 
value);

   /**
    * Scan time iterators that will execute server side in the order given in 
the list after all iterators configured for the table.
    * 
    * This method will overwrite iterators previously set by {@link 
setIterators(...)} or {@link setIterator(...)}
    *
    * This method will have no effect on iterators set by {@link 
addScanIterator(...)}
    */
   void setIterators(List<ScanIteratorSetting> scanIterators);

   /**
    * A convenience method for setting a single scan time iterator that will 
execute after all iterators configured for the table.
    * 
    * This method will overwrite iterators previously set by {@link 
setIterators(...)} or {@link setIterator(...)}
    *
    * This method will have no effect on iterators set by {@link 
addScanIterator(...)}
    */
   void setIterator(ScanIteratorSetting scanIterator);

}

{code}

addScanIterator() should probably throw an expception if passed a 
ScanIteratorSetting
                  
> remove priority setting for scan-time iterators
> -----------------------------------------------
>
>                 Key: ACCUMULO-759
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-759
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Adam Fuchs
>              Labels: newbie
>
> Iterators have a priority setting that allows a user to order iterators 
> arbitrarily. However that priority is an integer that doesn't directly convey 
> the iterator's relationship to other iterators. I would postulate that nobody 
> has ever needed to sneak in a scan-time iterator underneath a configured 
> table iterator (please let me know if I'm wrong about this), and the effect 
> of doing so is not easy to calculate. Many people have chosen a bad iterator 
> priority and seen commutativity problems with previously configured iterators.
> I propose that we use more of an agglomerative approach to configuring 
> scan-time iterators, in which the order of the iterator tree is the same 
> order in which the addScanIterator method is called, and all scan-time 
> iterators apply after the configured iterators apply. The change to the API 
> should just be to remove the priority number, and the existing 
> IteratorSetting constructor and accessors should be deprecated.
> With this change, we can think of an iterator as more of a functional 
> modification to a data set, as in T' = f(T) or T'' = g(f(T)). This should 
> make it easier for developers to use iterators correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to