milleruntime commented on a change in pull request #2347:
URL: https://github.com/apache/accumulo/pull/2347#discussion_r743731017



##########
File path: 
core/src/main/java/org/apache/accumulo/core/iterators/SortedMapIterator.java
##########
@@ -36,28 +37,40 @@
  *
  * Note that this class is intended as an in-memory replacement for 
RFile$Reader, so its behavior
  * reflects the same assumptions; namely, that this iterator is not 
responsible for respecting the
- * columnFamilies passed into seek().
+ * columnFamilies passed into seek(). If you want a Map-backed Iterator that 
returns only sought
+ * CFs, construct a new ColumnFamilySkippingIterator(new 
SortedMapIterator(map)).
+ *
+ * @see org.apache.accumulo.core.iterators.ColumnFamilySkippingIterator
  */
-public class SortedMapIterator implements SortedKeyValueIterator<Key,Value> {
+public class SortedMapIterator implements InterruptibleIterator {
   private Iterator<Entry<Key,Value>> iter;
   private Entry<Key,Value> entry;
 
   private SortedMap<Key,Value> map;
   private Range range;
 
+  private AtomicBoolean interruptFlag;

Review comment:
       We should be able to make this final with an initial value of `false`. I 
haven't really touched the iterators since most of them have been around since 
forever but I think this would be a good fix. FYI I had a hell of a time 
tracking down where we actually do the interrupt. As a reference, it seems like 
the only place we actually interrupt iterators is in `ScanDataSource`:
   
https://github.com/apache/accumulo/blob/8a636a3ba91f5dae1d8b09b095178889a7d79c1d/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java#L256-L258
   When a tablet is closed:
   
https://github.com/apache/accumulo/blob/b5ca15806659efbbd6f5d158be8648cc2cd13593/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java#L1288
   
   We have a ton of calls to pass the flag around between iterators, readers 
and RFile code but that seems to be the only place where the actual interrupt 
takes place.

##########
File path: 
core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilySkippingIterator.java
##########
@@ -16,7 +16,7 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-package org.apache.accumulo.core.iteratorsImpl.system;
+package org.apache.accumulo.core.iterators;

Review comment:
       That is what I had originally thought but after investigating this bug 
it seems like `SortedMapIterator` and this one could be cast to an 
InterruptibleIterator in the `LocalityGroupIterator`
   
https://github.com/apache/accumulo/blob/8d33bc4d489880f15950189d20287262ce46e68e/core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/LocalityGroupIterator.java#L49
   
   I think if a user extends one of them and configures it to run in certain 
situations it could be created through deepCopy(). But it also seems dangerous 
either way.

##########
File path: 
core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilySkippingIterator.java
##########
@@ -16,7 +16,7 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-package org.apache.accumulo.core.iteratorsImpl.system;
+package org.apache.accumulo.core.iterators;

Review comment:
       Also, my original intention in #1411 was since SortedMapIterator was 
used so much in our Tests that it would be useful to users as well. But I could 
be wrong in that thinking. It could be unsafe to expose it to users so a better 
option may be to move it in with the other system iterators. I can't remember 
if I attempted this or how difficult this would be though.

##########
File path: 
core/src/main/java/org/apache/accumulo/core/iterators/SortedMapIterator.java
##########
@@ -36,28 +37,40 @@
  *
  * Note that this class is intended as an in-memory replacement for 
RFile$Reader, so its behavior
  * reflects the same assumptions; namely, that this iterator is not 
responsible for respecting the
- * columnFamilies passed into seek().
+ * columnFamilies passed into seek(). If you want a Map-backed Iterator that 
returns only sought
+ * CFs, construct a new ColumnFamilySkippingIterator(new 
SortedMapIterator(map)).
+ *
+ * @see org.apache.accumulo.core.iterators.ColumnFamilySkippingIterator
  */
-public class SortedMapIterator implements SortedKeyValueIterator<Key,Value> {
+public class SortedMapIterator implements InterruptibleIterator {
   private Iterator<Entry<Key,Value>> iter;
   private Entry<Key,Value> entry;
 
   private SortedMap<Key,Value> map;
   private Range range;
 
+  private AtomicBoolean interruptFlag;

Review comment:
       I thought about changing the `AtomicBoolean` to be final but now I am 
wondering if it was written this way with performance in mind. We do a lot of 
passing around of the flag across the iterator stack but only ever set it in 
one place.  If we were to make it final, then I think that would be a lot of 
extra memory accesses to the atomic value vs just passing around a reference.
   
   I thought of doing something like this in the setter, but like I said, I 
think it would be less efficient:
   <pre>
   public void setInterruptFlag(AtomicBoolean flag) {
       boolean newFlag = flag.get();
       boolean previousFlag = this.interruptFlag.getAndSet(newFlag);
       if (previousFlag == newFlag && newFlag)
         log.warn("Interrupt called on iterator that was previously 
interrupted.");
     }
   </pre>




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to