milleruntime commented on a change in pull request #2347:
URL: https://github.com/apache/accumulo/pull/2347#discussion_r743731017
##########
File path:
core/src/main/java/org/apache/accumulo/core/iterators/SortedMapIterator.java
##########
@@ -36,28 +37,40 @@
*
* Note that this class is intended as an in-memory replacement for
RFile$Reader, so its behavior
* reflects the same assumptions; namely, that this iterator is not
responsible for respecting the
- * columnFamilies passed into seek().
+ * columnFamilies passed into seek(). If you want a Map-backed Iterator that
returns only sought
+ * CFs, construct a new ColumnFamilySkippingIterator(new
SortedMapIterator(map)).
+ *
+ * @see org.apache.accumulo.core.iterators.ColumnFamilySkippingIterator
*/
-public class SortedMapIterator implements SortedKeyValueIterator<Key,Value> {
+public class SortedMapIterator implements InterruptibleIterator {
private Iterator<Entry<Key,Value>> iter;
private Entry<Key,Value> entry;
private SortedMap<Key,Value> map;
private Range range;
+ private AtomicBoolean interruptFlag;
Review comment:
We should be able to make this final with an initial value of `false`. I
haven't really touched the iterators since most of them have been around since
forever but I think this would be a good fix. FYI I had a hell of a time
tracking down where we actually do the interrupt. As a reference, it seems like
the only place we actually interrupt iterators is in `ScanDataSource`:
https://github.com/apache/accumulo/blob/8a636a3ba91f5dae1d8b09b095178889a7d79c1d/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/ScanDataSource.java#L256-L258
When a tablet is closed:
https://github.com/apache/accumulo/blob/b5ca15806659efbbd6f5d158be8648cc2cd13593/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java#L1288
We have a ton of calls to pass the flag around between iterators, readers
and RFile code but that seems to be the only place where the actual interrupt
takes place.
##########
File path:
core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilySkippingIterator.java
##########
@@ -16,7 +16,7 @@
* specific language governing permissions and limitations
* under the License.
*/
-package org.apache.accumulo.core.iteratorsImpl.system;
+package org.apache.accumulo.core.iterators;
Review comment:
That is what I had originally thought but after investigating this bug
it seems like `SortedMapIterator` and this one could be cast to an
InterruptibleIterator in the `LocalityGroupIterator`
https://github.com/apache/accumulo/blob/8d33bc4d489880f15950189d20287262ce46e68e/core/src/main/java/org/apache/accumulo/core/iteratorsImpl/system/LocalityGroupIterator.java#L49
I think if a user extends one of them and configures it to run in certain
situations it could be created through deepCopy(). But it also seems dangerous
either way.
##########
File path:
core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilySkippingIterator.java
##########
@@ -16,7 +16,7 @@
* specific language governing permissions and limitations
* under the License.
*/
-package org.apache.accumulo.core.iteratorsImpl.system;
+package org.apache.accumulo.core.iterators;
Review comment:
Also, my original intention in #1411 was since SortedMapIterator was
used so much in our Tests that it would be useful to users as well. But I could
be wrong in that thinking. It could be unsafe to expose it to users so a better
option may be to move it in with the other system iterators. I can't remember
if I attempted this or how difficult this would be though.
##########
File path:
core/src/main/java/org/apache/accumulo/core/iterators/SortedMapIterator.java
##########
@@ -36,28 +37,40 @@
*
* Note that this class is intended as an in-memory replacement for
RFile$Reader, so its behavior
* reflects the same assumptions; namely, that this iterator is not
responsible for respecting the
- * columnFamilies passed into seek().
+ * columnFamilies passed into seek(). If you want a Map-backed Iterator that
returns only sought
+ * CFs, construct a new ColumnFamilySkippingIterator(new
SortedMapIterator(map)).
+ *
+ * @see org.apache.accumulo.core.iterators.ColumnFamilySkippingIterator
*/
-public class SortedMapIterator implements SortedKeyValueIterator<Key,Value> {
+public class SortedMapIterator implements InterruptibleIterator {
private Iterator<Entry<Key,Value>> iter;
private Entry<Key,Value> entry;
private SortedMap<Key,Value> map;
private Range range;
+ private AtomicBoolean interruptFlag;
Review comment:
I thought about changing the `AtomicBoolean` to be final but now I am
wondering if it was written this way with performance in mind. We do a lot of
passing around of the flag across the iterator stack but only ever set it in
one place. If we were to make it final, then I think that would be a lot of
extra memory accesses to the atomic value vs just passing around a reference.
I thought of doing something like this in the setter, but like I said, I
think it would be less efficient:
<pre>
public void setInterruptFlag(AtomicBoolean flag) {
boolean newFlag = flag.get();
boolean previousFlag = this.interruptFlag.getAndSet(newFlag);
if (previousFlag == newFlag && newFlag)
log.warn("Interrupt called on iterator that was previously
interrupted.");
}
</pre>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]