Github user joshelser commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/247#discussion_r110997152
--- Diff:
core/src/main/java/org/apache/accumulo/core/iterators/OrIterator.java ---
@@ -30,36 +33,66 @@
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
+import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
/**
* An iterator that handles "OR" query constructs on the server side. This code has been adapted/merged from Heap and Multi Iterators.
+ *
+ * The table structure should have the following form:
+ *
+ * <pre>
+ * row term:docId =&gt; value
+ * </pre>
+ *
+ * This Iterator will return a sorted iteration of docIDs for a given set of terms.
+ *
+ * For example, given the data and an OR'ing of "bob,steve":
+ *
+ * <pre>
+ * row1 bob:4
+ * row1 george:2
+ * row1 steve:3
+ * </pre>
+ *
+ * This Iterator will return:
+ *
+ * <pre>
+ * row1 bob:4
+ * row1 steve:3
+ * </pre>
*/
-public class OrIterator implements SortedKeyValueIterator<Key,Value> {
+public class OrIterator implements SortedKeyValueIterator<Key,Value>, OptionDescriber {
--- End diff --
> I assume this was done as an optimization.
No, it's the design of the iterator. It wouldn't function correctly if it
returned entries in anything other than docId-sorted order.
> Since it fully implements SKVI
Yes, it implements all the methods of SKVI (I mean, it wouldn't compile
otherwise). The issue is that it doesn't adhere to the runtime contract. While
iterating over any SKVI, the Keys should always be increasing in sort-order.
This iterator, by design, does not do that.
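
To make the contract point concrete, here is a minimal sketch rather than the actual OrIterator or Accumulo code. `Entry`, `KEY_ORDER`, and `isInSortOrder` are hypothetical names, and the sketch assumes the term is the column family and the docId is the column qualifier, so that entries compare by row, then term, then docId, roughly as a Key would.

```java
import java.util.Comparator;
import java.util.List;

public class SortContractSketch {

  // Hypothetical stand-in for a Key: row, column family (term), column qualifier (docId).
  static final class Entry {
    final String row;
    final String term;
    final String docId;

    Entry(String row, String term, String docId) {
      this.row = row;
      this.term = term;
      this.docId = docId;
    }
  }

  // Assumed ordering: row first, then term, then docId (an approximation of Key sort order).
  static final Comparator<Entry> KEY_ORDER =
      Comparator.comparing((Entry e) -> e.row)
          .thenComparing(e -> e.term)
          .thenComparing(e -> e.docId);

  // Returns true if no entry sorts before the entry that precedes it.
  static boolean isInSortOrder(List<Entry> entries) {
    for (int i = 1; i < entries.size(); i++) {
      if (KEY_ORDER.compare(entries.get(i - 1), entries.get(i)) > 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Term-then-docId order, as in the Javadoc example: bob:4, steve:3.
    List<Entry> keyOrder = List.of(
        new Entry("row1", "bob", "4"),
        new Entry("row1", "steve", "3"));

    // The same entries emitted by ascending docId: steve:3, bob:4.
    List<Entry> docIdOrder = List.of(
        new Entry("row1", "steve", "3"),
        new Entry("row1", "bob", "4"));

    System.out.println(isInSortOrder(keyOrder));   // prints "true"
    System.out.println(isInSortOrder(docIdOrder)); // prints "false"
  }
}
```

Under those assumptions, emitting the same two entries by ascending docId puts "steve" ahead of "bob", so the sequence moves backwards in Key order even though every SKVI method is implemented.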