Scan-time iterators returning out-of-order rows

Russ Weeks Wed, 01 Apr 2015 17:04:34 -0700

A wonderful property of scan-time iterators is that they can emit row IDs
in arbitrary order. Before I go off and build an index that relies on this
behaviour, I'd like to get a sense of how likely it is to exist in future
versions of Accumulo.


I'd like to build an index like this (hopefully the ASCII comes through; if
not, check here <https://gist.github.com/anonymous/1a64114da4b68a2ec822>):


 row   | cf  | cq                | val
-------------------------------------------------
 p0    | i   | (prop_a, 7, r15)  | 1
 p0    | i   | (prop_a, 8, r8)   | 1
 p0    | i   | (prop_a, 9, r19)  | 1
[...snip...]
 p0    | d   | (r8, prop_a)      | 8
 p0    | d   | (r8, prop_b)      | hello, world
 p0    | d   | (r15, prop_a)     | 7
 p0    | d   | (r15, prop_b)     | just testing
 p0    | d   | (r19, prop_a)     | 9
 p0    | d   | (r19, prop_b)     | something else

This is a pretty conventional partitioned index. I'd like to be able to
issue a query like, "Tell me about prop_b for all documents where prop_a <
9", but I'm pretty sure that the only way this could work at scale is if
it's OK for the iterator to return (p0, r15, prop_b, "just testing")
followed by (p0, r8, prop_b, "hello, world"), i.e. in the order the index
entries are visited rather than in document order.
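
To make the emit order concrete, here is a rough sketch of the query in plain
Python. It only simulates the single partition p0 from the table above and
does not use the Accumulo API; the function name query_partition and the
in-memory layout of the entries are assumptions made for illustration.

```python
# Plain-Python simulation of one partition (row p0); NOT the Accumulo API.
# Index entries (cf "i") are ordered by (property, value, doc_id).
# Document entries (cf "d") are keyed by (doc_id, property).
index_entries = [
    ("prop_a", 7, "r15"),
    ("prop_a", 8, "r8"),
    ("prop_a", 9, "r19"),
]
doc_entries = {
    ("r8", "prop_a"): "8",
    ("r8", "prop_b"): "hello, world",
    ("r15", "prop_a"): "7",
    ("r15", "prop_b"): "just testing",
    ("r19", "prop_a"): "9",
    ("r19", "prop_b"): "something else",
}


def query_partition(row, index, docs, filter_prop, upper_bound, fetch_prop):
    """Yield (row, doc_id, fetch_prop, value) where filter_prop < upper_bound.

    Hypothetical helper: walks the index in its own sort order and emits
    each matching document as soon as it is found, without re-sorting.
    """
    for prop, value, doc_id in sorted(index):
        if prop != filter_prop:
            continue
        if value >= upper_bound:
            break  # index is sorted by value, so nothing later can match
        yield (row, doc_id, fetch_prop, docs[(doc_id, fetch_prop)])


results = list(
    query_partition("p0", index_entries, doc_entries, "prop_a", 9, "prop_b")
)
for entry in results:
    print(entry)
```

The results come out following the index entries (r15 before r8), not the
order of the document entries in the table, which is exactly the behaviour
I'm asking about.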

This works today - if you folks see any flaws in my reasoning please let me
know - my question is, do you see this as functionality that should be
preserved in the future?

Thanks,
-Russ
