A wonderful property of scan-time iterators is that they can emit row IDs in arbitrary order. Before I go off and build an index that relies on this behaviour, I'd like to get a sense of how likely it is to exist in future versions of Accumulo.
I'd like to build an index like this (hopefully the ascii comes through, if not check here <https://gist.github.com/anonymous/1a64114da4b68a2ec822>):

row | cf | cq               | val
-------------------------------------------------
p0  | i  | (prop_a, 7, r15) | 1
p0  | i  | (prop_a, 8, r8)  | 1
p0  | i  | (prop_a, 9, r19) | 1
[...snip...]
p0  | d  | (r8, prop_a)     | 8
p0  | d  | (r8, prop_b)     | hello, world
p0  | d  | (r15, prop_a)    | 7
p0  | d  | (r15, prop_b)    | just testing
p0  | d  | (r19, prop_a)    | 9
p0  | d  | (r19, prop_b)    | something else

Which is a pretty conventional partitioned index. I'd like to be able to issue a query like, "Tell me about prop_b for all documents where prop_a < 9" but I'm pretty sure that the only way this could work at scale is if it's OK for the iterator to return (p0, r15, prop_b, "just testing") followed by (p0, r8, prop_b, "hello, world").

This works today - if you folks see any flaws in my reasoning please let me know - my question is, do you see this as functionality that should be preserved in the future?

Thanks,
-Russ
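P.S. To make the access pattern concrete, here is a rough sketch in plain Python. It is not Accumulo iterator code and uses no Accumulo API; the names (INDEX, DOCS, query) are made up for illustration, and the two containers just stand in for the "i" and "d" column families of partition p0. The point is only that results come out in index order (by prop_a value), not in the order the document entries are stored.

```python
# Hypothetical in-memory stand-in for the "i" column family of partition p0:
# each entry is (property, value, doc_id), mirroring the cq in the table.
INDEX = [
    ("prop_a", 7, "r15"),
    ("prop_a", 8, "r8"),
    ("prop_a", 9, "r19"),
]

# Hypothetical stand-in for the "d" column family: (doc_id, property) -> value.
DOCS = {
    ("r8", "prop_a"): 8,
    ("r8", "prop_b"): "hello, world",
    ("r15", "prop_a"): 7,
    ("r15", "prop_b"): "just testing",
    ("r19", "prop_a"): 9,
    ("r19", "prop_b"): "something else",
}


def query(index, docs, filter_prop, upper, fetch_prop, partition="p0"):
    """Yield (partition, doc_id, fetch_prop, value) for every document
    whose filter_prop is strictly less than upper.

    Results are emitted in index order (ascending filter_prop value),
    which is the out-of-order behaviour the question is about.
    """
    for prop, value, doc_id in sorted(index):
        if prop < filter_prop:
            continue  # not yet in the index range for this property
        if prop > filter_prop or value >= upper:
            break  # past the end of the matching index range
        yield (partition, doc_id, fetch_prop, docs[(doc_id, fetch_prop)])


results = list(query(INDEX, DOCS, "prop_a", 9, "prop_b"))
for r in results:
    print(r)
```

In the real thing the index walk and the document lookups would both happen inside the iterator on the tablet server, so each lookup is a seek within the same row; that is why the output order follows the index rather than the document IDs.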
