On Sun, Jul 1, 2012 at 11:57 PM, Sukant Hajra <[email protected]> wrote:
> Excerpts from Sukant Hajra's message of Thu Jun 28 15:49:11 -0500 2012:
>>
>> The Accumulo documentation alludes to the problem a little:
>>
>>     If the results are unordered this is quite effective as the first results
>>     to arrive are as good as any others to the user.
>>
>> In our case, order matters because we want the last results without pulling
>> in everything.
>
> Actually, I was just thinking about this a little.  I don't know if this is
> specified in the documentation, but is there /any/ reliable (deterministic)
> ordering for the values returned by intersecting iterators?

Unrelated to the intersecting iterator: when using the batch scanner, you
cannot expect results in order.  The batch scanner sends queries out to
tablet servers in parallel.  As batches of key/values are returned from the
tablet servers, they are immediately made available to the client.  Therefore
the client will iterate over interleaved key/values from different tablets.
The batch scanner is usually used with the intersecting iterator to
parallelize scans.  I think this is documented in the batch scanner Javadocs.

If the regular scanner were used, then the client would see the key/values in
the order returned by the iterator.  However, only one tablet would be
scanned at a time.

>
> If there is, would it be horribly ill-advised to rely on this ordering for
> application logic if we got clever with our schema?
>
> Also, if someone could reply with the exact algorithm for this ordering, it
> would help put less burden on us to reverse engineer and/or read the source
> code correctly.
>
> Thanks for your help,
> Sukant
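
To make the interleaving described in the reply concrete, here is a small toy
model in Python.  It is not Accumulo code and uses no Accumulo API: the
functions batch_scan and sequential_scan, the tablet contents, and the
round-robin arrival order are all assumptions made for illustration.  Real
arrival order depends on timing and is not deterministic.

```python
from itertools import zip_longest


def batches(sorted_keys, batch_size):
    """Split one tablet's sorted keys into fixed-size batches."""
    return [sorted_keys[i:i + batch_size]
            for i in range(0, len(sorted_keys), batch_size)]


def batch_scan(tablets, batch_size=2):
    """Toy parallel scan: batches from different tablets arrive interleaved.

    Round-robin arrival is an assumption that keeps the example
    deterministic; a real parallel scan gives no such guarantee.
    """
    per_tablet = [batches(keys, batch_size) for keys in tablets]
    result = []
    for arrival_round in zip_longest(*per_tablet, fillvalue=[]):
        for batch in arrival_round:
            result.extend(batch)
    return result


def sequential_scan(tablets):
    """Toy regular scan: one tablet at a time, in tablet order."""
    return [key for keys in tablets for key in keys]


# Hypothetical tablets, each holding a sorted, contiguous range of keys.
tablets = [
    ["a1", "a2", "a3", "a4"],
    ["m1", "m2", "m3"],
    ["t1", "t2"],
]

interleaved = batch_scan(tablets)
ordered = sequential_scan(tablets)

print(interleaved)  # keys from different tablets are mixed together
print(ordered)      # keys appear tablet by tablet
```

In this model each batch is internally ordered, but the combined stream is
not, so a client that needs a global order would have to collect the results
and sort them itself, or fall back to scanning one tablet at a time.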
