The BatchScanner works by fetching batches of key-value pairs from all tablets in the scan. Each batch is sorted internally, but batches from different tablets typically arrive interleaved, so the overall stream is not in sorted order. Because batch boundaries are determined solely by individual key-value pairs, a batch can end mid-row, and the next key you receive may belong to a completely different row, possibly also mid-row. If you need to guarantee that entire rows arrive together, you can use the WholeRowIterator.
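To make the interleaving concrete, here is a toy model in plain Java. It is only a sketch and uses no Accumulo classes: the class and method names are made up, the batch size of one entry is an assumption, and so is the round-robin arrival order (real arrival order depends on timing). It assumes rows A and B live in one tablet and row C in another. The whole-row mode only approximates the effect of the WholeRowIterator, which in reality packs each row into a single key-value pair on the tablet server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: no Accumulo classes are used, and all names here are hypothetical.
class BatchOrderSketch {

    // Row portion of a "row:cf:cq:value" string.
    static String rowOf(String kv) {
        return kv.substring(0, kv.indexOf(':'));
    }

    // Split one tablet's sorted entries into batches.
    // wholeRows == false: one entry per batch, so a batch may end mid-row.
    // wholeRows == true:  a batch ends only at a row boundary.
    static List<List<String>> batches(List<String> tablet, boolean wholeRows) {
        List<List<String>> out = new ArrayList<>();
        for (String kv : tablet) {
            boolean sameRow = !out.isEmpty()
                    && rowOf(out.get(out.size() - 1).get(0)).equals(rowOf(kv));
            if (wholeRows && sameRow) {
                out.get(out.size() - 1).add(kv);
            } else {
                List<String> batch = new ArrayList<>();
                batch.add(kv);
                out.add(batch);
            }
        }
        return out;
    }

    // Take one batch from each tablet in turn (assumed round-robin arrival order).
    static List<String> scan(List<List<String>> tablets, boolean wholeRows) {
        List<List<List<String>>> perTablet = new ArrayList<>();
        int rounds = 0;
        for (List<String> tablet : tablets) {
            List<List<String>> b = batches(tablet, wholeRows);
            perTablet.add(b);
            rounds = Math.max(rounds, b.size());
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < rounds; i++) {
            for (List<List<String>> b : perTablet) {
                if (i < b.size()) {
                    result.addAll(b.get(i));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> tablets = Arrays.asList(
                Arrays.asList("A:CF1:CQ1:1", "A:CF2:CQ2:2", "B:CF1:CQ1:1"),
                Arrays.asList("C:CF1:CQ1:1"));
        // Batches may end mid-row: row A is split by row C (Option 2).
        System.out.println(scan(tablets, false));
        // Batches end at row boundaries: row A stays together (Option 1).
        System.out.println(scan(tablets, true));
    }
}
```

With these inputs the first call yields the ordering from the Option 2 example and the second yields the ordering from the Option 1 example. In both cases the rows themselves still come back unsorted.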
tl;dr: Option 2 is accurate, but you can force Option 1 to occur.

On Fri, Oct 25, 2013 at 12:59 PM, Peter Rainer <[email protected]> wrote:

> Hi,
>
> in the BatchScanner JavaDoc it says "Also only use this *when you do not
> care about the returned data being in sorted order*. If you want to
> lookup a few ranges and expect those ranges to contain a lot of data, then
> use the Scanner instead. Also, the Scanner will return data in sorted
> order, this will not."
>
> I'm not a 100% sure how to interpret this, so I was wondering if anyone of
> you could help me clarify that:
>
> *Option 1)*
> Rows are not sorted, but all Key/Value Pairs with the same Row Key are in
> sequence
>
> Example:
> Format: Key:CF:CQ:Value
> A:CF1:CQ1:1
> A:CF2:CQ2:2
> C:CF1:CQ1:1
> B:CF1:CQ1:1
>
> *Option 2)*
> Rows are not sorted and not even Key/Value Pairs with the same Row Key are
> in sequence
>
> Example:
> Format: Key:CF:CQ:Value
> A:CF1:CQ1:1
> C:CF1:CQ1:1
> A:CF2:CQ2:2
> B:CF1:CQ1:1
>
> Thanks,
> Peter
