OK, got it. I missed HRegionServer.next() in the mix. It calls RegionScanner.next(results), and that uses the batch. Tricksy! I should have started on the client side instead.

Lars

On Fri, Nov 26, 2010 at 3:08 AM, Ryan Rawson <ryano...@gmail.com> wrote:
> No, batch size when limit is set is 1. You get partial results for a row,
> then get more from the same row. Then the next row.
> On Nov 25, 2010 4:54 PM, "Lars George" <lars.geo...@gmail.com> wrote:
>> Mkay, I will look into it more for the latter. But for the limit this is
>> still confusing to me, as limit == batch and on the client side that is the
>> number of rows, not the number of columns. Does that mean if I had 100
>> columns and set batch to 10 that it would only return 10 rows with 10
>> columns, and not what I would have expected, i.e. 10 rows with all columns?
>> Does this implicitly mean batch is also the intra-row batch size?
>>
>> Lars
>>
>> On Nov 25, 2010, at 21:53, Ryan Rawson <ryano...@gmail.com> wrote:
>>
>>> limit is for retrieving partial results of a row. Ie: give me a row
>>> in chunks. Filters that want to operate on the entire row cannot be
>>> used with this mode. i forget why it's in the loop but there was a
>>> good reason at the time.
>>>
>>> -ryan
>>>
>>> On Thu, Nov 25, 2010 at 10:51 AM, Lars George <lars.geo...@gmail.com> wrote:
>>>> Does hbase-dev still get forwarded? Did you see the below message?
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Lars George <lars.geo...@gmail.com>
>>>> Date: Tue, Nov 23, 2010 at 4:25 PM
>>>> Subject: HRegion.RegionScanner.nextInternal()
>>>> To: hbase-...@hadoop.apache.org
>>>>
>>>> Hi,
>>>>
>>>> I am officially confused:
>>>>
>>>>     byte [] nextRow;
>>>>     do {
>>>>       this.storeHeap.next(results, limit - results.size());
>>>>       if (limit > 0 && results.size() == limit) {
>>>>         if (this.filter != null && filter.hasFilterRow()) throw
>>>>           new IncompatibleFilterException(
>>>>             "Filter with filterRow(List<KeyValue>) incompatible with scan with limit!");
>>>>         return true; // we are expecting more yes, but also limited to how many we can return.
>>>>       }
>>>>     } while (Bytes.equals(currentRow, nextRow = peekRow()));
>>>>
>>>> This is from the nextInternal() call. Questions:
>>>>
>>>> a) Why is that check for the filter and limit both being set inside the
>>>> loop?
>>>>
>>>> b) If "limit" is the batch size (which for a Get is "-1", not "1" as I
>>>> would have thought), then what does that "limit - results.size()"
>>>> achieve?
>>>>
>>>> I mean, this loop gets all columns for a given row, so batch/limit
>>>> should not be handled here, right? What if limit were set to "1" by
>>>> the client? Then even if the Get had 3 columns to retrieve it would
>>>> not be able to, since this limit makes it bail out. So there would be
>>>> multiple calls to nextInternal() to complete what could be done in one
>>>> loop?
>>>>
>>>> Eh?
>>>>
>>>> Lars
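
For anyone reading this thread later, here is a minimal, self-contained sketch of the behavior Ryan describes: with a positive limit, each call returns at most that many cells from one row and never crosses into the next row, while a limit of -1 returns the whole row at once. This is not HBase code. The class BatchScanSketch, its methods sampleRows() and scan(), and the "rowN/colM" cell names are all made up for illustration, and the real scanner works on KeyValues and a store heap rather than on lists of strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of intra-row batching; none of these names exist in HBase.
class BatchScanSketch {

    // Builds `rows` rows, each holding `cols` cells named like "row1/col3".
    public static Map<String, List<String>> sampleRows(int rows, int cols) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        for (int r = 1; r <= rows; r++) {
            List<String> cells = new ArrayList<>();
            for (int c = 1; c <= cols; c++) {
                cells.add("row" + r + "/col" + c);
            }
            table.put("row" + r, cells);
        }
        return table;
    }

    // Returns one list per simulated next() call. A positive limit caps the
    // number of cells per call; a non-positive limit (e.g. -1) means "whole row".
    public static List<List<String>> scan(Map<String, List<String>> table, int limit) {
        List<List<String>> calls = new ArrayList<>();
        for (List<String> cells : table.values()) {
            int pos = 0;
            while (pos < cells.size()) {
                // Stop at the limit or at the end of the row, whichever comes first,
                // so a chunk never mixes cells from two rows.
                int end = limit > 0 ? Math.min(pos + limit, cells.size()) : cells.size();
                calls.add(new ArrayList<>(cells.subList(pos, end)));
                pos = end;
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = sampleRows(2, 5);
        for (List<String> chunk : scan(table, 2)) {
            System.out.println(chunk);
        }
    }
}
```

With two rows of five columns and a limit of 2, the sketch produces six chunks (2, 2 and 1 cells for each row), which matches "partial results for a row, then more from the same row, then the next row". With a limit of -1 it produces two chunks of five cells each.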