Yes, in this case 'batch' and 'limit' refer to how many cells to return
at a time within a row.  The 'scanner caching' comes across in the
next(int) argument, which can change on a per-call basis (although the
HTable API doesn't quite allow it).
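
To make the intra-row chunking concrete, here is a minimal sketch in plain
Java. It is a simulation, not HBase code: the class and method names are
hypothetical, and cells are stood in for by strings. It only shows how one
row's cells would be split across successive next() calls for a given
batch/limit, with a non-positive limit taken to mean "return the whole row".

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of intra-row batching. Nothing here is HBase API;
// RowBatchSketch and chunkRow are hypothetical names used for illustration.
class RowBatchSketch {

    /**
     * Splits one row's cells into the chunks that successive next() calls
     * would hand back. A limit <= 0 is treated as "no batching", so the
     * whole row comes back as a single chunk.
     */
    static List<List<String>> chunkRow(List<String> cells, int limit) {
        List<List<String>> chunks = new ArrayList<>();
        if (limit <= 0) {
            chunks.add(new ArrayList<>(cells));
            return chunks;
        }
        for (int start = 0; start < cells.size(); start += limit) {
            int end = Math.min(start + limit, cells.size());
            chunks.add(new ArrayList<>(cells.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // One row with 100 columns, as in the question further down the thread.
        List<String> row = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            row.add("col" + i);
        }
        // With a batch of 10 the row arrives as 10 partial results of 10 cells.
        System.out.println(chunkRow(row, 10).size());
        // With no batch set (-1) the row arrives in one piece.
        System.out.println(chunkRow(row, -1).size());
    }
}
```

So with 100 columns and a batch of 10, a single row takes ten partial
results to come back in full, and only then does the scanner move on to
the next row.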

-ryan

On Fri, Nov 26, 2010 at 3:12 AM, Lars George <lars.geo...@gmail.com> wrote:
> OK, got it. I missed the HRegionServer.next() in the mix. It calls
> the RegionScanner.next(results) and that uses the batch. Tricksy! I
> should have started on the client side instead.
>
> Lars
>
> On Fri, Nov 26, 2010 at 3:08 AM, Ryan Rawson <ryano...@gmail.com> wrote:
>> No, batch size when limit is set is 1. You get partial results for a row,
>> then get more from the same row. Then the next row.
>> On Nov 25, 2010 4:54 PM, "Lars George" <lars.geo...@gmail.com> wrote:
>>> Mkay, I will look into it more for the latter. But for the limit this is
>>> still confusing to me, as limit == batch and that is, on the client side, the
>>> number of rows, but not the number of columns. Does that mean if I had 100
>>> columns and set batch to 10 that it would only return 10 rows with 10
>>> columns, but not what I would have expected, i.e. 10 rows with all columns?
>>> Does this implicitly mean batch is also the intra-row batch size?
>>>
>>> Lars
>>>
>>> On Nov 25, 2010, at 21:53, Ryan Rawson <ryano...@gmail.com> wrote:
>>>
>>>> limit is for retrieving partial results of a row, i.e. give me a row
>>>> in chunks. Filters that want to operate on the entire row cannot be
>>>> used with this mode. I forget why it's in the loop, but there was a
>>>> good reason at the time.
>>>>
>>>> -ryan
>>>>
>>>> On Thu, Nov 25, 2010 at 10:51 AM, Lars George <lars.geo...@gmail.com> wrote:
>>>>> Does hbase-dev still get forwarded? Did you see the below message?
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Lars George <lars.geo...@gmail.com>
>>>>> Date: Tue, Nov 23, 2010 at 4:25 PM
>>>>> Subject: HRegion.RegionScanner.nextInternal()
>>>>> To: hbase-...@hadoop.apache.org
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am officially confused:
>>>>>
>>>>> byte [] nextRow;
>>>>> do {
>>>>>   this.storeHeap.next(results, limit - results.size());
>>>>>   if (limit > 0 && results.size() == limit) {
>>>>>     if (this.filter != null && filter.hasFilterRow()) throw
>>>>>       new IncompatibleFilterException(
>>>>>         "Filter with filterRow(List<KeyValue>) incompatible " +
>>>>>         "with scan with limit!");
>>>>>     return true; // we are expecting more yes, but also
>>>>>                  // limited to how many we can return.
>>>>>   }
>>>>> } while (Bytes.equals(currentRow, nextRow = peekRow()));
>>>>>
>>>>> This is from the nextInternal() call. Questions:
>>>>>
>>>>> a) Why is that check for the filter and limit both being set inside
>>>>> the loop?
>>>>>
>>>>> b) If "limit" is the batch size (which for a Get is "-1", not "1" as I
>>>>> would have thought) then what does that "limit - results.size()"
>>>>> achieve?
>>>>>
>>>>> I mean, this loop gets all columns for a given row, so batch/limit
>>>>> should not be handled here, right? What if limit were set to "1" by
>>>>> the client? Then even if the Get had 3 columns to retrieve it would
>>>>> not be able to since this limit makes it bail out. So there would be
>>>>> multiple calls to nextInternal() to complete what could be done in one
>>>>> loop?
>>>>>
>>>>> Eh?
>>>>>
>>>>> Lars
>>>>>
>>
>
