Re: Row filters

Ryan Rawson Mon, 15 Jun 2009 01:58:04 -0700

And let me follow up a bit...

The best configuration for an m-r job is to have the number of map tasks
equal the number of regions in the table.  A scanner can iterate across
regions, but once the table gets really big, in my experience a 1:1
correspondence between map tasks and regions works best and is more
reliable as well.
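
To make the 1:1 idea concrete, here is a minimal sketch of deriving one
row range per region, so that each map task scans exactly one region.  It
does not use the HBase API: the class RegionSplits, the Split type, and the
sample row keys are hypothetical, and it assumes the region start keys are
already sorted, with the empty string standing for "unbounded".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not HBase client code.
class RegionSplits {

    /** Half-open row range [startRow, stopRow); "" means unbounded. */
    static final class Split {
        final String startRow;
        final String stopRow;

        Split(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /**
     * Builds one split per region from the sorted region start keys.
     * Each split ends where the next region begins; the last split is
     * open-ended.
     */
    static List<Split> splitsPerRegion(List<String> regionStartKeys) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String start = regionStartKeys.get(i);
            String stop = (i + 1 < regionStartKeys.size())
                    ? regionStartKeys.get(i + 1)
                    : "";
            splits.add(new Split(start, stop));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed start keys for a table with three regions.
        List<String> startKeys = List.of("", "row-3000", "row-6000");
        for (Split s : splitsPerRegion(startKeys)) {
            System.out.println("[" + s.startRow + ", " + s.stopRow + ")");
        }
    }
}
```

Each range would then back one map task, using the start row and stop row
to bound that task's scan to a single region.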


-ryan

On Mon, Jun 15, 2009 at 1:55 AM, Ryan Rawson <[email protected]> wrote:

> Hey,
>
> The client-side scanner code already will move it to the next region when
> it hits the end of a region.
>
> -ryan
>
>
>
> On Mon, Jun 15, 2009 at 1:52 AM, Piotr Praczyk <[email protected]> wrote:
>
>> 2009/6/12 stack <[email protected]>
>>
>> > On Fri, Jun 12, 2009 at 8:41 AM, Erik Holstad <[email protected]>
>> > wrote:
>> >
>> > > ...
>> > > not really sure how this
>> > > was done in 0.19 and earlier.
>> >
>> >
>> > There's a stoprow filter in 0.19.x and earlier.  There is also a
>> > getScanner override that takes a start and stop row in 0.19.x (under
>> > the covers it uses a stop row filter -- check the client source).
>> > St>Ack
>> >
>>
>> Thanks :-) It was very helpful.
>> Do you know if there is any standard Scanner that can iterate over more
>> than one table fragment? [When one chunk finishes, it jumps to the
>> beginning of another.] Or should I rather implement it myself?
>>
>>
>> Piotr
>>
>
>
