On Thu, Mar 25, 2010 at 11:30 PM, Karthik K <[email protected]> wrote:

>
>
> On Thu, Mar 25, 2010 at 8:03 PM, Andriy Kolyadenko <[email protected]> wrote:
>
>> My task is the following: I have a list of key ranges and I need to run
>> MR over these ranges as fast as possible.
>
>
>
>> As far as I understand, MR will do a full scan if I use a filter. Is that
>> correct?
>
>
> On a given InputSplit, yes.
>
> But see HBASE-2302, where you can inherit from TableInputFormat and
> override a method to reduce the number of InputSplits.
> That significantly reduces the overhead of the bulk scan and restricts
> your filter to only those InputSplits that pass the criteria.
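
The region-pruning idea above can be sketched as a small predicate. This is a plain Python model of the logic only, not HBase code: the names `overlaps` and `prune_regions` are hypothetical, and it assumes regions and wanted ranges are half-open `[start, stop)` byte ranges where an empty value means "unbounded".

```python
from typing import List, Tuple

# Half-open key range [start, stop). b"" means "unbounded" on that side
# (an assumption of this sketch).
KeyRange = Tuple[bytes, bytes]


def overlaps(region: KeyRange, wanted: KeyRange) -> bool:
    """Return True if the two half-open ranges share at least one key."""
    r_start, r_stop = region
    w_start, w_stop = wanted
    # The region ends at or before the wanted range begins.
    if r_stop != b"" and w_start != b"" and r_stop <= w_start:
        return False
    # The wanted range ends at or before the region begins.
    if w_stop != b"" and r_start != b"" and w_stop <= r_start:
        return False
    return True


def prune_regions(regions: List[KeyRange],
                  wanted: List[KeyRange]) -> List[KeyRange]:
    """Keep only the regions that overlap at least one wanted range."""
    return [r for r in regions if any(overlaps(r, w) for w in wanted)]
```

A subclass of the input format would apply a check of this kind per region, so that regions overlapping none of the ranges never become InputSplits.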
>
>
>
>>
>>
>>
>> --- [email protected] wrote:
>>
>> From: Stack <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Multi ranges Scan
>> Date: Thu, 25 Mar 2010 19:57:44 -0700
>>
>> Can you use a filter to do this?  If no pattern to the excludes then
>> it's tougher. How do you know what to exclude?   It's in a repository
>> somewhere?  Add a filter to query this repo?
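
As a rough illustration of the filter approach, here is a minimal sketch of a row-level check against several ranges. It is plain Python, not an HBase Filter implementation: `make_range_filter` is a hypothetical helper, and it assumes the ranges are disjoint, half-open `[start, stop)` and bounded on both sides.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def make_range_filter(
    ranges: List[Tuple[bytes, bytes]]
) -> Callable[[bytes], bool]:
    """Build a predicate that accepts row keys inside any of the ranges.

    Assumes the ranges are disjoint and half-open: [start, stop).
    """
    ordered = sorted(ranges)
    starts = [start for start, _ in ordered]

    def accept(row_key: bytes) -> bool:
        # Find the last range whose start key is <= row_key.
        i = bisect_right(starts, row_key) - 1
        # Accept only if such a range exists and row_key is before its stop.
        return i >= 0 and row_key < ordered[i][1]

    return accept


# Keys 1..5 from the example table in this thread, ranges [1,2) and [4,5).
accept = make_range_filter([(b"4", b"5"), (b"1", b"2")])
kept = [k for k in [b"1", b"2", b"3", b"4", b"5"] if accept(k)]
```

With the example table, `kept` holds keys 1 and 4. Note that a check like this still visits every row of the split it runs on; it only decides which rows reach the mapper.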
>>
>>
>>
>> On Mar 25, 2010, at 4:07 PM, "Andriy Kolyadenko" <[email protected]> wrote:
>>
>> > OK, that would work for region pruning. But what about pruning the
>> > actual rows inside a single region? Do you have any ideas how to
>> > implement it?
>> >
>> > --- Stack wrote: ---
>> >
>> > I think you need to make a custom splitter for your mapreduce job, one
>> > that makes splits that align with the ranges you'd have your job run
>> > over.   A permutation on HBASE-2302 might work for you.
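
One way to picture splits that align with the ranges is to intersect every region with every wanted range and emit one split per non-empty intersection. The sketch below is plain Python under stated assumptions, not the HBASE-2302 patch: `clip` and `aligned_splits` are hypothetical names, ranges are half-open `[start, stop)`, and an empty value means "unbounded".

```python
from typing import List, Optional, Tuple

# Half-open key range [start, stop). b"" means "unbounded" on that side
# (an assumption of this sketch).
KeyRange = Tuple[bytes, bytes]


def clip(region: KeyRange, wanted: KeyRange) -> Optional[KeyRange]:
    """Intersect two half-open ranges; return None if they do not overlap."""
    r_start, r_stop = region
    w_start, w_stop = wanted
    # b"" sorts before every other bytes value, so max() handles an
    # unbounded start key without a special case.
    start = max(r_start, w_start)
    # For stop keys b"" means "no upper bound", so it needs explicit handling.
    if r_stop == b"":
        stop = w_stop
    elif w_stop == b"":
        stop = r_stop
    else:
        stop = min(r_stop, w_stop)
    if stop != b"" and start >= stop:
        return None
    return (start, stop)


def aligned_splits(regions: List[KeyRange],
                   wanted: List[KeyRange]) -> List[KeyRange]:
    """One split per (region, wanted range) pair that actually overlaps."""
    splits = []
    for region in regions:
        for w in wanted:
            piece = clip(region, w)
            if piece is not None:
                splits.append(piece)
    return splits
```

Because each split carries its own start and stop key, the scan behind it reads only rows inside a wanted range, which also covers pruning rows inside a single region.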
>>
>
Oops. Sorry for the redundant info!



>> >
>> > St.Ack
>> >
>> > On Wed, Mar 17, 2010 at 1:32 PM, Andrey Kolyadenko
>> > <[email protected]> wrote:
>> >> Hi all,
>> >>
>> >> maybe somebody could give me advice in the following situation:
>> >>
>> >> Currently the HBase Scan interface lets you set only the first and
>> >> last rows for MR scanning. Is there any way to get multiple ranges
>> >> into the map input?
>> >>
>> >> For example let's assume I have following table:
>> >> key value
>> >> 1   v1
>> >> 2   v2
>> >> 3   v3
>> >> 4   v4
>> >> 5   v5
>> >>
>> >> What I need is to get, for example, the [1,2) and [4,5) ranges as
>> >> input for my Map task. I need this as a performance optimization.
>> >>
>> >> Any advice?
>> >>
>> >> Thanks.
>> >
>> >
>>
>>
>>
>>
>>
>
>
