My task is the following: I have a list of key ranges and I need to run MR over these ranges as fast as possible. As far as I understand, MR will do a full scan if I use a filter. Is that correct?
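To make the splitter idea from earlier in this thread concrete, here is a minimal sketch of how splits could be computed so that each one covers only the rows of a wanted range inside one region. It is plain Java and does not call any HBase API. The names `RangeSplitSketch`, `KeyRange` and `splitsFor` are hypothetical, keys are modelled as Strings instead of byte arrays, and an empty end key is assumed to mean "up to the end of the table". A real job would presumably do the same intersection inside a custom input format and give each split its own start and stop row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RangeSplitSketch {

    /** Half-open key range [start, end); an empty end is assumed to mean "unbounded". */
    static final class KeyRange {
        final String start;
        final String end;

        KeyRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    /** Larger of two start keys; an empty start key sorts first on its own. */
    static String maxStart(String a, String b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    /** Smaller of two end keys, treating the empty string as "no upper bound". */
    static String minEnd(String a, String b) {
        if (a.isEmpty()) return b;
        if (b.isEmpty()) return a;
        return a.compareTo(b) <= 0 ? a : b;
    }

    /**
     * Intersects every region with every wanted range. Each non-empty
     * intersection becomes one split, so regions that hold no wanted rows
     * produce no split, and a split never extends past its wanted range.
     */
    static List<KeyRange> splitsFor(List<KeyRange> regions, List<KeyRange> wanted) {
        List<KeyRange> splits = new ArrayList<>();
        for (KeyRange region : regions) {
            for (KeyRange range : wanted) {
                String start = maxStart(region.start, range.start);
                String end = minEnd(region.end, range.end);
                if (end.isEmpty() || start.compareTo(end) < 0) {
                    splits.add(new KeyRange(start, end));
                }
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Two made-up regions: everything below "3", and "3" onwards.
        List<KeyRange> regions = Arrays.asList(new KeyRange("", "3"), new KeyRange("3", ""));
        // The ranges from the original question: [1,2) and [4,5).
        List<KeyRange> wanted = Arrays.asList(new KeyRange("1", "2"), new KeyRange("4", "5"));
        System.out.println(splitsFor(regions, wanted)); // prints [[1,2), [4,5)]
    }
}
```

With splits built this way, each map task would scan only its own [start, end) interval, so rows outside the wanted ranges would not be read at all. The nested loop is quadratic; with many ranges, sorting both lists and merging them would be the obvious refinement.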

--- [email protected] wrote:

From: Stack <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: Multi ranges Scan
Date: Thu, 25 Mar 2010 19:57:44 -0700

Can you use a filter to do this? If no pattern to the excludes then it's tougher. How do you know what to exclude? It's in a repository somewhere? Add a filter to query this repo?

On Mar 25, 2010, at 4:07 PM, "Andriy Kolyadenko" <[email protected]> wrote:

> Ok, it would work for regions pruning. And what about actual rows
> pruning inside single region? Do you have any ideas how to implement
> it?
>
> --- Stack wrote: ---
>
> I think you need to make a custom splitter for your mapreduce job, one
> that makes splits that align with the ranges you'd have your job run
> over. A permutation on HBASE-2302 might work for you.
>
> St.Ack
>
> On Wed, Mar 17, 2010 at 1:32 PM, Andrey Kolyadenko
> <[email protected]> wrote:
>> Hi all,
>>
>> maybe somebody could give me advice in the following situation:
>>
>> Currently HBase Scan interface provides ability to set up only first and
>> last rows for MR scanning. Is it any way to get multiple ranges into the
>> map input?
>>
>> For example let's assume I have following table:
>> key value
>> 1   v1
>> 2   v2
>> 3   v3
>> 4   v4
>> 5   v5
>>
>> What I need is to get for example [1,2) and [4,5) ranges as input for my
>> Map task. Actually I need this for the performance optimization.
>>
>> Any advice?
>>
>> Thanks.
>
> _____________________________________________________________
> Sign up for your free SaturnFans email account at
> http://webmail.saturnfans.com/
