Hi Tim,
       I made a class which extends TableInputFormatBase and sets the HTable,
the input columns and the row filter, but I don't know how to set that class
as the input format of my map reduce program.
Currently I am using TableMapReduceUtil to set my table name and
column families as the input to my map class.
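
To make the idea concrete, here is a minimal plain-Java sketch of what the row filter is meant to achieve: rows are filtered on the input side, so the map only sees the rows that match. It deliberately uses no HBase or Hadoop classes. `Row`, `scan`, `map`, `columnEquals` and `sampleTable` are hypothetical stand-ins made up for this illustration, not part of any library, and the sample data is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FilterBeforeMapSketch {

    /** Hypothetical stand-in for one table row: a row key plus column -> value. */
    public static final class Row {
        public final String key;
        public final Map<String, String> columns;

        public Row(String key, Map<String, String> columns) {
            this.key = key;
            this.columns = columns;
        }
    }

    /** Filter that accepts rows whose given column holds exactly the given value. */
    public static Predicate<Row> columnEquals(String column, String value) {
        return row -> value.equals(row.columns.get(column));
    }

    /** Stand-in for the input side: only rows accepted by the filter are handed on. */
    public static List<Row> scan(List<Row> table, Predicate<Row> rowFilter) {
        List<Row> accepted = new ArrayList<>();
        for (Row row : table) {
            if (rowFilter.test(row)) {
                accepted.add(row);
            }
        }
        return accepted;
    }

    /** Stand-in for the map step: visits every row it receives and records its key. */
    public static List<String> map(List<Row> input) {
        List<String> processedKeys = new ArrayList<>();
        for (Row row : input) {
            processedKeys.add(row.key);
        }
        return processedKeys;
    }

    /** Small made-up table used only for this illustration. */
    public static List<Row> sampleTable() {
        List<Row> table = new ArrayList<>();
        table.add(new Row("row1", Map.of("a", "xxx")));
        table.add(new Row("row2", Map.of("a", "yyy")));
        table.add(new Row("row3", Map.of("a", "xxx")));
        table.add(new Row("row4", Map.of("b", "xxx")));
        return table;
    }

    public static void main(String[] args) {
        List<Row> table = sampleTable();
        // Without a filter the map step has to look at every row.
        List<String> unfiltered = map(scan(table, row -> true));
        // With the filter applied on the input side, unwanted rows never reach the map.
        List<String> filtered = map(scan(table, columnEquals("a", "xxx")));
        System.out.println("map calls without filter: " + unfiltered.size());
        System.out.println("map calls with filter:    " + filtered.size());
        System.out.println("rows processed:           " + filtered);
    }
}
```

In this toy table the unfiltered run hands all four rows to the map, while the filtered run hands over only the two rows where column a is xxx. How the custom input format class gets registered with the real job is exactly the open question and is not shown here.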

On Tue, Apr 7, 2009 at 3:11 PM, tim robertson <[email protected]> wrote:

> I am a newbie, but...
>
> I think it will boil down to something looking at the column and
> applying the filter.  Without reworking the model or adding some
> kind of index, I don't think you would get around this.
>
> Why not set a RowFilter on the TableInputFormat, so that rows are
> filtered before your map?  I presume this would be more efficient than
> shuffling all the data through the task tracking of Hadoop MR.
>
> Cheers
>
> Tim
>
>
>
> On Tue, Apr 7, 2009 at 11:26 AM, Rakhi Khatwani
> <[email protected]> wrote:
> > Hi,
> >     I have a map reduce program which reads from an HBase table.
> > In my map I check whether the value of column a is xxx; if yes, I
> > continue with processing, else I skip the row.
> > However, if my table is really big, most of the time in the map gets
> > wasted on processing unwanted rows.
> > Is there any way we could send only a subset of rows (based on the
> > value of a particular column family) to the map?
> >
> > I have also gone through TableInputFormatBase, but am not able to figure
> > out how to set the input format if we are using the TableMapReduceUtil
> > class to initialize table map jobs. Or is there any other way I could
> > use it?
> >
> > Thanks in Advance,
> > Raakhi.
> >
>
