Re: How to get specified rows and avoid full table scanning?

Jean-Marc Spaggiari Mon, 21 Apr 2014 09:22:11 -0700

Hi Tao,

Also, if you are thinking about time series, you can take a look at OpenTSDB:
http://opentsdb.net/
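
For the date-prefixed rowkey from the original question, here is a minimal sketch of the general salting idea: put a small bucket number in front of the date so that one day's writes spread over several regions, then read a range of days back with one start/stop row pair per bucket. This is not HBaseWD's actual API, and the key layout `<bucket>|<yyyymmdd>|<id>`, the bucket count, and both function names are assumptions made up for illustration.

```python
import zlib
from datetime import date, timedelta

# Hypothetical bucket count; the two-digit prefix below assumes at most 100.
NUM_BUCKETS = 16


def salted_key(day: date, record_id: str, buckets: int = NUM_BUCKETS) -> bytes:
    """Build a row key of the assumed form <bucket>|<yyyymmdd>|<id>."""
    # A stable hash of the record id picks the bucket, so rows written on
    # the same day are spread over `buckets` different key ranges.
    bucket = zlib.crc32(record_id.encode("utf-8")) % buckets
    return f"{bucket:02d}|{day:%Y%m%d}|{record_id}".encode("utf-8")


def scan_ranges(first_day: date, last_day: date, buckets: int = NUM_BUCKETS):
    """Return one (start_row, stop_row) pair per bucket for the given days.

    start_row is inclusive and stop_row is exclusive, so the stop row uses
    the day after last_day.
    """
    stop_day = last_day + timedelta(days=1)
    return [
        (
            f"{b:02d}|{first_day:%Y%m%d}".encode("utf-8"),
            f"{b:02d}|{stop_day:%Y%m%d}".encode("utf-8"),
        )
        for b in range(buckets)
    ]


if __name__ == "__main__":
    # Three days of data become NUM_BUCKETS short range scans.
    for start_row, stop_row in scan_ranges(date(2014, 4, 19), date(2014, 4, 21)):
        print(start_row, stop_row)
```

Each pair would become the start and stop row of one scan, so the job reads only the selected days instead of the whole table. The trade-off is that every read fans out into one range per bucket.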


JM


2014-04-21 11:56 GMT-04:00 Ted Yu <[email protected]>:

> There are several alternatives.
> One of them is HBaseWD:
>
> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
>
> You can also take a look at Phoenix.
>
> Cheers
>
>
> On Mon, Apr 21, 2014 at 8:04 AM, Tao Xiao <[email protected]>
> wrote:
>
> > I have a big table, and rows will be added to it each day. I want to
> > run a MapReduce job over this table and select the rows of several days
> > as the job's input data. How can I achieve this?
> >
> > If I prefix the rowkey with the date, I can easily select one day's data
> > as the job's input, but this causes a hot spot problem: hundreds of
> > millions of rows will be added to this table each day, and the data will
> > probably go to a single region server. A secondary index would be good
> > for queries but not for a batch processing job.
> >
> > Are there any other ways?
> >
> > Are there any other frameworks which can achieve this goal more easily?
> > Shark? Stinger? HSearch?
> >
>
