Re: How to get specified rows and avoid full table scanning?

Ted Yu Mon, 21 Apr 2014 08:57:18 -0700

There are several alternatives.
One of them is HBaseWD:
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
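
To make the idea concrete, here is a minimal sketch of the bucket-prefix approach that post describes. It is plain Python rather than HBaseWD's actual API. The bucket count, the key layout (bucket|yyyymmdd|id) and the function names are all assumptions made up for illustration. Writes for one day spread over every bucket, and reading a span of days becomes one short range scan per bucket instead of a full table scan:

```python
import hashlib
from datetime import date, timedelta

NUM_BUCKETS = 16  # assumed bucket count; pick one that suits the cluster


def salted_key(day: date, record_id: str) -> bytes:
    """Build a row key of the (hypothetical) form b'<bucket>|<yyyymmdd>|<id>'."""
    # Hash the record id so that one day's writes spread over all buckets.
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % NUM_BUCKETS
    return f"{bucket:02d}|{day:%Y%m%d}|{record_id}".encode("utf-8")


def scan_ranges(first_day: date, last_day: date) -> list:
    """Return one [start, stop) key range per bucket covering the given days."""
    # The stop key is exclusive, so use the day after last_day.
    stop_day = last_day + timedelta(days=1)
    return [
        (
            f"{bucket:02d}|{first_day:%Y%m%d}".encode("utf-8"),
            f"{bucket:02d}|{stop_day:%Y%m%d}".encode("utf-8"),
        )
        for bucket in range(NUM_BUCKETS)
    ]


if __name__ == "__main__":
    print(salted_key(date(2014, 4, 21), "row-1"))
    for start, stop in scan_ranges(date(2014, 4, 21), date(2014, 4, 23)):
        print(start, stop)
```

Each (start, stop) pair would then be the start and stop row of one scan over the table, and the job reads the union of those scans.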


You can also take a look at Phoenix.

Cheers


On Mon, Apr 21, 2014 at 8:04 AM, Tao Xiao <[email protected]> wrote:

> I have a big table, and rows are added to it each day. I want to run a
> MapReduce job over this table and select the rows from several days as the
> job's input data. How can I achieve this?
>
> If I prefix the rowkey with the date, I can easily select one day's data as
> the job's input, but this will cause a hot-spot problem, because hundreds of
> millions of rows will be added to this table each day and the data will
> probably go to a single region server. A secondary index would be good for
> queries but not for a batch processing job.
>
> Are there any other ways?
>
> Are there any other frameworks which can achieve this goal more easily?
> Shark? Stinger? HSearch?
>
