Hi,

Thanks for the suggestions. How can a time-range scan be implemented in Java code? Is there any sample code or a tutorial? Also, is it possible to select by the value of a column? Let's say I know that records have family f and column m, and the new records have m=5. I need to instruct HBase to send only those records to the mapper of the mapred job.
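[For reference, a time-range scan combined with a column-value filter might look like the following sketch. It assumes the 0.92-era HBase client API; the table name "mytable" and the encoding of m as a long are assumptions, not from the thread, and would need a cluster plus the HBase client jars on the classpath to actually run.]

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class TimeRangeScanJob {

  // Trivial mapper: only rows that pass the scan's time range and
  // filter ever reach map(), so the selection happens server-side.
  static class NewRecordMapper
      extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value,
        Context context) throws IOException, InterruptedException {
      // process the new record here
    }
  }

  public static void main(String[] args) throws Exception {
    long now = System.currentTimeMillis();
    long oneDayAgo = now - 24L * 60 * 60 * 1000;

    Scan scan = new Scan();
    // Only return cells whose timestamp is in [oneDayAgo, now).
    scan.setTimeRange(oneDayAgo, now);

    // Only return rows where column f:m equals 5
    // (assuming m is stored as an 8-byte long).
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
        Bytes.toBytes("f"), Bytes.toBytes("m"),
        CompareOp.EQUAL, Bytes.toBytes(5L));
    // Skip rows that do not have the f:m column at all.
    filter.setFilterIfMissing(true);
    scan.setFilter(filter);

    scan.setCaching(500);        // fewer RPC round trips per mapper
    scan.setCacheBlocks(false);  // recommended for MapReduce scans

    Job job = new Job(HBaseConfiguration.create(), "scan-new-records");
    job.setJarByClass(TimeRangeScanJob.class);
    TableMapReduceUtil.initTableMapperJob(
        "mytable",                       // hypothetical table name
        scan, NewRecordMapper.class,
        ImmutableBytesWritable.class, Result.class, job);
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Because both the time range and the filter are evaluated on the region servers, only matching rows are shipped to the mappers, rather than every row being iterated client-side.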
Thanks.
Alex.

-----Original Message-----
From: Ted Yu <[email protected]>
To: user <[email protected]>
Sent: Fri, Feb 8, 2013 11:05 am
Subject: Re: split table data into two or more tables

bq. in a cluster of 2 nodes +1 master

I assume you're limited by hardware in that regard.

bq. job selects these new records

Have you used time-range scan?

Cheers

On Fri, Feb 8, 2013 at 10:59 AM, <[email protected]> wrote:

> Hi,
>
> The rationale is that I have a mapred job that constantly adds new records
> to an hbase table.
> The next mapred job selects these new records, but it must iterate over
> all records and check whether each is a candidate for selection.
> Since there are too many old records, iterating through them in a cluster
> of 2 nodes + 1 master takes about 2 days. So I thought splitting them into
> two tables would reduce this time, and as soon as I figure out that there
> are no more new records left in one of the new tables, I will not run the
> mapred job on it.
>
> Currently, we have 7 regions including ROOT and META.
>
> Thanks.
> Alex.
>
>
> -----Original Message-----
> From: Ted Yu <[email protected]>
> To: user <[email protected]>
> Sent: Fri, Feb 8, 2013 10:40 am
> Subject: Re: split table data into two or more tables
>
> May I ask the rationale behind this?
> Were you aiming for higher write throughput?
>
> Please also tell us how many regions you have in the current table.
>
> Thanks
>
> BTW please consider upgrading to 0.94.4
>
> On Fri, Feb 8, 2013 at 10:36 AM, <[email protected]> wrote:
>
> > Hello,
> >
> > I wondered if there is a way of splitting data from one table into two
> > or more tables in hbase with identical schemas, i.e. if table A has
> > 100M records, put 50M into table B, 50M into table C, and delete
> > table A.
> > Currently, I use hbase-0.92.1 and hadoop-1.4.0
> >
> > Thanks.
> > Alex.
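[For completeness, the original question of splitting one table's rows between two tables could be sketched roughly as below, again against the 0.92-era client API. The table names A, B, C and the alternating split rule are illustrative assumptions; a MapReduce job with TableMapReduceUtil would be the more scalable variant of the same idea.]

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class SplitTableRows {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    // Hypothetical table names; B and C must already exist
    // with the same column families as A.
    HTable source = new HTable(conf, "A");
    HTable[] targets = { new HTable(conf, "B"), new HTable(conf, "C") };
    try {
      Scan scan = new Scan();
      scan.setCaching(500);  // fewer RPC round trips
      ResultScanner scanner = source.getScanner(scan);
      long i = 0;
      for (Result r : scanner) {
        Put put = new Put(r.getRow());
        for (KeyValue kv : r.raw()) {
          put.add(kv);  // copy every cell, preserving timestamps
        }
        // Alternate rows between B and C; any deterministic
        // rule (e.g. a hash of the row key) works equally well.
        targets[(int) (i++ % 2)].put(put);
      }
      scanner.close();
    } finally {
      for (HTable t : targets) {
        t.close();
      }
      source.close();
    }
  }
}
```

Once the copy is verified, table A can be disabled and dropped. Note, though, that as Ted's reply suggests, a time-range scan may remove the need to split the table at all.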
