In 0.94, there is an optimization in StoreFileScanner.requestSeek() where a real seek is only done when seekTimestamp > maxTimestampInFile.
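To take advantage of that, the selecting job can restrict its Scan to the time range written since the previous run, as suggested below. Just a sketch: the table name "records", the column setup and the lastRunTime bookkeeping are placeholders for whatever your setup uses, not something I know about your cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class NewRecordsJob {

  // Mapper only sees rows with cells inside the scan's time range.
  static class NewRecordMapper extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws java.io.IOException, InterruptedException {
      // ... your selection logic for "new" records goes here ...
      context.write(row, value);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "select-new-records");
    job.setJarByClass(NewRecordsJob.class);

    // Hypothetical: timestamp persisted at the end of the previous run.
    long lastRunTime = Long.parseLong(args[0]);

    Scan scan = new Scan();
    scan.setTimeRange(lastRunTime, Long.MAX_VALUE); // only cells written since the last run
    scan.setCaching(500);        // bigger batches per RPC for a full MR scan
    scan.setCacheBlocks(false);  // don't pollute the block cache from MapReduce

    TableMapReduceUtil.initTableMapperJob(
        "records",               // hypothetical source table
        scan,
        NewRecordMapper.class,
        ImmutableBytesWritable.class,
        Result.class,
        job);
    job.setNumReduceTasks(0);    // map-only sketch
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With the time range set, store files whose newest cell is older than lastRunTime can be skipped on the server side, so the job no longer pays for the old records on every run.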
I suggest upgrading to 0.94.4 so that you can utilize this facility.

On Fri, Feb 8, 2013 at 11:04 AM, Ted Yu <[email protected]> wrote:
> bq. in a cluster of 2 nodes +1 master
> I assume you're limited by hardware in that regard.
>
> bq. job selects these new records
> Have you used a time-range scan?
>
> Cheers
>
> On Fri, Feb 8, 2013 at 10:59 AM, <[email protected]> wrote:
>
>> Hi,
>>
>> The rationale is that I have a mapred job that constantly adds new
>> records to an hbase table.
>> The next mapred job selects these new records, but it must iterate over
>> all records and check whether each is a candidate for selection.
>> Since there are too many old records, iterating through them in a cluster
>> of 2 nodes + 1 master takes about 2 days. So I thought splitting them into
>> two tables would reduce this time, and as soon as I figure out that there
>> are no more new records left in one of the new tables I will not run the
>> mapred job on it.
>>
>> Currently, we have 7 regions including ROOT and META.
>>
>> Thanks.
>> Alex.
>>
>> -----Original Message-----
>> From: Ted Yu <[email protected]>
>> To: user <[email protected]>
>> Sent: Fri, Feb 8, 2013 10:40 am
>> Subject: Re: split table data into two or more tables
>>
>> May I ask the rationale behind this?
>> Were you aiming for higher write throughput?
>>
>> Please also tell us how many regions you have in the current table.
>>
>> Thanks
>>
>> BTW please consider upgrading to 0.94.4
>>
>> On Fri, Feb 8, 2013 at 10:36 AM, <[email protected]> wrote:
>>
>> > Hello,
>> >
>> > I wondered if there is a way of splitting data from one table into two
>> > or more tables in hbase with identical schemas, i.e. if table A has 100M
>> > records, put 50M into table B, 50M into table C and delete table A.
>> > Currently, I use hbase-0.92.1 and hadoop-1.4.0.
>> >
>> > Thanks.
>> > Alex.
