Re: CopyTable job is really slow

Jean-Daniel Cryans Mon, 02 May 2011 17:57:47 -0700

What about the size of the pipe between the clusters?
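
For a rough sense of scale, the figures quoted below (14GB, 38 minutes, 10 nodes with 2 map slots each) can be turned into rates. This is only a back-of-the-envelope sketch: the class and method names are made up for illustration, and it assumes 1GB = 1024MB and that all 20 map slots were busy for the whole run.

```java
// Back-of-the-envelope rates for the copy described in this thread.
// All inputs come from the thread; 1 GB = 1024 MB is an assumption.
final class CopyRate {
    // Aggregate transfer rate in MB/s for a copy of the given size and duration.
    static double mbPerSecond(double gigabytes, double minutes) {
        return gigabytes * 1024.0 / (minutes * 60.0);
    }

    // Average number of rows copied per second.
    static double rowsPerSecond(long rows, double minutes) {
        return rows / (minutes * 60.0);
    }

    public static void main(String[] args) {
        int mapSlots = 10 * 2; // 10 nodes, 2 map slots per node
        double aggregate = mbPerSecond(14.0, 38.0);
        System.out.printf("aggregate: %.2f MB/s%n", aggregate);
        System.out.printf("per map slot: %.2f MB/s%n", aggregate / mapSlots);
        System.out.printf("rows: %.0f rows/s%n", rowsPerSecond(20_000_000L, 38.0));
    }
}
```

That works out to roughly 6.3 MB/s in aggregate, which is the number to compare against the pipe between the clusters.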

Did you pre-split? If you copy again, is it faster?


How did you check that scanner caching was picked up?
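
One way to check, sketched here with plain Java only: ask the class loader whether it can see hbase-site.xml, since configuration files of this kind are looked up as classpath resources. The class name is hypothetical, and the result only means something if it is run with the same classpath the MapReduce tasks get, which is an assumption you would need to verify for your setup.

```java
import java.net.URL;

// Reports whether a resource such as hbase-site.xml is visible on the
// classpath of the JVM running this class. Hypothetical helper, not
// part of HBase or Hadoop.
final class ClasspathCheck {
    // Returns the URL the resource would be loaded from, or null if it is not visible.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("hbase-site.xml");
        if (url == null) {
            System.out.println("hbase-site.xml is NOT on the classpath");
        } else {
            System.out.println("hbase-site.xml found at " + url);
        }
    }
}
```

If the file is not found, the job would fall back to the default scanner caching rather than the 500 you configured.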

Thx,

J-D

On Mon, May 2, 2011 at 5:49 PM, Harold Lim <[email protected]> wrote:
> Hi J-D,
>
> I have around 14GB of data to copy. It's a table containing 20 million rows
> with 5 columns. One of the columns has ~20 versions. The table has 64 regions.
> I have a 10-node cluster copying to another 10-node cluster, with 2 map
> slots per node. It took 38 minutes to finish copying the table.
>
> I checked and the scanner caching is already picked up.
>
>
>
> Thanks,
> Harold
>
>
> --- On Mon, 5/2/11, Jean-Daniel Cryans <[email protected]> wrote:
>
>> From: Jean-Daniel Cryans <[email protected]>
>> Subject: Re: CopyTable job is really slow
>> To: [email protected]
>> Date: Monday, May 2, 2011, 8:33 PM
>> That's a very vague question... What's "slow" exactly? How much data do
>> you have to copy? What transfer rate are you expecting? What's the
>> hardware like? How big is the pipe between the two clusters?
>>
>> The best I could tell you would be to make sure that the target table
>> is already pre-split, that you are using enough mappers on the source
>> cluster and that the scanner caching is really picked up by the
>> MapReduce job (meaning that hbase-site.xml is in Hadoop's classpath).
>>
>> J-D
>>
>> On Mon, May 2, 2011 at 5:24 PM, Harold Lim <[email protected]> wrote:
>> > Hi,
>> >
>> > I'm trying to copy a table from one HBase cluster to another HBase
>> > cluster using the CopyTable job. It seems to be very slow. Any tips to
>> > improve the performance? Or is it really like that?
>> >
>> > I have already set hbase.client.scanner.caching = 500.
>> >
>> >
>> > Thanks,
>> > Harold
>> >
>>
>
