What about the size of the pipe between the clusters? Did you pre-split? If you copy again, is it faster?
How did you check that scanner caching was picked up?

Thx,

J-D

On Mon, May 2, 2011 at 5:49 PM, Harold Lim <[email protected]> wrote:
> Hi J-D,
>
> I have around 14GB of data to copy. It's a table containing 20 million rows
> with 5 columns. One of the columns has ~20 versions. The table has 64 regions.
> I have a cluster size of 10 copying to another cluster size of 10, with 2 map
> slots per node. It took 38 minutes to finish copying the table.
>
> I checked and the scanner caching is already picked up.
>
> Thanks,
> Harold
>
> --- On Mon, 5/2/11, Jean-Daniel Cryans <[email protected]> wrote:
>
>> From: Jean-Daniel Cryans <[email protected]>
>> Subject: Re: CopyTable job is really slow
>> To: [email protected]
>> Date: Monday, May 2, 2011, 8:33 PM
>>
>> That's a very vague question... what's "slow" exactly? How much data
>> do you have to copy? What transfer rate are you expecting? What's the
>> hardware like? How big is the pipe between the two clusters?
>>
>> The best I could tell you would be to make sure that the target table
>> is already pre-split, that you are using enough mappers on the source
>> cluster and that the scanner caching is really picked up by the
>> mapreduce job (meaning that hbase-site.xml is in Hadoop's classpath).
>>
>> J-D
>>
>> On Mon, May 2, 2011 at 5:24 PM, Harold Lim <[email protected]> wrote:
>> > Hi,
>> >
>> > I'm trying to copy a table from one hbase cluster to another hbase
>> > cluster using the CopyTable job. It seems to be very slow. Any tips
>> > to improve the performance? Or is it really like that?
>> >
>> > I have already set hbase.client.scanner.caching = 500.
>> >
>> > Thanks,
>> > Harold
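
For reference, the figures quoted in the thread (14GB, 20 million rows, 38 minutes, 10 nodes with 2 map slots each) can be turned into rough transfer rates. This is only a back-of-the-envelope sketch: it assumes 1 GB = 1024 MB and that all 20 map slots on the source cluster were busy for the whole run, neither of which is stated in the thread. The variable names are illustrative.

```python
# Figures taken from the thread.
data_gb = 14             # total table size
rows = 20_000_000        # row count
elapsed_min = 38         # wall-clock time of the CopyTable job
nodes = 10               # source cluster size
map_slots_per_node = 2

# Assumptions (not stated in the thread): 1 GB = 1024 MB, all slots busy.
data_mb = data_gb * 1024
elapsed_s = elapsed_min * 60
map_slots = nodes * map_slots_per_node

total_mb_per_s = data_mb / elapsed_s
per_slot_mb_per_s = total_mb_per_s / map_slots
rows_per_s = rows / elapsed_s

print(f"aggregate rate: {total_mb_per_s:.2f} MB/s")
print(f"per map slot:   {per_slot_mb_per_s:.2f} MB/s")
print(f"rows copied:    {rows_per_s:.0f} rows/s")
```

Under those assumptions the job moves only about 6 MB/s in aggregate, which is why the questions about the inter-cluster pipe, pre-splitting of the target table, and whether scanner caching actually took effect are the natural next things to check.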
