Hi J-D, I looked at the job.xml of the mapreduce job and I see the scanner caching is set to the correct value.
You mean the bandwidth between the clusters? Yes, there should be plenty of bandwidth between them. The 2 clusters are in the same "cluster". I just created 2 hbase/hadoop clusters.

How do I pre-split? Can I do that using hbase shell?

Thanks,
Harold

--- On Mon, 5/2/11, Jean-Daniel Cryans <[email protected]> wrote:

> From: Jean-Daniel Cryans <[email protected]>
> Subject: Re: CopyTable job is really slow
> To: [email protected]
> Date: Monday, May 2, 2011, 8:57 PM
>
> What about the size of the pipe between the clusters?
>
> Did you pre-split? If you copy again, is it faster?
>
> How did you check that scanner caching was picked up?
>
> Thx,
>
> J-D
>
> On Mon, May 2, 2011 at 5:49 PM, Harold Lim <[email protected]> wrote:
> > Hi J-D,
> >
> > I have around 14GB of data to copy. It's a table containing 20 million
> > rows with 5 columns. One of the columns has ~20 versions. The table has
> > 64 regions. I have a cluster size of 10 copying to another cluster size
> > of 10, with 2 map slots per node. It took 38 minutes to finish copying
> > the table.
> >
> > I checked and the scanner caching is already picked up.
> >
> > Thanks,
> > Harold
> >
> > --- On Mon, 5/2/11, Jean-Daniel Cryans <[email protected]> wrote:
> >
> >> From: Jean-Daniel Cryans <[email protected]>
> >> Subject: Re: CopyTable job is really slow
> >> To: [email protected]
> >> Date: Monday, May 2, 2011, 8:33 PM
> >>
> >> That's a very vague question... what's "slow" exactly? How much data
> >> do you have to copy? What transfer rate are you expecting? What's the
> >> hardware like? How big is the pipe between the two clusters?
> >>
> >> The best I could tell you would be to make sure that the target table
> >> is already pre-split, that you are using enough mappers on the source
> >> cluster and that the scanner caching is really picked up by the
> >> mapreduce job (meaning that hbase-site.xml is in Hadoop's classpath).
> >>
> >> J-D
> >>
> >> On Mon, May 2, 2011 at 5:24 PM, Harold Lim <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > I'm trying to copy a table from one hbase cluster to another hbase
> >> > cluster using the CopyTable job. It seems to be very slow. Any tips
> >> > to improve the performance? Or is it really like that?
> >> >
> >> > I have already set hbase.client.scanner.caching = 500.
> >> >
> >> > Thanks,
> >> > Harold
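
For reference, the figures quoted in this thread can be turned into a rough transfer rate, which helps answer the "what's slow exactly?" question. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: "14GB" is read as 14 GiB, all 20 map slots (10 nodes x 2 slots) are assumed busy for the full 38 minutes, and the job is assumed to run one map task per source region. The function and key names are illustrative and not part of HBase.

```python
import math


def copy_stats(total_gib, minutes, rows, map_slots, regions):
    """Rough throughput figures for a table copy (illustrative helper)."""
    seconds = minutes * 60
    total_mib = total_gib * 1024  # assumption: "GB" in the thread means GiB
    return {
        # Aggregate rate across the whole job
        "mib_per_sec": total_mib / seconds,
        # Average rate per map slot, assuming every slot is busy throughout
        "mib_per_sec_per_slot": total_mib / seconds / map_slots,
        "rows_per_sec": rows / seconds,
        # Assumption: one map task per source region, so the tasks
        # run in this many "waves" over the available slots
        "map_waves": math.ceil(regions / map_slots),
    }


# Numbers taken from the thread: 14GB, 38 minutes, 20 million rows,
# 10 nodes with 2 map slots each, 64 regions.
stats = copy_stats(total_gib=14, minutes=38, rows=20_000_000,
                   map_slots=10 * 2, regions=64)

for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the copy averages roughly 6 MiB/s in aggregate, well below what a network link between co-located clusters would normally sustain, which is consistent with looking at pre-splitting and mapper count before bandwidth.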
