https://issues.apache.org/jira/browse/HBASE-2434 has been logged.
On Sun, Apr 11, 2010 at 7:09 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> Yes, an option could be added, along with a write buffer option for Import.
>
> J-D
>
> On Sun, Apr 11, 2010 at 3:30 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > I noticed mapreduce.Export.createSubmittableJob() doesn't call setCaching()
> > in 0.20.3.
> >
> > Should a call to setCaching() be added?
> >
> > Thanks
> >
> > On Sun, Apr 11, 2010 at 2:14 AM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> >
> >> A map against an HBase table by default cannot have more tasks than the
> >> number of regions in that table.
> >>
> >> Also, you want to enable scanner caching. Pass a Scan object to
> >> TableMapReduceUtil.initTableMapperJob that is configured with
> >> scan.setCaching(some_value), where the value should be the number of
> >> rows to fetch every time we hit a region server with next(). On rows
> >> of 100-200 bytes, our jobs are usually configured with 1000 up to
> >> 10000.
> >>
> >> Finally, is your job running in local mode or on a job tracker? Even
> >> though HBase uses HDFS, it usually doesn't know of the job tracker unless
> >> you configure HBase's classpath with Hadoop's conf.
> >>
> >> J-D
> >>
> >> On Sun, Apr 11, 2010 at 3:17 AM, Andriy Kolyadenko
> >> <cryp...@mail.saturnfans.com> wrote:
> >> > Hi,
> >> >
> >> > thanks for the quick response. I tried to do the following in the code:
> >> >
> >> > job.getConfiguration().setInt("mapred.map.tasks", 10000);
> >> >
> >> > but unfortunately got the same result.
> >> >
> >> > Any other ideas?
> >> >
> >> > --- ama...@gmail.com wrote:
> >> >
> >> > From: Amandeep Khurana <ama...@gmail.com>
> >> > To: hbase-user@hadoop.apache.org, cryp...@mail.saturnfans.com
> >> > Subject: Re: set number of map tasks for HBase MR
> >> > Date: Sat, 10 Apr 2010 20:04:18 -0700
> >> >
> >> > You can set the number of map tasks in your job config to a big number (e.g.
> >> > 100000), and the library will automatically spawn one map task per region.
> >> >
> >> > -ak
> >> >
> >> > Amandeep Khurana
> >> > Computer Science Graduate Student
> >> > University of California, Santa Cruz
> >> >
> >> > On Sat, Apr 10, 2010 at 7:59 PM, Andriy Kolyadenko <
> >> > cryp...@mail.saturnfans.com> wrote:
> >> >
> >> >> Hi guys,
> >> >>
> >> >> I have an ~8 GB HBase table and I want to run an MR job against it. It works
> >> >> extremely slowly in my case. One thing I noticed is that the job runs only 2
> >> >> map tasks. Is there any way to set a bigger number of map tasks? I saw some
> >> >> method in the mapred package, but can't find anything like it in the new
> >> >> mapreduce package.
> >> >>
> >> >> I run my MR job on a single machine in cluster mode.
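Pulling the thread's advice together, here is a minimal sketch of a table-mapper job that enables scanner caching via a configured Scan passed to TableMapReduceUtil.initTableMapperJob. This is not code from the thread: the table name, job name, and RowCountMapper class are hypothetical placeholders, and exact API details vary across HBase versions.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class CachingScanJob {

  // Hypothetical mapper: just counts rows to keep the example small.
  static class RowCountMapper
      extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value,
        Context context) throws IOException, InterruptedException {
      context.getCounter("scan", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "caching-scan-example");
    job.setJarByClass(CachingScanJob.class);

    Scan scan = new Scan();
    // Fetch 1000 rows per next() round trip to a region server;
    // for small rows (100-200 bytes) values up to 10000 are used
    // in the thread above.
    scan.setCaching(1000);

    TableMapReduceUtil.initTableMapperJob(
        "mytable",             // hypothetical table name
        scan,                  // the caching-configured Scan
        RowCountMapper.class,
        ImmutableBytesWritable.class,
        Result.class,
        job);

    // Map-only job; no reducer or output files needed.
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that even with this configuration the framework still creates one map task per region, so setting mapred.map.tasks has no effect on a table scan job; the knob that matters for throughput here is the caching value.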