I wasn't clear in my previous email.
It was not an answer to why the map tasks got stuck.
TableInputFormatBase.getSplits() is already being called.

Can you try getting a jstack of one of the map tasks before the task tracker
kills it?
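
jstack is normally run as "jstack <pid>" against the child JVM of the stuck
task. If that is hard to time before the 600 second kill, a rough alternative
is to have the JVM print its own thread stacks. The sketch below uses only
plain JDK calls; the class name ThreadDumpSketch and the method
dumpAllThreads() are hypothetical names of mine, not part of HBase or Hadoop,
and where you call it from (for example a timer thread started by the mapper)
is left to you.

```java
import java.util.Map;

// Hypothetical helper, not part of HBase or Hadoop: collects the stack
// traces of all live threads in the current JVM, similar in spirit to
// what jstack reports from outside the process.
public class ThreadDumpSketch {

    /** Returns one block of text per live thread: name, state, stack frames. */
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Written to stderr so it would land in the task's stderr log.
        System.err.print(dumpAllThreads());
    }
}
```

The interesting part of either dump is the stack of the thread running the
map: it should show whether the task is waiting inside the scanner call or
somewhere else.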

Thanks

On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter <[email protected]> wrote:

> 1. Currently every map gets one region. So I don't understand what
> difference will it make using the splits.
> 2. How should I use the TableInputFormatBase.getSplits() ? Could not find
> examples for that.
>
> Thanks,
> Lior
>
>
> On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu <[email protected]> wrote:
>
> > For #2, see TableInputFormatBase.getSplits():
> >   * Calculates the splits that will serve as input for the map tasks. The
> >   * number of splits matches the number of regions in a table.
> >
> >
> > On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter <[email protected]>
> > wrote:
> >
> > > 1. yes - I configure my job using this line:
> > > TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME, scan,
> > > ScanMapper.class, Text.class, MapWritable.class, job)
> > >
> > > which internally uses TableInputFormat.class
> > >
> > > 2. One split per region ? What do you mean ? How do I do that ?
> > >
> > > 3. hbase version 0.90.2
> > >
> > > 4. no exceptions. the logs are very clean.
> > >
> > >
> > >
> > > On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu <[email protected]> wrote:
> > >
> > > > Do you use TableInputFormat ?
> > > > To scan large number of rows, it would be better to produce one Split
> > > > per region.
> > > >
> > > > What HBase version do you use ?
> > > > Do you find any exception in master / region server logs around the
> > > > moment of timeout ?
> > > >
> > > > Cheers
> > > >
> > > > On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > > I'm running a scan using the M/R framework.
> > > > > My table contains hundreds of millions of rows and I'm scanning using
> > > > > start/stop key about 50 million rows.
> > > > >
> > > > > The problem is that some map tasks get stuck and the task manager kills
> > > > > these maps after 600 seconds. When retrying the task everything works fine
> > > > > (sometimes).
> > > > >
> > > > > To verify that the problem is in hbase (and not in the map code) I removed
> > > > > all the code from my map function, so it looks like this:
> > > > > public void map(ImmutableBytesWritable key, Result value, Context context)
> > > > > throws IOException, InterruptedException {
> > > > > }
> > > > >
> > > > > Also, when the map got stuck on a region, I tried to scan this region
> > > > > (using simple scan from a Java main) and it worked fine.
> > > > >
> > > > > Any ideas ?
> > > > >
> > > > > Thanks,
> > > > > Lior
> > > > >
> > > >
> > >
> >
>
