> I have a simple MR job, and I want each Mapper to get one line from my input file (which contains further instructions for lengthy processing).
Use the NLineInputFormat class.
http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/org/apache/hadoop/mapreduce/lib/input/NLineInputFormat.html

Praveen

On Thu, Feb 2, 2012 at 9:43 AM, Mark Kerzner <[email protected]> wrote:

> Thanks!
> Mark
>
> On Wed, Feb 1, 2012 at 7:44 PM, Anil Gupta <[email protected]> wrote:
>
> > Yes, if ur block size is 64mb. Btw, block size is configurable in Hadoop.
> >
> > Best Regards,
> > Anil
> >
> > On Feb 1, 2012, at 5:06 PM, Mark Kerzner <[email protected]> wrote:
> >
> > > Anil,
> > >
> > > do you mean one block of HDFS, like 64MB?
> > >
> > > Mark
> > >
> > > On Wed, Feb 1, 2012 at 7:03 PM, Anil Gupta <[email protected]> wrote:
> > >
> > > > Do u have enough data to start more than one mapper?
> > > > If the entire data is less than a block size then only 1 mapper will run.
> > > >
> > > > Best Regards,
> > > > Anil
> > > >
> > > > On Feb 1, 2012, at 4:21 PM, Mark Kerzner <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I have a simple MR job, and I want each Mapper to get one line from my
> > > > > input file (which contains further instructions for lengthy processing).
> > > > > Each line is 100 characters long, and I tell Hadoop to read only 100 bytes,
> > > > >
> > > > > job.getConfiguration().setInt("mapreduce.input.linerecordreader.line.maxlength", 100);
> > > > >
> > > > > I see that this part works - it reads only one line at a time, and if I
> > > > > change this parameter, it listens.
> > > > >
> > > > > However, on a cluster only one node receives all the map tasks. Only one
> > > > > map task is started. The others never get anything, they just wait. I've
> > > > > added a 100-second wait to the mapper - no change!
> > > > >
> > > > > Any advice?
> > > > >
> > > > > Thank you. Sincerely,
> > > > > Mark
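[Editor's note: a minimal driver sketch of the NLineInputFormat approach suggested above, assuming the new org.apache.hadoop.mapreduce API. The class names NLineDriver and LineDrivenMapper are placeholders, and Job.getInstance is the Hadoop 2.x factory method (older releases such as 0.21/1.x would use new Job(conf) instead).]

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class NLineDriver {

  // With NLineInputFormat, each call to map() receives one line of the
  // instruction file as its value (the key is the byte offset of the line).
  public static class LineDrivenMapper
      extends Mapper<LongWritable, Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws java.io.IOException, InterruptedException {
      // ... lengthy processing driven by the instruction line goes here ...
      context.write(line, NullWritable.get());
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "one-line-per-mapper");
    job.setJarByClass(NLineDriver.class);

    // One input split (and therefore one map task) per line of the input
    // file, regardless of the HDFS block size.
    job.setInputFormatClass(NLineInputFormat.class);
    NLineInputFormat.setNumLinesPerSplit(job, 1);

    job.setMapperClass(LineDrivenMapper.class);
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(NullWritable.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

[The lines-per-split count can also be set through the mapreduce.input.lineinputformat.linespermap property instead of the setNumLinesPerSplit call. This addresses the single-mapper symptom in the original question: with the default TextInputFormat, a file smaller than one HDFS block yields a single split and hence a single map task, no matter how many lines it contains.]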
