This doesn't solve the issue, because it sets the total number of
map/reduce tasks. When I set the total number of map tasks, I get an
ArrayIndexOutOfBoundsException inside Hadoop; I believe this is because
of the input dataset size (around 90 million lines).

I think it is important to distinguish between setting the total number
of map/reduce tasks and the number that can run (per job) at any given
time.  I would like to restrict only the latter, while allowing Hadoop
to divide the data into chunks as it sees fit.


-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 09, 2008 1:50 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Question on running simultaneous jobs


You may need to upgrade, but 15.1 does just fine with multiple jobs in
the cluster.  Use conf.setNumMapTasks(int) and
conf.setNumReduceTasks(int).


On 1/9/08 11:25 AM, "Xavier Stevens" <[EMAIL PROTECTED]> wrote:

> Does Hadoop support running simultaneous jobs?  If so, what parameters
> do I need to set in my job configuration?  We basically want to give a
> job that takes a really long time, half of the total resources of the
> cluster so other jobs don't queue up behind it.
> 
> I am using Hadoop 0.14.2 currently.  I tried setting 
> mapred.tasktracker.tasks.maximum to be half of the maximum specified 
> in mapred-default.xml.  This shows the change in the web 
> administration page for the job, but it has no effect on the actual 
> numbers of tasks running.
> 
> Thanks,
> 
> Xavier
> 
