This doesn't solve the issue, because it sets the total number of map/reduce tasks. When I set the total number of map tasks, I get an ArrayIndexOutOfBoundsException inside Hadoop; I believe this is because of the size of the input dataset (around 90 million lines).
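For concreteness, the suggestion below amounts to something like this (a minimal sketch against the old org.apache.hadoop.mapred API; MyJob and the task counts are placeholders, not values from this thread):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf(MyJob.class);  // MyJob is a placeholder driver class
    conf.setNumMapTasks(5000);                // total map tasks for the whole job (treated as a hint)
    conf.setNumReduceTasks(20);               // total reduce tasks for the whole job
    JobClient.runJob(conf);
    // Neither call limits how many of those tasks run concurrently at any given moment.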
I think it is important to make a distinction between setting the total number of map/reduce tasks and the number that can run (per job) at any given time. I would like to restrict only the latter, while allowing Hadoop to divide the data into chunks as it sees fit.

-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 09, 2008 1:50 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Question on running simultaneous jobs

You may need to upgrade, but 15.1 does just fine with multiple jobs in the cluster.

Use conf.setNumMapTasks(int) and conf.setNumReduceTasks(int).

On 1/9/08 11:25 AM, "Xavier Stevens" <[EMAIL PROTECTED]> wrote:

> Does Hadoop support running simultaneous jobs? If so, what parameters
> do I need to set in my job configuration? We basically want to give a
> job that takes a really long time half of the total resources of the
> cluster, so other jobs don't queue up behind it.
>
> I am using Hadoop 0.14.2 currently. I tried setting
> mapred.tasktracker.tasks.maximum to be half of the maximum specified
> in mapred-default.xml. This shows the change in the web
> administration page for the job, but it has no effect on the actual
> number of tasks running.
>
> Thanks,
>
> Xavier
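For readers of the archive: the property discussed above is the per-TaskTracker cap, normally set in hadoop-site.xml on each node rather than in a job's configuration. A rough example, with the value chosen here purely for illustration:

    <property>
      <name>mapred.tasktracker.tasks.maximum</name>
      <value>2</value>
      <!-- Read by each TaskTracker from its local config at daemon startup,
           which is presumably why overriding it in a submitted job shows up
           in the web administration page but does not change how many tasks
           actually run. It caps concurrent tasks per node, not per job. -->
    </property>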