The jobs would run in parallel, since J1 doesn't use all of your map
slots: J1 takes 100 of the 150 and J2 starts on the remaining 50.
Things get more interesting with reduce slots. If J1 is the slower job
overall and you haven't configured
mapred.reduce.slowstart.completed.maps, then J1 could launch a bunch
of reduce tasks early. They would sit idle waiting for map output
while holding reduce slots, which would starve J2.
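
To make the map-slot arithmetic concrete, here is a rough sketch in
Python. It is only a simplified model of FIFO ordering, not the actual
JobTracker code; the function name fifo_map_slots is made up for
illustration, and it ignores data locality, task completion, and
reduce slots.

```python
def fifo_map_slots(total_slots, pending_maps):
    """Hand out free map slots to jobs in submission order.

    Simplified, hypothetical model of FIFO scheduling: each job takes as
    many free slots as it has pending map tasks, and whatever is left
    over goes to the next job in the queue.
    """
    free = total_slots
    assigned = []
    for pending in pending_maps:
        take = min(free, pending)  # a job never takes more than it needs
        assigned.append(take)
        free -= take
    return assigned


# J1 has 100 map tasks, J2 has 200, and the cluster has 150 map slots.
print(fifo_map_slots(150, [100, 200]))  # [100, 50]
```

So J2 starts running on 50 slots right away and picks up more as J1's
map tasks finish.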

In general, it's best to configure the slow start property and to use
the fair scheduler or capacity scheduler.
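
As a rough illustration of what the slow start property controls, the
sketch below computes how many of a job's map tasks must finish before
its reduce tasks become eligible to launch. The property is a fraction
between 0 and 1. Rounding up is my assumption about how the fraction
is applied, and the function name is made up for illustration, so
check the behavior against your Hadoop version.

```python
import math


def maps_before_reduces_start(total_maps, slowstart_fraction):
    """Number of completed maps required before reduces may launch.

    slowstart_fraction stands in for the value of
    mapred.reduce.slowstart.completed.maps (0.0 to 1.0).
    Rounding up is an assumption, not confirmed behavior.
    """
    if not 0.0 <= slowstart_fraction <= 1.0:
        raise ValueError("slowstart_fraction must be between 0 and 1")
    return math.ceil(total_maps * slowstart_fraction)


# For J1 with 100 map tasks: a low fraction lets reduces grab slots
# almost immediately, a high one keeps them out until most maps finish.
for fraction in (0.5, 1.0):
    print(fraction, maps_before_reduces_start(100, fraction))
```

The higher the fraction, the less time J1's reduce tasks spend holding
slots while they wait for map output.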

-Joey

On Thu, Sep 22, 2011 at 6:05 AM, Praveen Sripati
<praveensrip...@gmail.com> wrote:
> Hi,
>
> Let's assume that there are two jobs, J1 (100 map tasks) and J2 (200 map
> tasks), and the cluster has a capacity of 150 map tasks (15 nodes with 10 map
> tasks per node) and Hadoop is using the default FIFO scheduler. If I submit
> J1 first and then J2, will the jobs run in parallel, or does J1 have to
> complete before J2 starts?
>
> I was reading 'Hadoop - The Definitive Guide'  and it says "Early versions
> of Hadoop had a very simple approach to scheduling users’ jobs: they ran in
> order of submission, using a FIFO scheduler. Typically, each job would use
> the whole cluster, so jobs had to wait their turn."
>
> Thanks,
> Praveen
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
