Yes, you're correct that the end-result is not going to be as static
as you expect it to be. FWIW, per-node limit configs have been
discussed before (and even implemented + removed):
https://issues.apache.org/jira/browse/HADOOP-5170
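
For illustration only, that kind of per-job, per-node cap would have been
set from the job side, roughly along these lines. The property names and
values below are placeholders I am making up to show the shape of the idea;
since the feature was removed, stock releases will simply ignore them:

    // Hypothetical sketch only: the per-node limit properties below are
    // placeholders for what the removed HADOOP-5170 feature aimed at; they
    // are NOT recognized by stock Hadoop releases.
    import org.apache.hadoop.mapred.JobConf;

    public class PerNodeLimitSketch {
        public static JobConf configure() {
            JobConf conf = new JobConf(PerNodeLimitSketch.class);
            conf.setInt("mapred.job.max.maps.per.node", 3);    // placeholder name, not a real property
            conf.setInt("mapred.job.max.reduces.per.node", 3); // placeholder name, not a real property
            return conf;
        }
    }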

On Fri, May 24, 2013 at 1:47 PM, Steve Lewis <lordjoe2...@gmail.com> wrote:
> My reading on Capacity Scheduling is that it controls the number of jobs
> scheduled at the level of the cluster.
> My issue is not sharing at the level of the cluster - usually my job is the
> only one running but rather at the level of
> the individual machine.
>   Some of my jobs require more memory and do significant processing,
> especially in the reducer. While the cluster can schedule 8 smaller jobs
> on a node, when, say, 8 of the larger ones are scheduled, slaves run out of
> swap space and tend to crash.
>   It is not clear that limiting the number of jobs on the cluster will
> stop a scheduler from scheduling the maximum allowed jobs on any node.
>   Even requesting multiple slots for a job affects the number of jobs
> running on the cluster but not on any specific node.
>   Am I wrong here? If I want, say, only three of my jobs running on one
> node, does asking for enough slots to guarantee that the total number of
> jobs is no more than 3 times the number of nodes guarantee this?
>    My read is that the total number of running jobs might be throttled but
> not the number per node.
>   Perhaps a clever use of queues might help, but I am not quite sure about
> the details.
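
To make the slots question concrete: with the CapacityScheduler's
memory-based scheduling, a task that asks for more memory than one slot
provides is charged multiple slots on whichever node it runs on, so the
per-node count does drop. A rough sketch of the arithmetic, with made-up
numbers (8 reduce slots of 1 GB each per node, 2 GB requested per reduce):

    // Rough arithmetic sketch with assumed numbers; the properties named in
    // the comments are the Hadoop 1.x memory-based scheduling knobs, but the
    // values here are invented for illustration.
    public class PerNodeTaskMath {
        public static void main(String[] args) {
            int reduceSlotsPerNode = 8;  // mapred.tasktracker.reduce.tasks.maximum
            int memoryPerSlotMb = 1024;  // mapred.cluster.reduce.memory.mb (assumed)
            int taskRequestMb = 2048;    // mapred.job.reduce.memory.mb for the big job
            // A 2048 MB request is charged ceil(2048 / 1024) = 2 slots per task...
            int slotsChargedPerTask = (taskRequestMb + memoryPerSlotMb - 1) / memoryPerSlotMb;
            // ...so at most 8 / 2 = 4 such reduces run on one node at a time.
            int tasksPerNode = reduceSlotsPerNode / slotsChargedPerTask;
            System.out.println("Concurrent reduce tasks of this job per node: " + tasksPerNode);
        }
    }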
>
>
> On Thu, May 23, 2013 at 4:37 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Your problem seems to revolve around available memory and over-subscription.
>> If you're using a 0.20.x or 1.x version of Apache Hadoop, you probably want
>> to use the CapacityScheduler to address this for you.
>>
>> I once detailed how-to, on a similar question here:
>> http://search-hadoop.com/m/gnFs91yIg1e
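
For completeness, the job-side half of that setup looks roughly like the
sketch below on Hadoop 1.x with the CapacityScheduler; the cluster-side
slot-size settings (mapred.cluster.*.memory.mb) are set by the admin, and
the memory values and class name here are placeholders, not recommendations:

    // Minimal job-side sketch, assuming Hadoop 1.x with the CapacityScheduler
    // and memory-based scheduling enabled cluster-side; the values are
    // placeholders and the class name is a stand-in.
    import org.apache.hadoop.mapred.JobConf;

    public class HighMemoryJobSketch {
        public static JobConf configure() {
            JobConf conf = new JobConf(HighMemoryJobSketch.class);
            conf.set("mapred.child.java.opts", "-Xmx2048m");   // ~2 GB heap per task, as in this thread
            conf.setInt("mapred.job.reduce.memory.mb", 2560);  // ask the scheduler for ~2.5 GB per reduce
            conf.setInt("mapred.job.map.memory.mb", 1024);     // maps keep a one-slot-sized request
            return conf;
        }
    }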
>>
>>
>> On Wed, May 22, 2013 at 2:55 PM, Steve Lewis <lordjoe2...@gmail.com>
>> wrote:
>>
>> > I have a series of Hadoop jobs to run, and one of my jobs requires larger
>> > than standard memory: I allow each task to use 2GB of memory. When I run
>> > some of these jobs, the slave nodes crash because they run out of swap
>> > space. It is not that a slave could not run one, or even 4, of these jobs,
>> > but 8 stresses the limits.
>> >  I could cut the mapred.tasktracker.reduce.tasks.maximum for the entire
>> > cluster but this cripples the whole cluster for one of many jobs.
>> > It seems to be a very bad design
>> > a) to allow the job tracker to keep assigning tasks to a slave that is
>> > already getting low on memory
>> > b) to allow the user to run jobs capable of crashing nodes on the cluster
>> > c) not to allow the user to specify that some jobs need to be limited to
>> > a lower value without requiring this limit for every job.
>> >
>> > Are there plans to fix this??
>> >
>> > --
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com



--
Harsh J
