Harsh - I'd be inclined to think it's worse than just setting 
mapred.jobtracker.completeuserjobs.maximum - the only case this would solve 
is if a single user submitted 25 *large* jobs (in terms of task count) over a 
single 24-hr window.
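For reference, the retention setting under discussion would look something like this in mapred-site.xml (the value of 5 follows Harsh's suggestion further down the thread; adjust to taste):

```xml
<!-- mapred-site.xml: sketch only. Lowers the number of completed jobs
     the JobTracker retains in memory per user (hadoop-1.x default: 100). -->
<property>
  <name>mapred.jobtracker.completeuserjobs.maximum</name>
  <value>5</value>
</property>
```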

David - I'm guessing you aren't using the CapacityScheduler - that would give 
you more controls, limits on jobs, etc.

More details here: 
http://hadoop.apache.org/common/docs/r1.0.3/capacity_scheduler.html

In particular, look at the example config there and let us know if you need 
help understanding any of it.
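As a rough sketch of what the example config on that page covers - note the queue names and percentages here are made up for illustration, not taken from your cluster:

```xml
<!-- capacity-scheduler.xml: illustrative sketch only. "default" and "dev"
     queues and their percentages are assumptions for this example. -->
<property>
  <name>mapred.capacity-scheduler.queue.default.capacity</name>
  <value>70</value>
</property>
<property>
  <name>mapred.capacity-scheduler.queue.dev.capacity</name>
  <value>30</value>
</property>
<property>
  <!-- floor on the share of a queue a single user can be squeezed to -->
  <name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
  <value>25</value>
</property>
```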

Arun

On Jun 9, 2012, at 10:40 PM, Harsh J wrote:

> Hey David,
> 
> Primarily you'd need to lower
> "mapred.jobtracker.completeuserjobs.maximum" in your mapred-site.xml
> to a value below 25. I recommend using 5 if you don't need much
> retention of job info per user. This will help keep the JT's live
> memory usage in check and stop your crashes, instead of you having to
> raise your heap all the time. There's no "leak", but this config's
> default of 100 causes a lot of issues for a JT that runs many jobs
> per day (from several users).
> 
> Try it out and let us know!
> 
> On Sat, Jun 9, 2012 at 12:37 AM, David Rosenstrauch <dar...@darose.net> wrote:
>> We're running 0.20.2 (Cloudera cdh3u4).
>> 
>> What configs are you referring to?
>> 
>> Thanks,
>> 
>> DR
>> 
>> 
>> On 06/08/2012 02:59 PM, Arun C Murthy wrote:
>>> 
>>> This shouldn't be happening at all...
>>> 
>>> What version of hadoop are you running? You are potentially missing
>>> configs that protect the JT; with those in place your hadoop-1.x JT
>>> should be very reliable.
>>> 
>>> Arun
>>> 
>>> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>>> 
>>>> Our job tracker has been seizing up with Out of Memory (heap space)
>>>> errors for the past 2 nights.  After the first night's crash, I doubled the
>>>> heap space (from the default of 1GB) to 2GB before restarting the job.
>>>>  After last night's crash I doubled it again to 4GB.
>>>> 
>>>> This all seems a bit puzzling to me.  I wouldn't have thought that the
>>>> job tracker should require so much memory.  (The NameNode, yes, but not the
>>>> job tracker.)
>>>> 
>>>> Just wondering if this behavior sounds reasonable, or if perhaps there
>>>> might be a bigger problem at play here.  Anyone have any thoughts on the
>>>> matter?
>>>> 
>>>> Thanks,
>>>> 
>>>> DR
>>> 
>>> 
>>> --
>>> Arun C. Murthy
>>> Hortonworks Inc.
>>> http://hortonworks.com/
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
> -- 
> Harsh J

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/

