Hey David,

Primarily, you'd need to lower "mapred.jobtracker.completeuserjobs.maximum" in your mapred-site.xml to a value under 25. I'd recommend 5 if you don't need much retention of per-user job info. This will keep the JT's live memory usage in check and stop your crashes, instead of you having to keep raising the heap. There's no "leak"; this config's default of 100 just causes a lot of trouble for a JT that runs many jobs per day (from several users).
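For reference, the property stanza would look something like this in mapred-site.xml on the JT host (5 being my suggestion above; if I recall right, the JT needs a restart for it to take effect):

    <!-- Cap on completed jobs the JobTracker retains in memory, per user.
         Default is 100; 5 is the suggested value from above. -->
    <property>
      <name>mapred.jobtracker.completeuserjobs.maximum</name>
      <value>5</value>
    </property>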
Try it out and let us know!

On Sat, Jun 9, 2012 at 12:37 AM, David Rosenstrauch <dar...@darose.net> wrote:
> We're running 0.20.2 (Cloudera cdh3u4).
>
> What configs are you referring to?
>
> Thanks,
>
> DR
>
>
> On 06/08/2012 02:59 PM, Arun C Murthy wrote:
>>
>> This shouldn't be happening at all...
>>
>> What version of hadoop are you running? Potentially you need configs to
>> protect the JT that you are missing, those should ensure your hadoop-1.x JT
>> is very reliable.
>>
>> Arun
>>
>> On Jun 8, 2012, at 8:26 AM, David Rosenstrauch wrote:
>>
>>> Our job tracker has been seizing up with Out of Memory (heap space)
>>> errors for the past 2 nights. After the first night's crash, I doubled the
>>> heap space (from the default of 1GB) to 2GB before restarting the job.
>>> After last night's crash I doubled it again to 4GB.
>>>
>>> This all seems a bit puzzling to me. I wouldn't have thought that the
>>> job tracker should require so much memory. (The NameNode, yes, but not the
>>> job tracker.)
>>>
>>> Just wondering if this behavior sounds reasonable, or if perhaps there
>>> might be a bigger problem at play here. Anyone have any thoughts on the
>>> matter?
>>>
>>> Thanks,
>>>
>>> DR
>>
>>
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>>

--
Harsh J