Vijay Murthi wrote:
Are you running the current trunk? My guess is that you are. If so,
then this error is "normal" and things should keep running.
I am using hadoop-0.2.0. I believe this is the current trunk.
No, that's a release. The trunk is what's currently in Subversion.
I used to
think that a child task exiting with "Out of memory" is normal, since the
task can be re-executed on another machine and still finish, whereas the
TaskTracker that manages it should not die. After this message I see only
one TaskTracker running on each node, at 99% CPU all the time, and no
reduce tasks running.
It sounds like these "Out of memory" errors are fatal.
On "mapred" local directory I see it writing to directory of name
"*_r_*". Since every output map task produce is on local disk can't it
just read those reduce files Map task create?
The local map output files are mostly needed by reduces running on other
nodes and must first be transferred.
I am running on a 64-bit kernel with a 32-bit JVM. The Java heap size is
set to a maximum of 1 GB for both the TaskTracker and the child processes.
I believe the TaskTracker and each child process run in their own 1 GB JVM
(correct me if I am wrong). Should each child process have less memory
than the TaskTracker, or should the total memory of the child processes it
manages be less than the TaskTracker's heap, since the TaskTracker creates
the children? In my case I am setting 500 MB of sort memory for each child
reduce process, so 3 reduce tasks * 500 MB can be more than 1 GB and cause
"Out of memory"?
Why are you using 500MB of sort memory with a 1GB heap if it keeps
causing problems? I would suggest either decreasing the sort memory or
increasing the heap size. Better yet, start with the defaults and
change one parameter at a time.
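For illustration, something like the following sets the two values
consistently from the job configuration. This is only a sketch: the property
names (mapred.child.java.opts, io.sort.mb) are the ones used in later Hadoop
releases, so check what your version actually calls them.

    import org.apache.hadoop.mapred.JobConf;

    public class MemoryTuning {
      public static void main(String[] args) {
        JobConf conf = new JobConf(MemoryTuning.class);

        // Heap for each child (map/reduce) JVM. The TaskTracker's own heap
        // is configured separately (e.g. HADOOP_HEAPSIZE in
        // conf/hadoop-env.sh) and is not shared with the children.
        conf.set("mapred.child.java.opts", "-Xmx1024m");

        // In-memory sort buffer per task, in MB. Keep it well below the
        // child heap: 500MB against a 1GB heap leaves little headroom.
        conf.setInt("io.sort.mb", 100);

        // ... set input/output paths and formats, then submit with
        // JobClient.runJob(conf).
      }
    }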
4MB buffers for file streams seems large to me.
I keep a 4 MB buffer because each map task reads a roughly 2 GB gzipped
text file. I thought this would make reading more efficient, and 4 MB * 3
map tasks per node is only about 12 MB. I am not sure why that is a lot.
Again, changing one setting at a time will allow you to better figure
out what improves things and what causes problems. This parameter is
used for lots of files, more than just your input data, so increasing it
to 4MB causes lots of 4MB buffers to be created. I have a hard time ever
seeing a justification for buffers larger than 1MB, as even 100k should
usually be enough for transfer time to dominate seek time, but, since map
and reduce both operate sequentially, even 100k should not be required for
good performance.
So you could even use a sort factor of 500. That would
make sorts a lot faster.
OK, I will try that. I have around 120 reduce files in total, each around
1 GB, for 6 reduce processes.
Please first try things with the defaults. Then try increasing the sort
factor to find if that improves things for you.
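If it helps to see where those knobs live, here is a rough sketch; again the
property names (io.sort.factor, io.file.buffer.size) are the ones I know from
later releases, so treat them as illustrative rather than exact.

    import org.apache.hadoop.mapred.JobConf;

    public class MergeTuning {
      public static void main(String[] args) {
        JobConf conf = new JobConf(MergeTuning.class);

        // How many segments are merged at once when sorting map output.
        // A larger factor means fewer merge passes over the data.
        conf.setInt("io.sort.factor", 100);

        // Per-stream buffer size in bytes. This buffer is allocated for
        // many files, not just the gzipped inputs, so something around
        // 64KB is usually plenty; 4MB multiplies quickly.
        conf.setInt("io.file.buffer.size", 64 * 1024);
      }
    }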
Also, why are you setting the task timeout so high? Do you have mappers or
reducers that take a long time per entry and are not calling
Reporter.setStatus() regularly? That can cause tasks to time out.
Yes. Map tasks sometimes take a long time and get killed. I have a reporter
that sets the status when the record reader is created. Still, things only
show up on the web page after the task exits with a Succeeded or Failed
status.
If processing a single record could take longer than the task timeout
(10 minutes) then you should call setStatus() during the processing of
the record to avoid timeouts. That's a better way to fix this than to
increase the task timeout. Note that setStatus() is efficient: don't
worry about calling it too often.
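For example, something along these lines breaks the per-record work into
pieces and reports between them. It is written against the old, non-generic
org.apache.hadoop.mapred API of that era (newer releases use generified
interfaces), and processPiece() is just a placeholder for your own work:

    import java.io.IOException;

    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class SlowRecordMapper extends MapReduceBase implements Mapper {

      public void map(WritableComparable key, Writable value,
                      OutputCollector output, Reporter reporter)
          throws IOException {
        int pieces = 100;  // split the expensive per-record work into pieces
        for (int i = 0; i < pieces; i++) {
          processPiece(value, i);  // placeholder for the real computation
          // Cheap call: tells the TaskTracker the task is still alive and
          // resets its timeout clock.
          reporter.setStatus("record " + key + ": piece " + (i + 1) + "/" + pieces);
        }
        output.collect(key, value);
      }

      private void processPiece(Writable value, int piece) {
        // the long-running work on one piece of the record goes here
      }
    }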
Doug