Vijay Murthi wrote:
I am trying to understand what happens between the time the map tasks finish and the
reduce tasks start executing. I have 2 machines, each with 4 processors and 4 GB of
RAM, using NFS (not DFS) to process 50 GB of data. The map tasks finish successfully.
After that I see the following in the tasktracker log:
"Exception in thread "Server handler 1 on 50040" java.lang.OutOfMemoryError: Java
heap space"
Are you running the current trunk? My guess is that you are. If so,
then this error is "normal" and things should keep running.
Are you running a 64-bit kernel? If not, can it really take advantage
of all 4GB? In my experience, 32-bit JVMs can't effectively use more
than around 1.5GB, and a 32-bit kernel can't effectively use all 4GB,
but I may be wrong on that last count.
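A quick way to see how much heap the JVM will actually grant is to ask the runtime directly; this is a plain-Java sketch (no Hadoop needed), and on a 32-bit JVM the reported maximum typically tops out well below 4GB regardless of -Xmx:

```java
// Prints the maximum heap this JVM will use. Run with the same -Xmx
// you pass via mapred.child.java.opts to see what is really available.
public class HeapCheck {
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Usable heap: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}
```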
Listed below are the configuration parameters. Am I setting the Java heap size too low
compared to io.sort.mb or the file buffer size? I thought the tasktracker just pushes
the job to the child node; does this happen because of something like moving data? If
so, is there a buffer size limit I can set? Also, I noticed that under the mapred local
directories, the reduce files keep growing even after the tasktracker hits the
out-of-memory error.
Sorting does indeed happen in the child process.
4MB buffers for file streams seems large to me.
You might increase the io.sort.factor. With 500MB for sorting and a
sort factor of 100, each sort stream would get a 5MB buffer, plenty to
ensure that transfer time dominates seek, since the break-even point is
around 100kB. So you could even use a sort factor of 500. That would
make sorts a lot faster.
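Put together, the suggested changes might look like the following (a sketch in the same name/value style as the config dump below; the exact values are illustrative, not prescriptive):

```xml
<name>io.sort.factor</name>
<value>100</value>      <!-- up from 10: 500MB / 100 streams = 5MB per stream -->
<name>io.file.buffer.size</name>
<value>65536</value>    <!-- down from 4096000; 4MB per file stream is large -->
```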
Also why are you setting the task timeout so high? Do you have mappers
or reducers that take a long time per entry and are not calling
Reporter.setStatus() regularly? That can cause tasks to time out.
Doug
-------------------------------------------------------------------
<name>io.sort.factor</name>
<value>10</value>
<name>io.sort.mb</name>
<value>500</value>
<name>io.skip.checksum.errors</name>
<value>false</value>
<name>io.file.buffer.size</name>
<value>4096000</value>
<name>mapred.reduce.tasks</name>
<value>6</value>
<name>mapred.task.timeout</name>
<value>100000000000</value>
<name>mapred.tasktracker.tasks.maximum</name>
<value>3</value>
<name>mapred.child.java.opts</name>
<value>-Xmx1024m</value>
<name>mapred.combine.buffer.size</name>
<value>100000</value>
<name>mapred.speculative.execution</name>
<value>true</value>
<name>ipc.client.timeout</name>
<value>60000</value>
------------------------------------------------------------
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=1024
------------------------------------------------------------