Disk thrashing / task timeouts during map output copy phase
-----------------------------------------------------------

         Key: HADOOP-141
         URL: http://issues.apache.org/jira/browse/HADOOP-141
     Project: Hadoop
        Type: Bug

  Components: mapred  
 Environment: linux
    Reporter: paul sutter



MapOutputProtocol connections time out because the system thrashes and the same 
file is transferred over and over again, so the job ultimately makes no forward 
progress (medium-sized job, 500GB input file, map output about as large as the 
input, 10-node cluster).

There are several bugs behind this, but the following two changes improved 
matters considerably.

(1) 

The buffer size in MapOutputFile is currently hardcoded to 8192 bytes (for both 
reads and writes). Changing this buffer size to 256KB reduced the number of disk 
seeks, and the problem went away.

Ideally there would be a buffer size parameter for this that is separate from 
the DFS io buffer size.
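
A minimal sketch of what such a separate parameter could look like. This is not 
the actual MapOutputFile code: the class name, the property key 
"mapred.mapoutput.buffer.size" and the use of java.util.Properties in place of 
the Hadoop configuration are all assumptions made for illustration. Only the 
8192-byte and 256KB figures come from the report above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

// Illustrative only; not the real MapOutputFile implementation.
final class MapOutputCopySketch {
    // Hypothetical property key, kept separate from the DFS io buffer size.
    static final String BUFFER_SIZE_KEY = "mapred.mapoutput.buffer.size";
    // 256KB, the value that made the thrashing go away in the report.
    static final int DEFAULT_BUFFER_SIZE = 256 * 1024;

    // Reads the buffer size from the configuration, falling back to 256KB.
    static int bufferSize(Properties conf) {
        String value = conf.getProperty(BUFFER_SIZE_KEY);
        return value == null ? DEFAULT_BUFFER_SIZE : Integer.parseInt(value.trim());
    }

    // Copies in to out in chunks of up to bufferSize bytes, so each read
    // or write request can cover a large contiguous range of the file.
    // Returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

With a knob like this, the map output buffer could be tuned on its own without 
touching the DFS setting.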

(2)

I also added the following code to the socket configuration in both Server.java 
and Client.java. Disabling linger is a minor good idea in an environment with 
some packet loss (and you will have that when all the nodes get busy at once). 
The 256KB buffers are probably excessive, especially on a LAN, but it takes me 
two hours to test changes, so I haven't experimented.

socket.setSendBufferSize(256*1024);
socket.setReceiveBufferSize(256*1024);
socket.setSoLinger(false, 0);
socket.setKeepAlive(true);
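
Since the right buffer size is still an open question, the four calls above 
could be gathered into one helper that takes the size as an argument, so that 
different values can be tried without editing both files. A rough sketch; the 
class and method names are made up for illustration and are not part of 
Server.java or Client.java:

```java
import java.net.Socket;
import java.net.SocketException;

// Illustrative helper; not part of the Hadoop IPC code.
final class SocketTuningSketch {
    // Applies the same four options as the snippet above, with the
    // buffer size passed in instead of hardcoded.
    static void tune(Socket socket, int bufferBytes) throws SocketException {
        // Both sizes are hints; the OS may round or cap them.
        socket.setSendBufferSize(bufferBytes);
        socket.setReceiveBufferSize(bufferBytes);
        // Disable SO_LINGER: close() returns without waiting on unsent data.
        socket.setSoLinger(false, 0);
        // Enable TCP keepalive probes on idle connections.
        socket.setKeepAlive(true);
    }
}
```

The sizes the OS actually granted can be read back with getSendBufferSize() and 
getReceiveBufferSize(). Per the java.net.Socket documentation, a receive buffer 
larger than 64K has to be requested before the socket is connected (and on the 
ServerSocket before it is bound, for accepted sockets), so where tune() is 
called matters for the 256KB value.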

