Disk thrashing / task timeouts during map output copy phase
-----------------------------------------------------------
Key: HADOOP-141
URL: http://issues.apache.org/jira/browse/HADOOP-141
Project: Hadoop
Type: Bug
Components: mapred
Environment: linux
Reporter: paul sutter
MapOutputProtocol connections time out because the system thrashes and the
same file is transferred over and over again, so the job ultimately makes no
forward progress (medium-sized job, 500GB input file, map output about as large
as the input, 10-node cluster).
There are several bugs behind this, but the following two changes improved
matters considerably.
(1)
The buffer size in MapOutputFile is currently hardcoded to 8192 bytes (for both
reads and writes). Changing this buffer size to 256KB reduced the number of
disk seeks, and the problem went away.
Ideally there would be a buffer size parameter for this that is separate from
the DFS io buffer size.
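As a rough sketch of what such a separate parameter could look like: the
property name below is hypothetical (it is not an existing Hadoop setting), and
a plain java.util.Properties object stands in for the real configuration class.
The copy loop simply reads and writes in chunks of the configured size, which
is the effect the 256KB change has on the map output transfer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Properties;

class MapOutputCopySketch {
    // Hypothetical property name, used for illustration only.
    static final String BUFFER_SIZE_KEY = "mapred.mapoutput.buffer.size";
    // Default matches the 256KB value that fixed the problem in testing.
    static final int DEFAULT_BUFFER_SIZE = 256 * 1024;

    // Returns the configured buffer size, or the default if the key is unset.
    static int bufferSize(Properties conf) {
        String value = conf.getProperty(BUFFER_SIZE_KEY);
        if (value == null) {
            return DEFAULT_BUFFER_SIZE;
        }
        int size = Integer.parseInt(value.trim());
        if (size <= 0) {
            throw new IllegalArgumentException(BUFFER_SIZE_KEY + " must be positive");
        }
        return size;
    }

    // Copies the stream in chunks of up to bufferSize bytes and returns the
    // number of bytes copied. Larger chunks mean fewer read/write calls.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] chunk = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Keeping the key separate from the DFS io buffer size would let the two be tuned
independently.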
(2)
I also added the following code to the socket configuration in both Server.java
and Client.java. Disabling linger is a minor good idea in an environment with
some packet loss (and you will have that when all the nodes get busy at once).
256KB buffers are probably excessive, especially on a LAN, but it takes me two
hours to test changes, so I haven't experimented.
socket.setSendBufferSize(256*1024);
socket.setReceiveBufferSize(256*1024);
socket.setSoLinger(false, 0);
socket.setKeepAlive(true);
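To make the buffer size easier to experiment with, the four calls above could
be gathered into one helper that takes the size as an argument. This is only a
sketch: the class and method names are made up for illustration and are not
part of Server.java or Client.java.

```java
import java.net.Socket;
import java.net.SocketException;

class SocketTuningSketch {
    // Applies the socket options from this report, with the buffer size
    // passed in so that values smaller than 256KB can be tried.
    static void configure(Socket socket, int bufferBytes) throws SocketException {
        // These sizes are hints; the OS may round or cap them.
        socket.setSendBufferSize(bufferBytes);
        socket.setReceiveBufferSize(bufferBytes);
        // Linger disabled: close() returns without waiting on unsent data.
        socket.setSoLinger(false, 0);
        socket.setKeepAlive(true);
    }
}
```

For example, SocketTuningSketch.configure(socket, 64 * 1024) would try 64KB
buffers. For receive buffers larger than 64KB on a client socket, the Java
documentation says the size must be set before the socket is connected, so the
helper is probably best called on a new, unconnected Socket.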