[ https://issues.apache.org/jira/browse/MAPREDUCE-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524557#comment-14524557 ]
Hadoop QA commented on MAPREDUCE-4506:
--------------------------------------
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12538889/ReduceTask.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5494/console |
This message was automatically generated.
> EofException / 'connection reset by peer' while copying map output
> -------------------------------------------------------------------
>
> Key: MAPREDUCE-4506
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4506
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 1.0.3
> Environment: Ubuntu Linux 12.04 LTS, 64-bit, Java 6 update 33
> Reporter: Piotr Kołaczkowski
> Priority: Minor
> Attachments: RamManager.patch, ReduceTask.patch
>
>
> When running complex MapReduce jobs with many mappers and reducers (e.g. 8
> mappers and 8 reducers on an 8-core machine), the following exceptions
> sometimes show up in the logs during the shuffle phase:
> {noformat}
> WARN [570516323@qtp-2060060479-164] 2012-07-19 02:50:21,229 TaskTracker.java (line 3894) getMapOutput(attempt_201207161621_0217_m_000071_0,0) failed :
> org.mortbay.jetty.EofException
> at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:787)
> at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:568)
> at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1005)
> at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:648)
> at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:579)
> at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3872)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
> at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
> at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> at org.mortbay.jetty.Server.handle(Server.java:326)
> at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
> at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
> at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcher.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
> at sun.nio.ch.IOUtil.write(IOUtil.java:43)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:169)
> at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
> at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:721)
> {noformat}
> At first this looks like a network problem. However, it turns out that
> Hadoop's shuffleInMemory sometimes deliberately closes map-output-copy
> connections, only to reopen them a few milliseconds later, because free
> memory is temporarily unavailable. Because the sending side does not
> expect this, an exception is thrown. This also wastes resources on the
> sender side, which does more work than necessary by serving the
> additional requests.
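
For illustration only, here is a minimal sketch of one way a copier could avoid the close-and-reopen cycle described above: keep the connection open and wait (with a timeout) for in-memory shuffle space, instead of dropping the connection and retrying. The class ShuffleMemoryGate and all of its method names are hypothetical. This is not Hadoop's RamManager API and not the content of the attached RamManager.patch or ReduceTask.patch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only: not Hadoop's RamManager and not the attached patch.
// Tracks a fixed in-memory shuffle budget and lets a copier wait for space
// instead of closing its connection and reconnecting a few milliseconds later.
class ShuffleMemoryGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition memoryFreed = lock.newCondition();
    private final long capacityBytes;
    private long usedBytes = 0;

    ShuffleMemoryGate(long capacityBytes) {
        if (capacityBytes <= 0) {
            throw new IllegalArgumentException("capacityBytes must be positive");
        }
        this.capacityBytes = capacityBytes;
    }

    // Non-blocking attempt: returns false immediately if the budget is exhausted.
    boolean tryReserve(long bytes) {
        lock.lock();
        try {
            if (bytes < 0 || bytes > capacityBytes - usedBytes) {
                return false;
            }
            usedBytes += bytes;
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Blocking attempt: waits up to timeoutMillis for other copiers to release
    // memory. Returns false on timeout, on interruption, or if the request can
    // never fit in the budget.
    boolean reserve(long bytes, long timeoutMillis) {
        if (bytes < 0 || bytes > capacityBytes) {
            return false;
        }
        long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (bytes > capacityBytes - usedBytes) {
                if (nanosLeft <= 0) {
                    return false;
                }
                nanosLeft = memoryFreed.awaitNanos(nanosLeft);
            }
            usedBytes += bytes;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }

    // Returns memory to the budget and wakes any waiting copiers.
    void release(long bytes) {
        lock.lock();
        try {
            usedBytes = Math.max(0L, usedBytes - bytes);
            memoryFreed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    long availableBytes() {
        lock.lock();
        try {
            return capacityBytes - usedBytes;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch a copier thread would call reserve(size, timeout) once it knows the map output size, read the body only if the call returns true, and call release(size) after the data has been merged or spilled. Whether waiting with the connection held open is acceptable for the sender's thread pool is a separate trade-off that this sketch does not address.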
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)