[ https://issues.apache.org/jira/browse/HADOOP-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485611 ]

Tom White commented on HADOOP-1159:
-----------------------------------

> We saw the NPE coming from the call to mapOutputIn.read() in the
> MapOutputServlet.doGet method in TaskTracker.java. Hairong said
> that HADOOP-1123 should fix the NPE problem in the read method, but I am
> not sure, since it is not possible to reproduce this problem consistently.

I'm inclined not to apply this patch, and instead to see whether HADOOP-1123
fixes the problem. If it doesn't, then hopefully we can get another case that
better characterizes the problem and allows us to produce a better fix. Unless
there's a strong objection, I would leave this issue open and not mark it as
fixed in 0.12.3.

(I also wonder whether the logging is hindering the diagnosis of this bug,
since there is no stack trace for the NPE.)
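
To illustrate the logging point, here is a minimal Java sketch. It is not code
from TaskTracker or Jetty; the class name StackTraceLogging and the helpers
simulatedRead, messageOnly and withStackTrace are hypothetical names made up
for this example. It only shows how much more a log line tells us when the
stack trace is rendered along with the exception, compared with the bare
"java.lang.NullPointerException" we see in the log for this issue.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical illustration only; none of these names exist in Hadoop.
class StackTraceLogging {

    // Stand-in for a read call that fails with an NPE (assumption: the real
    // failure site is unknown, which is exactly the diagnostic problem).
    static void simulatedRead() {
        throw new NullPointerException();
    }

    // Renders only the exception's toString(). For an NPE constructed
    // without a message this is just the class name.
    static String messageOnly(String context, Throwable t) {
        return context + ": " + t;
    }

    // Renders the context followed by the full stack trace, so the
    // throwing frame is visible in the log.
    static String withStackTrace(String context, Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        out.print(context + ": ");
        t.printStackTrace(out); // prints toString() and then one line per frame
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        String context = "/mapOutput?map=task_0002_m_022488_0&reduce=1542";
        try {
            simulatedRead();
        } catch (NullPointerException e) {
            System.out.println(messageOnly(context, e));
            System.out.println(withStackTrace(context, e));
        }
    }
}
```

With the second form, the log would name the frame that threw the NPE, which
is the piece of information we are currently missing.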

> Reducers hang when map output file has a checksum error
> -------------------------------------------------------
>
>                 Key: HADOOP-1159
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1159
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.2
>            Reporter: Nigel Daley
>         Assigned To: Owen O'Malley
>             Fix For: 0.12.3
>
>         Attachments: 1159-merge.patch, 1159.patch, h1159-2.patch, h1159.patch
>
>
> Two reduces hung in our sort benchmark. They always fail to get map outputs
> from node X due to a checksum error when the map outputs are read at that
> node, resulting in a NullPointerException on node X. This leads to constant
> failures on the two fetching reduces.
> 2007-03-26 00:02:57,082 WARN org.apache.hadoop.fs.FileSystem: Moving bad file /e/c/k/hqa/tb/tmp/mapred/local2/task_0002_m_022488_0/file.out to /e/c/bad_files/file.out.542279301
> 2007-03-26 00:02:57,083 INFO org.apache.hadoop.fs.FSInputChecker: Found checksum error: org.apache.hadoop.fs.ChecksumException: Checksum error: /e/c/k/hqa/tb/tmp/mapred/local2/task_0002_m_022488_0/file.out at 106484224
>       at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.verifySum(ChecksumFileSystem.java:254)
>       at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:211)
>       at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
>       at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
>       at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>       at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>       at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>       at java.io.DataInputStream.read(DataInputStream.java:132)
>       at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1659)
>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>       at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>       at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>       at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>       at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>       at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>       at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>       at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>       at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>       at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>       at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>       at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>       at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>       at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
> 2007-03-26 00:02:57,083 WARN /: /mapOutput?map=task_0002_m_022488_0&reduce=1542: java.lang.NullPointerException

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
