checksum exceptions on trunk
----------------------------
Key: HADOOP-2893
URL: https://issues.apache.org/jira/browse/HADOOP-2893
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Affects Versions: 0.17.0
Reporter: lohit vijayarenu
While running jobs like Sort/WordCount on trunk, I see a few task failures with
ChecksumException.
Re-running the tasks on different nodes succeeds.
Here is the stack trace:
{noformat}
Map output lost, rescheduling:
getMapOutput(task_200802251721_0004_m_000237_0,29) failed :
org.apache.hadoop.fs.ChecksumException: Checksum error: /tmps/4/gs203240-29657-6751459769688273/mapred-tt/mapred-local/task_200802251721_0004_m_000237_0/file.out at 2085376
at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:276)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
at java.io.DataInputStream.read(DataInputStream.java:132)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2299)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
at org.mortbay.http.HttpServer.service(HttpServer.java:954)
at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
{noformat}
Another stack trace:
{noformat}
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /tmps/4/gs203240-29657-6751459769688273/mapred-tt/mapred-local/task_200802251721_0004_r_000110_0/map_367.out at 21884416
at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:276)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:1930)
at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2958)
at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.next(SequenceFile.java:2716)
at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.getNext(ReduceTask.java:209)
at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:177)
... 5 more
{noformat}
Both failures involve local files.
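
For anyone reading the traces: both reported offsets (2085376 and 21884416) are exact multiples of 512, which would be consistent with a checksum being verified per fixed-size chunk and the error being reported at the start of the bad chunk. The sketch below is only an illustration of that general technique using java.util.zip.CRC32. It is not the FSInputChecker code, and the class name, method names, and the 512-byte chunk size are all assumptions made for the example.

```java
import java.util.zip.CRC32;

// Illustrative sketch only; not Hadoop's FSInputChecker implementation.
class ChunkedChecksumSketch {
    // Assumed chunk size for the example; not stated in this report.
    static final int BYTES_PER_CHECKSUM = 512;

    /** Computes one CRC32 value per fixed-size chunk (the last chunk may be short). */
    static long[] computeChecksums(byte[] data, int chunkSize) {
        int chunks = (data.length + chunkSize - 1) / chunkSize;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int start = i * chunkSize;
            int len = Math.min(chunkSize, data.length - start);
            crc.reset();
            crc.update(data, start, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns the byte offset of the first chunk whose CRC32 does not match, or -1 if all match. */
    static long firstBadChunkOffset(byte[] data, long[] expected, int chunkSize) {
        long[] actual = computeChecksums(data, chunkSize);
        for (int i = 0; i < actual.length; i++) {
            if (i >= expected.length || actual[i] != expected[i]) {
                return (long) i * chunkSize;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = new byte[4 * BYTES_PER_CHECKSUM];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        // Checksums recorded at write time.
        long[] sums = computeChecksums(data, BYTES_PER_CHECKSUM);
        // Simulate on-disk corruption: flip one bit inside the third chunk.
        data[2 * BYTES_PER_CHECKSUM + 7] ^= 0x01;
        System.out.println("first bad chunk at " + firstBadChunkOffset(data, sums, BYTES_PER_CHECKSUM));
    }
}
```

With a scheme like this, a single flipped bit anywhere in a chunk is reported at the chunk's starting offset, so the offset in the exception narrows the damage to one chunk rather than to an exact byte.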