The reason I filed this bug is that I believe one of the following guys
        at org.apache.hadoop.streaming.PipeMapper.map
        at org.apache.hadoop.mapred.MapRunner.run
should catch the exception and explain to the user what it thinks has happened -- e.g. show how many records have been buffered but not consumed, what the first / last record was, etc.

Throwing an exception is rude.
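
To make the request concrete, here is a rough sketch of the kind of reporting I mean. It is not Hadoop code: DiagnosticPipeWriter and everything in it are hypothetical names made up for illustration, and it only counts records handed to the pipe, since the writing side alone cannot tell how many of them the tool has actually consumed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, NOT part of Hadoop: wraps the stream that feeds the
// external tool and remembers enough to explain a failed write.
class DiagnosticPipeWriter {
    private final OutputStream toTool;
    private long recordsWritten = 0;
    private String firstRecord = null;
    private String lastRecord = null;

    DiagnosticPipeWriter(OutputStream toTool) {
        this.toTool = toTool;
    }

    // Writes one newline-terminated record; on failure, rethrows with context.
    void writeRecord(String record) throws IOException {
        try {
            toTool.write((record + "\n").getBytes(StandardCharsets.UTF_8));
            toTool.flush();
        } catch (IOException e) {
            throw new IOException(
                "Could not pipe record #" + (recordsWritten + 1) + " to the tool ("
                + e.getMessage() + "); " + recordsWritten
                + " record(s) were written before the failure, first=|" + firstRecord
                + "| last=|" + lastRecord + "| failed=|" + record + "|", e);
        }
        if (firstRecord == null) {
            firstRecord = record;
        }
        lastRecord = record;
        recordsWritten++;
    }

    long getRecordsWritten() {
        return recordsWritten;
    }

    public static void main(String[] args) {
        // Stand-in for a tool whose stdin pipe breaks after four bytes.
        OutputStream flaky = new OutputStream() {
            private int bytesSeen = 0;

            @Override
            public void write(int b) throws IOException {
                if (++bytesSeen > 4) {
                    throw new IOException("Broken pipe");
                }
            }
        };
        DiagnosticPipeWriter writer = new DiagnosticPipeWriter(flaky);
        try {
            for (String rec : new String[] {"a", "b", "c"}) {
                writer.writeRecord(rec);
            }
        } catch (IOException e) {
            // The message now names the failing record and the ones before it.
            System.err.println(e.getMessage());
        }
    }
}
```

The original exception is kept as the cause, so the full stack trace is still available to whoever wants it.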

On Jan 24, 2008, at 10:03 AM, Runping Qi (JIRA) wrote:


[ https://issues.apache.org/jira/browse/HADOOP-2438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12562157#action_12562157 ]

Runping Qi commented on HADOOP-2438:
------------------------------------


If the stream mapper stalled for some reason and cannot consume its std input, while the Java MapRed wrapper continues to pipe data to the mapper, then too much data may accumulate in the std input pipe.
That may cause a broken pipe or an OOM exception. What did the mapper do?





In streaming, jobs that used to work, crash in the map phase -- even if the mapper is /bin/cat
----------------------------------------------------------------------------------------------

                Key: HADOOP-2438
                URL: https://issues.apache.org/jira/browse/HADOOP-2438
            Project: Hadoop Core
         Issue Type: Bug
   Affects Versions: 0.15.1
           Reporter: arkady borkovsky

The exception is either "out of memory" or "broken pipe" -- see both stack dumps below.
last Hadoop input: |null|
last tool output: |[EMAIL PROTECTED]|
Date: Sat Dec 15 21:02:18 UTC 2007
java.io.IOException: Broken pipe
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:124)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:96)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
        at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:107)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
-------------------------------------------------
java.io.IOException: MROutput/MRErrThread failed:java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.Text.write(Text.java:243)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:347)
        at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:344)
        at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:76)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.