[
https://issues.apache.org/jira/browse/PIG-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561883#action_12561883
]
Benjamin Reed commented on PIG-71:
----------------------------------
Does Hadoop really expect every InputFormat implementation to filter these
files? Wouldn't it be better to stage in-progress files in a temp directory and
move them to the final directory only when they are finished? It's hard to
believe that Hadoop wants every InputFormat to work around the problem rather
than fixing it once inside Hadoop.
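To make the stage-then-move suggestion concrete, here is a minimal sketch using plain java.nio.file rather than Hadoop's FileSystem API. The class name StagedWriter, the method writeThenPublish, and the directory names are hypothetical, and this is only an illustration of the idea, not how Hadoop handles task output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: writes a file in a staging directory and only
// moves it into the final directory once the write has completed.
final class StagedWriter {

    static Path writeThenPublish(Path stagingDir, Path finalDir,
                                 String fileName, String content) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(finalDir);

        // Readers of finalDir never see this file while it is being written.
        Path staged = stagingDir.resolve(fileName);
        Files.write(staged, content.getBytes(StandardCharsets.UTF_8));

        // Assumption: both directories are on the same file system, so the
        // move can be atomic. Otherwise this throws AtomicMoveNotSupportedException.
        Path published = finalDir.resolve(fileName);
        return Files.move(staged, published, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("staged-demo");
        Path out = writeThenPublish(root.resolve("_staging"), root.resolve("final"),
                                    "part-00000", "done\n");
        System.out.println("published: " + out);
    }
}
```

With this layout a reader that lists only the final directory would see complete files or nothing, so no per-reader filtering would be needed.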
> Support for Hadoop Speculative Execution
> ----------------------------------------
>
> Key: PIG-71
> URL: https://issues.apache.org/jira/browse/PIG-71
> Project: Pig
> Issue Type: New Feature
> Environment: Hadoop
> Reporter: Amir Youssefi
> Priority: Minor
>
> If Speculative Execution is used in Hadoop while creating a data set, Pig
> scripts that load this data set may fail. The cause is the temp directories
> generated in the process.
> Pig can filter out these temp directories, which solves the problem. Here is
> a sample error:
> [main] ERROR org.apache.pig - Error message from task (map) tip_..._0001_m_002735 java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>     at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
>     at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>     at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1524)
>     at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1590)
>     at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1626)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1712)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
>     ...
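For comparison, the filtering approach proposed in the issue description could look roughly like the sketch below. It assumes in-progress output sits under path components whose names start with "_" or "."; that naming rule, the class name TempPathFilter, and its methods are assumptions made for illustration and are not taken from the issue or from Pig's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical filter: drops any path that passes through a directory or
// file whose name marks it as temporary or hidden.
final class TempPathFilter {

    // Assumption: temporary/hidden entries start with "_" or ".".
    static boolean isVisible(String path) {
        for (String component : path.split("/")) {
            // Keep ordinary relative-path components.
            if (component.equals(".") || component.equals("..")) {
                continue;
            }
            if (component.startsWith("_") || component.startsWith(".")) {
                return false;
            }
        }
        return true;
    }

    // Returns only the paths a loader should read.
    static List<String> visibleOnly(List<String> paths) {
        return paths.stream()
                    .filter(TempPathFilter::isVisible)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
            "/data/out/part-00000",
            "/data/out/_task_tmp/part-00001");
        System.out.println(visibleOnly(listing));
    }
}
```

Every loader that reads the data set would have to apply a filter like this, which is the duplication the comment above objects to.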