[ 
https://issues.apache.org/jira/browse/PIG-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561883#action_12561883
 ] 

Benjamin Reed commented on PIG-71:
----------------------------------

Does Hadoop really expect every InputFormat implementation to filter these 
files? Wouldn't it be better to stage in-progress files in a temp directory and 
move them to the final directory only when they are finished? It's hard to 
believe that Hadoop wants every InputFormat to work around the problem rather 
than fixing it once inside Hadoop.
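
To make the staging idea concrete, here is a minimal sketch of "write in a temp 
directory, move when finished". It uses plain java.nio on a local filesystem 
rather than Hadoop's own FileSystem API, and the class name StagedOutput, the 
method writeStaged, and the directory layout are all made up for illustration; 
nothing here is taken from Hadoop or Pig.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: not part of Hadoop or Pig.
class StagedOutput {

    /**
     * Writes data to stagingDir/fileName first, then moves the finished
     * file into finalDir. Readers that only look at finalDir never see a
     * partially written file.
     */
    static Path writeStaged(Path stagingDir, Path finalDir,
                            String fileName, byte[] data) throws IOException {
        Files.createDirectories(stagingDir);
        Files.createDirectories(finalDir);

        // Step 1: the in-progress file lives outside the final directory.
        Path inProgress = stagingDir.resolve(fileName);
        Files.write(inProgress, data);

        // Step 2: publish the finished file with a single move.
        // ATOMIC_MOVE assumes both directories are on the same filesystem.
        Path target = finalDir.resolve(fileName);
        return Files.move(inProgress, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("job-output");
        Path done = writeStaged(base.resolve("staging"), base.resolve("final"),
                "part-00000", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("published: " + done);
    }
}
```

With this layout a loader that scans only the final directory needs no 
filtering logic, which is the point of fixing it in one place.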

> Support for Hadoop Speculative Execution
> ----------------------------------------
>
>                 Key: PIG-71
>                 URL: https://issues.apache.org/jira/browse/PIG-71
>             Project: Pig
>          Issue Type: New Feature
>         Environment: Hadoop
>            Reporter: Amir Youssefi
>            Priority: Minor
>
> If Speculative Execution is used in Hadoop while creating a data-set, then Pig 
> scripts loading this data-set may fail. The reason is the temp directories 
> generated in the process. 
> Pig can filter out these temp directories, which solves the problem. Here is 
> a sample error:
> [main] ERROR org.apache.pig - Error message from task (map) tip_..._0001_m_002735 java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>         at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1524)
>         at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1590)
>         at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1626)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1712)
>         at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
>         ...
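
For comparison, the loader-side filtering proposed in the issue description 
could look roughly like the sketch below. It is written against plain java.nio 
rather than Pig's or Hadoop's APIs, the class name TempDirFilter and its 
methods are hypothetical, and the rule that temp entries start with "_" or "." 
is an assumption, not something stated in the issue.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: not part of Pig or Hadoop.
class TempDirFilter {

    // Assumption: temp entries are recognizable by a leading "_" or ".".
    static boolean isVisible(Path p) {
        String name = p.getFileName().toString();
        return !name.startsWith("_") && !name.startsWith(".");
    }

    /** Lists the entries of dataSetDir, skipping anything that looks temporary. */
    static List<Path> listInputs(Path dataSetDir) throws IOException {
        List<Path> inputs = new ArrayList<>();
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(dataSetDir, TempDirFilter::isVisible)) {
            for (Path entry : entries) {
                inputs.add(entry);
            }
        }
        Collections.sort(inputs); // stable order for callers
        return inputs;
    }
}
```

Every loader would need its own copy of a check like isVisible, which is the 
duplication the comment above objects to.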

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
