Jason Lowe created TEZ-3912:

             Summary: Fetchers should be more robust to corrupted inputs
                 Key: TEZ-3912
                 URL: https://issues.apache.org/jira/browse/TEZ-3912
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jason Lowe

I recently saw a case where a bad node in the cluster produced corrupted 
shuffle data that caused the codec to throw IllegalArgumentException when 
trying to fetch.  Fetchers currently only handle IOException and InternalError, 
and any other type of exception causes the entire task to be torn down.  We 
should consider catching Exception, as MapReduce does, so that other kinds of 
errors coming from the codec are handled and retries can occur.
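
A rough sketch of the idea follows. It is not the actual Tez fetcher code: 
FetchRetrySketch, FetchResult and fetchWithRetries are hypothetical names 
made up for illustration, and the Callable stands in for "fetch one input and 
run it through the codec".  The point is only that an unchecked exception 
from the codec is recorded as a failed attempt and retried, the same way 
IOException and InternalError are, instead of escaping and killing the task.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical illustration only; none of these names exist in Tez.
class FetchRetrySketch {

    /** Outcome of a fetch: data is null when every attempt failed. */
    static final class FetchResult {
        final byte[] data;
        final int attempts;
        final Throwable lastError;

        FetchResult(byte[] data, int attempts, Throwable lastError) {
            this.data = data;
            this.attempts = attempts;
            this.lastError = lastError;
        }
    }

    /**
     * Runs fetchAndDecode up to maxAttempts times. Any Exception (not just
     * IOException) counts as a failed attempt and is retried.
     */
    static FetchResult fetchWithRetries(Callable<byte[]> fetchAndDecode, int maxAttempts) {
        Throwable last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return new FetchResult(fetchAndDecode.call(), attempt, null);
            } catch (IOException | InternalError e) {
                // The two failure types the issue says are handled today.
                last = e;
            } catch (Exception e) {
                // Proposed addition: e.g. IllegalArgumentException thrown by
                // a codec that was handed corrupted shuffle data.
                last = e;
            }
        }
        return new FetchResult(null, maxAttempts, last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated source: first read is corrupt, second read is clean.
        FetchResult r = fetchWithRetries(() -> {
            if (calls[0]++ == 0) {
                throw new IllegalArgumentException("corrupted shuffle data");
            }
            return new byte[] {1, 2, 3};
        }, 3);
        System.out.println("attempts=" + r.attempts + " ok=" + (r.data != null));
    }
}
```

A real change would presumably report the failed input back through the 
existing fetch-failure path so it can be retried or re-generated, rather than 
looping locally as this sketch does.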

This message was sent by Atlassian JIRA