Rohini Palaniswamy created TEZ-2192:
---------------------------------------
Summary: Relocalization does not check for source
Key: TEZ-2192
URL: https://issues.apache.org/jira/browse/TEZ-2192
Project: Apache Tez
Issue Type: Bug
Reporter: Rohini Palaniswamy
PIG-4443 spills the input splits to disk if the serialized split size is greater
than some threshold. It runs into problems with relocalization when more than one
vertex has a job.split file. If a job.split file is already present in a reused
container, that file is reused, causing the wrong data to be read.
We either need a way to turn off relocalization, or relocalization needs to check
the source and timestamp and re-download the file when they differ.
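
The second option could look roughly like the sketch below. It is a minimal,
language-neutral illustration in Python, not Tez code: Resource, LocalCache,
localize and the download callback are hypothetical names made up for this
example. The idea is only that a file already present in the container is
reused when its name, source and timestamp all match, and is downloaded again
otherwise.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of a file to localize (not a Tez class)."""
    name: str       # local file name, e.g. "job.split"
    source: str     # where the file comes from, e.g. an HDFS URI
    timestamp: int  # modification time of the source file


class LocalCache:
    """Remembers which source and timestamp each local file name came from."""

    def __init__(self, download: Callable[[Resource], bytes]):
        self._download = download               # assumed fetch function
        self._origin: Dict[str, Resource] = {}  # name -> resource it came from
        self._content: Dict[str, bytes] = {}    # name -> localized bytes

    def localize(self, wanted: Resource) -> bytes:
        cached = self._origin.get(wanted.name)
        # Reuse only when name, source and timestamp all match. A matching
        # name alone is not enough, which is the bug described above.
        if cached != wanted:
            self._content[wanted.name] = self._download(wanted)
            self._origin[wanted.name] = wanted
        return self._content[wanted.name]
```

With this check, two vertices that both ship a job.split from different
sources each get their own copy on container reuse, while a vertex asking for
the same source and timestamp again does not trigger a second download.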