Ryan Blue created SQOOP-2001:
--------------------------------

             Summary: Sqoop2 Kite connector might produce duplicate values when retrying failed tasks.
                 Key: SQOOP-2001
                 URL: https://issues.apache.org/jira/browse/SQOOP-2001
             Project: Sqoop
          Issue Type: Bug
            Reporter: Ryan Blue


This happens (as I understand things) because Kite may make files visible 
in the combined temporary dataset directory before a task is completed or 
committed, so a retried task can write the same records a second time. We 
should be able to avoid this by setting the temporary dataset's writer cache 
limit to something huge, so that files are not closed until the overall 
writer is closed, no matter how many files are open.
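
To illustrate the mechanism, here is a toy Python model of a writer that keeps 
a bounded number of files open. It is a sketch, not Kite's code: the names 
ToyWriterCache, run_task and leaked_records are made up for this example, and 
it assumes that the least recently used file is the one closed when the limit 
is hit and that closing a file is what makes it visible.

```python
import sys
from collections import OrderedDict


class ToyWriterCache:
    """Toy model of a partitioned writer with a limit on open files.

    Hypothetical illustration only; this is not Kite's implementation.
    """

    def __init__(self, limit):
        self.limit = limit
        self.open_files = OrderedDict()  # partition -> records buffered in an open file
        self.visible = []                # closed files, i.e. files a reader can see

    def write(self, partition, record):
        if partition in self.open_files:
            # Mark this partition's file as most recently used.
            self.open_files.move_to_end(partition)
        else:
            if len(self.open_files) >= self.limit:
                # Assumption: the least recently used file is closed,
                # and closing it makes it visible.
                _, records = self.open_files.popitem(last=False)
                self.visible.append(records)
            self.open_files[partition] = []
        self.open_files[partition].append(record)

    def close(self):
        # Closing the overall writer closes every file that is still open.
        while self.open_files:
            _, records = self.open_files.popitem(last=False)
            self.visible.append(records)


def run_task(cache, records, fail_after=None):
    """Write records, optionally failing before the record at index fail_after."""
    for i, (partition, record) in enumerate(records):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated task failure")
        cache.write(partition, record)
    cache.close()


def leaked_records(limit):
    """Return the records already visible when a task fails before closing."""
    cache = ToyWriterCache(limit)
    records = [("a", 1), ("b", 2), ("c", 3), ("a", 4)]
    try:
        run_task(cache, records, fail_after=3)
    except RuntimeError:
        pass
    return [r for closed_file in cache.visible for r in closed_file]


print(leaked_records(2))            # [1]
print(leaked_records(sys.maxsize))  # []
```

With a limit of 2, record 1 is already visible when the task fails, so a retry 
would write it again. With an effectively unlimited cache nothing is visible 
until the overall writer is closed.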



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
