Ryan Blue created SQOOP-2001:
--------------------------------
Summary: Sqoop2 Kite connector might produce duplicate values when
retrying failed tasks.
Key: SQOOP-2001
URL: https://issues.apache.org/jira/browse/SQOOP-2001
Project: Sqoop
Issue Type: Bug
Reporter: Ryan Blue
This happens (as I understand things) because Kite may make files in the
combined temporary dataset directory visible before the task that wrote them
has completed or been committed. If that task then fails and is retried, the
records in those files are written a second time. We should be able to avoid
this by setting the temporary dataset's writer cache limit to something huge,
so that files are not closed until the overall writer is closed, no matter how
many files are open.
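
To illustrate the suspected mechanism, here is a toy model in Python. It does
not use the Kite API: ToyPartitionedWriter, run_with_retry, cache_limit and
fail_after are all hypothetical names, and the least-recently-used eviction
policy is an assumption made for the sketch. It only shows how closing files
early, when the number of open files exceeds the limit, could leave output from
a failed attempt visible next to the output of the retry.

```python
from collections import OrderedDict


class ToyPartitionedWriter:
    """Toy model (not Kite): one open file per partition, evicted when over the limit."""

    def __init__(self, visible_files, cache_limit):
        self.visible_files = visible_files  # stands in for the shared temporary directory
        self.cache_limit = cache_limit      # assumed to be >= 1
        self.open_files = OrderedDict()     # partition -> records buffered in the open file

    def write(self, partition, record):
        if partition in self.open_files:
            self.open_files.move_to_end(partition)
        else:
            self.open_files[partition] = []
            if len(self.open_files) > self.cache_limit:
                # Assumption: evicting closes the oldest file, which makes it visible early.
                _, records = self.open_files.popitem(last=False)
                self.visible_files.append(records)
        self.open_files[partition].append(record)

    def close(self):
        # A normal close makes every remaining open file visible.
        for records in self.open_files.values():
            self.visible_files.append(records)
        self.open_files.clear()


def run_with_retry(records, cache_limit, fail_after):
    """Simulate a task attempt that dies after `fail_after` records, then a full retry."""
    visible = []

    # Failed attempt: never reaches close(), so only evicted files become visible.
    failed = ToyPartitionedWriter(visible, cache_limit)
    for partition, record in records[:fail_after]:
        failed.write(partition, record)

    # Retried attempt: writes everything and closes normally.
    retry = ToyPartitionedWriter(visible, cache_limit)
    for partition, record in records:
        retry.write(partition, record)
    retry.close()

    return sorted(r for f in visible for r in f)


records = [("a", 1), ("b", 2), ("c", 3), ("a", 4)]
small_limit_result = run_with_retry(records, cache_limit=1, fail_after=3)
huge_limit_result = run_with_retry(records, cache_limit=10**9, fail_after=3)
```

In this model a limit of 1 lets the failed attempt leak the files for
partitions "a" and "b", so their records show up twice after the retry. With a
very large limit nothing is closed before close() is called, so the failed
attempt leaves nothing visible.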