[
https://issues.apache.org/jira/browse/SQOOP-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296495#comment-14296495
]
Hudson commented on SQOOP-2055:
-------------------------------
SUCCESS: Integrated in Sqoop-ant-jdk-1.6-hadoop23 #1164 (See
[https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/1164/])
SQOOP-2055: Run only one map task attempt during export (venkat:
https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=420fc3d53f1db62710710b93b9801cff5e4d1b53)
* src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
> Run only one map task attempt during export
> -------------------------------------------
>
> Key: SQOOP-2055
> URL: https://issues.apache.org/jira/browse/SQOOP-2055
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.4.5
> Reporter: Jarek Jarcec Cecho
> Assignee: Jarek Jarcec Cecho
> Fix For: 1.4.6
>
> Attachments: SQOOP-2055.patch, SQOOP-2055.patch
>
>
> While investigating several user issues, I noticed that our [documentation
> states that an export mapper failure fails the entire
> job|http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_failed_exports]:
> {quote}
> If an export map task fails due to these or other reasons, it will cause the
> export job to fail. The results of a failed export are undefined. Each export
> map task operates in a separate transaction. Furthermore, individual map
> tasks commit their current transaction periodically. If a task fails, the
> current transaction will be rolled back. Any previously-committed
> transactions will remain durable in the database, leading to a
> partially-complete export.
> {quote}
> This is, however, not the observed behavior, as MapReduce will re-run a failed
> mapper (up to 3 times) before failing the job. This is confusing when
> investigating failures because most often one has to go to the first failed
> attempt and ignore the rest, as the later attempts usually fail on unrelated
> issues (key constraints).
> It seems that some of the connectors are smart enough to either suggest that
> the user configure MR or do it automatically
> ([PGDump|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/postgresql/PGBulkloadExportJob.java#L139],
> [OraOop|https://github.com/apache/sqoop/blob/trunk/src/docs/user/connectors.txt#L831]).
> I would like to propose applying this behavior to every export job, as that
> seems like a more reasonable default for export jobs.
> Doing this might have a side effect on more advanced connectors that make
> each mapper attempt idempotent (e.g. by using temporary tables per map
> attempt or a similar facility), in the sense that we stop re-running their
> failed attempts automatically and those connectors will have to re-enable
> this behavior on their own.
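
The retry problem described in the quoted report can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the class `ExportRetrySimulation` and its methods are hypothetical and do not use any Sqoop or Hadoop API, the "table" is an in-memory set with a primary key on the row id, and the row count, commit interval, and bad row are made-up values. It shows why only the first attempt carries the real error once earlier transactions have been committed. In Hadoop the number of attempts per map task is governed by the `mapreduce.map.maxattempts` property (default 4, formerly `mapred.map.max.attempts`), which matches the "up to 3 times" re-run mentioned above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of export retries; not Sqoop or Hadoop code.
class ExportRetrySimulation {
    // Stand-in for the target table, with a primary key on the row id.
    private final Set<Integer> committed = new HashSet<>();

    // One map task attempt: insert ids 1..rows, committing every commitEvery rows.
    // Returns null on success, or the error message that failed the attempt.
    String runAttempt(int rows, int commitEvery, int badRow) {
        List<Integer> pending = new ArrayList<>();
        for (int id = 1; id <= rows; id++) {
            if (committed.contains(id)) {
                // An earlier attempt already committed this row.
                return "duplicate key " + id;
            }
            if (id == badRow) {
                // The real problem; uncommitted pending rows are rolled back.
                return "bad record " + id;
            }
            pending.add(id);
            if (pending.size() == commitEvery) {
                committed.addAll(pending);   // periodic commit stays durable
                pending.clear();
            }
        }
        committed.addAll(pending);
        return null;
    }

    // Runs the same task up to maxAttempts times against one shared table
    // and collects the error reported by each failed attempt.
    static List<String> run(int maxAttempts) {
        ExportRetrySimulation sim = new ExportRetrySimulation();
        List<String> errors = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String error = sim.runAttempt(10, 2, 5);
            if (error == null) {
                break;
            }
            errors.add(error);
        }
        return errors;
    }

    public static void main(String[] args) {
        // prints [bad record 5, duplicate key 1, duplicate key 1, duplicate key 1]
        System.out.println(run(4));
        // prints [bad record 5]
        System.out.println(run(1));
    }
}
```

With four attempts, the root cause appears only in the first error and the remaining three are key-constraint noise. With a single attempt, the job fails immediately with the error that matters.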