[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1473:
-------------------------------------

    Attachment: MAPREDUCE-1473.patch

Attaching a patch that provides this functionality. It uses 
CombineFileInputFormat to batch Sqoop's input files into a user-defined 
number of splits.

As with imports, the degree of parallelism is controlled with the {{-m}} / 
{{--num-mappers}} parameters.
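
To illustrate the batching idea, here is a minimal sketch in Python rather than the patch's Java. It is not the code from the patch and not how CombineFileInputFormat is implemented (that class also takes maximum split sizes and data locality into account). It only shows how a set of input files could be grouped into a requested number of splits of roughly equal size. The function name {{batch_files}}, the file names, and the sizes are all hypothetical.

```python
import heapq


def batch_files(file_sizes, num_splits):
    """Group files into at most num_splits batches of roughly equal total size.

    file_sizes: dict mapping a file name to its size in bytes.
    Returns a list of lists of file names; empty batches are dropped.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    # Heap entries are (total_bytes, batch_index), so the lightest batch pops first.
    heap = [(0, i) for i in range(num_splits)]
    batches = [[] for _ in range(num_splits)]
    # Place the largest files first; each goes into the currently lightest batch.
    for name, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        batches[idx].append(name)
        heapq.heappush(heap, (total + size, idx))
    return [b for b in batches if b]


# Hypothetical export directory: six part files, exported with four mappers.
files = {
    "part-00000": 400, "part-00001": 300, "part-00002": 300,
    "part-00003": 200, "part-00004": 100, "part-00005": 100,
}
splits = batch_files(files, 4)
for i, names in enumerate(splits):
    print(i, names)
```

With this kind of grouping the number of parallel exporters follows the requested split count rather than the number of input files, and it never exceeds the number of files available.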

> Sqoop should allow users to control export parallelism
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-1473
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1473
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/sqoop
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1473.patch
>
>
> Sqoop uses MapReduce jobs to export files back to a table in the database. 
> The degree of parallelism is controlled by the number of splits; i.e., the 
> number of input files used. The bottleneck in the system, though, is likely 
> to be the database itself.
> Users should have the ability to tune the number of parallel exporters being 
> used to a degree appropriate to their database deployment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
