[ 
https://issues.apache.org/jira/browse/SQOOP-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Mercier updated SQOOP-3161:
----------------------------------
    Description: 
Hello everyone, 

Improvement of --split-by :

To import a large table I parallelise the import with multiple mappers.
But when the splitting column is not evenly distributed, and there is no other 
column to use, the load is handled by only a few mappers.

Is there a way to distribute the load equally between the mappers, so that 
each one gets the same number of rows (rather than splitting on the min and 
max of a column, which causes the uneven distribution)?

Thank you.



  was:
Hello everyone, 

Improvement of --split-by :

To import a large table I parallelise the import with multiple mappers.
But when the splitting column is not evenly distributed, and there is no other 
column to use, the load is handled by only a few mappers.

Is there a way to distribute the load equally between the mappers, 
using the actual number of rows (not the min and max of a column)?

Thank you.




> Import : Controlling Parallelism with Splitting column not evenly distributed
> -----------------------------------------------------------------------------
>
>                 Key: SQOOP-3161
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3161
>             Project: Sqoop
>          Issue Type: Improvement
>    Affects Versions: 1.4.6
>            Reporter: Romain Mercier
>            Priority: Minor
>              Labels: performance
>
> Hello everyone, 
> Improvement of --split-by :
> To import a large table I parallelise the import with multiple mappers.
> But when the splitting column is not evenly distributed, and there is no 
> other column to use, the load is handled by only a few mappers.
> Is there a way to distribute the load equally between the mappers, so that 
> each one gets the same number of rows (rather than splitting on the min and 
> max of a column, which causes the uneven distribution)?
> Thank you.
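
To illustrate the skew described in the issue, here is a small Python sketch
(not Sqoop code). It assumes the splitter cuts the [min, max] range of the
--split-by column into equal-width intervals, one per mapper, and compares that
with boundaries taken from the sorted values so that each mapper gets about the
same number of rows. The function names and the sample ids are hypothetical and
exist only for this illustration.

```python
def range_split_counts(values, num_mappers):
    """Rows per mapper when [min, max] is cut into equal-width intervals.

    Assumption: this mimics a min/max-based splitter, not Sqoop's actual code.
    """
    lo, hi = min(values), max(values)
    counts = [0] * num_mappers
    for v in values:
        if hi == lo:
            idx = 0
        else:
            # Integer arithmetic avoids floating-point boundary effects.
            idx = min((v - lo) * num_mappers // (hi - lo), num_mappers - 1)
        counts[idx] += 1
    return counts


def equal_count_splits(values, num_mappers):
    """Split the sorted values into chunks of (almost) equal row count.

    Returns one (first_value, last_value, row_count) tuple per non-empty chunk.
    """
    ordered = sorted(values)
    base, extra = divmod(len(ordered), num_mappers)
    splits = []
    start = 0
    for i in range(num_mappers):
        size = base + (1 if i < extra else 0)
        chunk = ordered[start:start + size]
        if chunk:
            splits.append((chunk[0], chunk[-1], len(chunk)))
        start += size
    return splits


# Hypothetical skewed key: 96 dense ids plus 4 far-away outliers.
sample_ids = list(range(1, 97)) + [1_000_000, 2_000_000, 3_000_000, 4_000_000]

print(range_split_counts(sample_ids, 4))
print(equal_count_splits(sample_ids, 4))
```

With this sample, the equal-width split puts almost every row on the first
mapper, while the row-count-based split gives each mapper 25 rows. Computing
such boundaries needs the values in sorted order (or an approximation of their
quantiles), which is an extra cost the min/max approach does not have.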



