Cheolsoo Park created SQOOP-513:
-----------------------------------
Summary: Provide a way to override the default splitter
Key: SQOOP-513
URL: https://issues.apache.org/jira/browse/SQOOP-513
Project: Sqoop
Issue Type: Improvement
Affects Versions: 1.4.1-incubating
Reporter: Cheolsoo Park
When the number of mappers is greater than 1, Sqoop divides the rows among
mappers using simple range queries such as:
{code}
select x, y from foo where x > 10 and x <= 20
{code}
The width of each range is computed simply as (max - min) / number of mappers.
This works fine if the values of the split-by column are evenly distributed;
however, it works poorly when the distribution is skewed, because some mappers
then receive far more rows than others.
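To make the range computation concrete, here is a minimal sketch of equal-width splitting. It is not Sqoop's actual code: the class name `EvenRangeSplits`, the method `ranges`, and the inclusive lower bound on the first range are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: names here are hypothetical, not Sqoop's API.
class EvenRangeSplits {
    /** Returns one WHERE-clause fragment per mapper, using equal-width ranges over [min, max]. */
    static List<String> ranges(String col, long min, long max, int numMappers) {
        List<String> out = new ArrayList<>();
        // Width of each range: (max - min) / number of mappers.
        double width = (double) (max - min) / numMappers;
        for (int i = 0; i < numMappers; i++) {
            long lo = min + Math.round(i * width);
            // Pin the last upper bound to max so rounding cannot drop rows.
            long hi = (i == numMappers - 1) ? max : min + Math.round((i + 1) * width);
            // Assumption: the first range includes its lower bound; the rest are (lo, hi].
            String lowerOp = (i == 0) ? ">=" : ">";
            out.add(col + " " + lowerOp + " " + lo + " AND " + col + " <= " + hi);
        }
        return out;
    }
}
```

For example, `EvenRangeSplits.ranges("x", 0, 100, 4)` yields ranges ending at 25, 50, 75 and 100. If most rows have x <= 25, the first mapper does most of the work while the others sit nearly idle.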
The proposal is to provide a way for the user to override the default
splitter. For example, the user should be able to write their own splitter
class, pass the class name via a command-line option, and have Sqoop use that
splitter at runtime instead of the default one.
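One possible shape for this, sketched under assumptions: the interface `RangeSplitter`, the two implementations, and `SplitterLoader` below are hypothetical names and do not reflect Sqoop's existing splitter API. The class name is assumed to arrive as a plain string from whatever command-line option is eventually chosen.

```java
// Hypothetical plug-in interface; Sqoop's real splitter API may look different.
interface RangeSplitter {
    /** Returns numMappers + 1 boundaries; mapper i reads values between b[i] and b[i+1]. */
    long[] boundaries(long[] sortedSample, int numMappers);
}

// Mirrors the default behaviour: equal-width ranges between min and max.
class EvenSplitter implements RangeSplitter {
    public long[] boundaries(long[] sortedSample, int numMappers) {
        long min = sortedSample[0];
        long max = sortedSample[sortedSample.length - 1];
        long[] b = new long[numMappers + 1];
        for (int i = 0; i <= numMappers; i++) {
            b[i] = min + (long) i * (max - min) / numMappers;
        }
        return b;
    }
}

// Example user-supplied splitter: boundaries at sample quantiles, so each
// range holds roughly the same number of sampled rows even when data is skewed.
class QuantileSplitter implements RangeSplitter {
    public long[] boundaries(long[] sortedSample, int numMappers) {
        long[] b = new long[numMappers + 1];
        for (int i = 0; i <= numMappers; i++) {
            int idx = (int) ((long) i * (sortedSample.length - 1) / numMappers);
            b[i] = sortedSample[idx];
        }
        return b;
    }
}

class SplitterLoader {
    /** Loads the named splitter by reflection; falls back to the default when no name is given. */
    static RangeSplitter load(String className) throws ReflectiveOperationException {
        if (className == null || className.isEmpty()) {
            return new EvenSplitter();
        }
        return Class.forName(className)
                .asSubclass(RangeSplitter.class)
                .getDeclaredConstructor()
                .newInstance();
    }
}
```

With a sorted sample of 1 through 8 plus an outlier of 1000 and two mappers, the even splitter puts the middle boundary at 500, so one range holds eight of the nine sampled values. The quantile splitter puts it at 5, which balances the two ranges far better. Heavily duplicated values can still produce repeated boundaries, so a real implementation would need to handle that case.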