[ https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456952#comment-13456952 ]
Jarek Jarcec Cecho edited comment on SQOOP-604 at 9/17/12 11:01 PM:
--------------------------------------------------------------------

Hi Zoltan,
you've taken an interesting approach. Would you mind posting your patch to our review board (https://reviews.apache.org)?

Jarcec

> Easy throttling feature for MySQL exports
> -----------------------------------------
>
>                 Key: SQOOP-604
>                 URL: https://issues.apache.org/jira/browse/SQOOP-604
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/mysql
>    Affects Versions: 1.4.3
>            Reporter: Zoltan Toth-Czifra
>            Priority: Minor
>             Fix For: 1.4.3
>
>
> Sqoop always tries to achieve the best possible throughput with exports,
> which might not be desirable in all cases. Sometimes we need to export large
> data sets with Sqoop to a live relational database (MySQL in our case), that
> is, a database that is under high load serving random queries from the users
> of our product.
> While data consistency issues during the export can be easily solved with a
> staging table, there is still a problem: the performance impact caused by the
> heavy export.
> First off, the resources of MySQL dedicated to the import process can affect
> the performance of the live product, both on the master and on the slaves.
> Second, even if the servers can handle the import with no significant
> performance impact (mysqlimport should be relatively "cheap"), importing big
> tables (GB+) can cause serious replication lag in the cluster, risking data
> consistency.
> My suggestion is quite simple: extend the already existing "checkpoint"
> feature of the MySQL exports (the export process is restarted every X bytes
> written) with a new config value that would simply make the thread sleep for
> Y milliseconds at the checkpoints.
> With a low enough byte count limit, this can be a simple yet powerful
> throttling mechanism.
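
The checkpoint-and-sleep idea described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not Sqoop's code and not the submitted patch: the class ThrottledOutputStream and the parameters checkpointBytes and sleepMillis are hypothetical names, and the sketch only counts bytes and pauses; it does not restart any export process at a checkpoint.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;

/**
 * Hypothetical sketch: wraps an output stream and sleeps each time a
 * byte-count checkpoint is reached. None of these names come from Sqoop.
 */
class ThrottledOutputStream extends OutputStream {
    private final OutputStream out;
    private final long checkpointBytes; // bytes written between checkpoints
    private final long sleepMillis;     // pause length at each checkpoint
    private long bytesSinceCheckpoint = 0;
    private int checkpointsHit = 0;

    public ThrottledOutputStream(OutputStream out, long checkpointBytes, long sleepMillis) {
        if (checkpointBytes <= 0 || sleepMillis < 0) {
            throw new IllegalArgumentException(
                "checkpointBytes must be > 0 and sleepMillis must be >= 0");
        }
        this.out = out;
        this.checkpointBytes = checkpointBytes;
        this.sleepMillis = sleepMillis;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        recordBytes(1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        out.write(buf, off, len);
        recordBytes(len);
    }

    @Override
    public void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    /** Adds to the running byte count and pauses once the limit is reached. */
    private void recordBytes(long n) throws IOException {
        bytesSinceCheckpoint += n;
        if (bytesSinceCheckpoint >= checkpointBytes) {
            bytesSinceCheckpoint = 0;
            checkpointsHit++;
            pause();
        }
    }

    /** Sleeps for the configured time; a value of 0 disables throttling. */
    private void pause() throws IOException {
        if (sleepMillis == 0) {
            return;
        }
        try {
            Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new InterruptedIOException("interrupted while throttling");
        }
    }

    public int getCheckpointsHit() {
        return checkpointsHit;
    }
}
```

In this sketch, wrapping a stream with checkpointBytes = 10 and writing 25 single bytes pauses twice, after the 10th and the 20th byte. The average throughput is then bounded by roughly checkpointBytes per sleepMillis, which is why a small byte limit gives finer-grained throttling.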