[ https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489957#comment-13489957 ]
Hudson commented on SQOOP-604:
------------------------------

Integrated in Sqoop-ant-jdk-1.6-hadoop20 #262 (See [https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/262/])
    SQOOP-604: Easy throttling feature for MySQL exports (Revision c499f49097ebf04f9fac34f1df768a319e679cea)

     Result = SUCCESS
abhijeet : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
Files :
* src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java

> Easy throttling feature for MySQL exports
> -----------------------------------------
>
>                 Key: SQOOP-604
>                 URL: https://issues.apache.org/jira/browse/SQOOP-604
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/mysql
>    Affects Versions: 1.4.2
>            Reporter: Zoltan Toth-Czifra
>            Priority: Minor
>             Fix For: 1.4.3
>
>         Attachments: SQOOP-604_v6.patch
>
>
> Sqoop always tries to achieve the best possible throughput with exports,
> which might not be desirable in all cases. Sometimes we need to export large
> data with Sqoop to a live relational database (MySQL in our case), that is, a
> database that is under a high load serving random queries from the users of
> our product.
> While data consistency issues during the export can be easily solved with a
> staging table, there is still a problem: the performance impact caused by the
> heavy export.
> First off, the resources of MySQL dedicated to the import process can affect
> the performance of the live product, both on the master and on the slaves.
> Second, even if the servers can handle the import with no significant
> performance impact (mysqlimport should be relatively "cheap"), importing big
> tables (GB+) can cause serious replication lag in the cluster risking data
> consistency.
> My suggestion is quite simple.
> Using the already existing "checkpoint" feature of the MySQL exports (the
> export process is restarted every X bytes written), we could add a new
> config value that would simply make the thread sleep for a given number of
> milliseconds at the checkpoints. With a low enough byte count limit, this
> can be a simple yet powerful throttling mechanism.
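
The suggestion above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the attached patch or from MySQLExportMapper.java: the class name CheckpointThrottle, its constructor parameters, and the in-memory sink are all hypothetical, and the real export restarts the mysqlimport process at a checkpoint rather than just flushing a stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical sketch (not the Sqoop implementation): counts the bytes
 * written and, every time the checkpoint threshold is reached, flushes
 * and sleeps before accepting more records.
 */
final class CheckpointThrottle {
    private final OutputStream out;
    private final long checkpointBytes; // assumed "bytes per checkpoint" setting
    private final long sleepMillis;     // assumed "sleep at checkpoint" setting
    private long bytesSinceCheckpoint = 0;
    private int checkpoints = 0;

    CheckpointThrottle(OutputStream out, long checkpointBytes, long sleepMillis) {
        if (checkpointBytes <= 0) {
            throw new IllegalArgumentException("checkpointBytes must be positive");
        }
        if (sleepMillis < 0) {
            throw new IllegalArgumentException("sleepMillis must not be negative");
        }
        this.out = out;
        this.checkpointBytes = checkpointBytes;
        this.sleepMillis = sleepMillis;
    }

    /** Writes one record and triggers a checkpoint once the threshold is reached. */
    void writeRecord(byte[] record) throws IOException, InterruptedException {
        out.write(record);
        bytesSinceCheckpoint += record.length;
        if (bytesSinceCheckpoint >= checkpointBytes) {
            checkpoint();
        }
    }

    private void checkpoint() throws IOException, InterruptedException {
        // A real export would close and restart the import process here;
        // this sketch only flushes the stream.
        out.flush();
        bytesSinceCheckpoint = 0;
        checkpoints++;
        if (sleepMillis > 0) {
            Thread.sleep(sleepMillis); // the throttling pause
        }
    }

    int checkpointCount() {
        return checkpoints;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CheckpointThrottle throttle = new CheckpointThrottle(sink, 32, 50);
        byte[] row = "1,alice,2012-11-02\n".getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < 10; i++) {
            throttle.writeRecord(row);
        }
        System.out.println(throttle.checkpointCount() + " checkpoints, "
                + sink.size() + " bytes written");
    }
}
```

Under these assumptions the sustained write rate cannot exceed roughly checkpointBytes / sleepMillis, since every checkpoint's worth of data costs at least one sleep. For example, a 1 MiB checkpoint with a 1000 ms sleep would cap the export at about 1 MiB per second, which is why a lower byte limit gives finer-grained throttling.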