Updated Branches: refs/heads/trunk ebeb93351 -> da16aa5dc
SQOOP-683: Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports

(Zoltan Toth-Czifra via Jarek Jarcec Cecho)

Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/da16aa5d
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/da16aa5d
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/da16aa5d

Branch: refs/heads/trunk
Commit: da16aa5dc0fd6f980288564b1e5160ca89b3a8eb
Parents: ebeb933
Author: Jarek Jarcec Cecho <[email protected]>
Authored: Fri Nov 9 09:07:12 2012 -0800
Committer: Jarek Jarcec Cecho <[email protected]>
Committed: Fri Nov 9 09:07:12 2012 -0800

----------------------------------------------------------------------
 src/docs/user/compatibility.txt | 22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+), 0 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/sqoop/blob/da16aa5d/src/docs/user/compatibility.txt
----------------------------------------------------------------------
diff --git a/src/docs/user/compatibility.txt b/src/docs/user/compatibility.txt
index 3576fd7..37e07b2 100644
--- a/src/docs/user/compatibility.txt
+++ b/src/docs/user/compatibility.txt
@@ -138,9 +138,31 @@ bytes. Set _size_ to 0 to disable intermediate checkpoints,
 but individual files being exported will continue to be committed
 independently of one another.
 
+Sometimes you need to export large data with Sqoop to a live MySQL cluster that
+is under a high load serving random queries from the users of your application.
+While data consistency issues during the export can be easily solved with a
+staging table, there is still a problem with the performance impact caused by
+the heavy export.
+
+First off, the resources of MySQL dedicated to the import process can affect
+the performance of the live product, both on the master and on the slaves.
+Second, even if the servers can handle the import with no significant
+performance impact (mysqlimport should be relatively "cheap"), importing big
+tables can cause serious replication lag in the cluster risking data
+inconsistency.
+
+With +-D sqoop.mysql.export.sleep.ms=time+, where _time_ is a value in
+milliseconds, you can let the server relax between checkpoints and the replicas
+catch up by pausing the export process after transferring the number of bytes
+specified in +sqoop.mysql.export.checkpoint.bytes+. Experiment with different
+settings of these two parameters to archieve an export pace that doesn't
+endanger the stability of your MySQL cluster.
+
 IMPORTANT: Note that any arguments to Sqoop that are of the form +-D
 parameter=value+ are Hadoop _generic arguments_ and must appear before any
 tool-specific arguments (for example, +\--connect+, +\--table+, etc).
+Don't forget that these parameters are only supported with the +\--direct+
+flag set.
 
 PostgreSQL
 ~~~~~~~~~~
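
To illustrate how the two parameters described in the patch interact, here is a
minimal Python sketch. It assumes a deliberately simple model: the export
transfers sqoop.mysql.export.checkpoint.bytes at a constant raw rate and then
pauses for sqoop.mysql.export.sleep.ms. Real throughput will vary with server
load, so treat the result as a rough starting point for experimentation. The
function names, the raw rate, and the connection details are hypothetical and
are not part of Sqoop.

```python
def effective_rate(raw_bytes_per_s, checkpoint_bytes, sleep_ms):
    """Estimate the average export rate in bytes/s.

    Assumed model (not taken from the Sqoop source): send checkpoint_bytes
    at raw_bytes_per_s, then pause for sleep_ms, and repeat.
    """
    if raw_bytes_per_s <= 0 or checkpoint_bytes <= 0 or sleep_ms < 0:
        raise ValueError("rates and sizes must be positive, sleep non-negative")
    transfer_s = checkpoint_bytes / raw_bytes_per_s
    return checkpoint_bytes / (transfer_s + sleep_ms / 1000.0)


def export_args(connect, table, export_dir, checkpoint_bytes, sleep_ms):
    """Build a sqoop export argument list.

    The -D generic arguments are placed before any tool-specific
    arguments, and --direct is set because the throttling parameters
    are only supported in direct mode.
    """
    return [
        "sqoop", "export",
        "-D", f"sqoop.mysql.export.checkpoint.bytes={checkpoint_bytes}",
        "-D", f"sqoop.mysql.export.sleep.ms={sleep_ms}",
        "--connect", connect,
        "--table", table,
        "--export-dir", export_dir,
        "--direct",
    ]


if __name__ == "__main__":
    # Hypothetical values: 8 MB/s raw rate, 32 MB checkpoints, 1 s pause.
    rate = effective_rate(8_000_000, 32_000_000, 1000)
    print(f"estimated average rate: {rate:.0f} bytes/s")
    print(" ".join(export_args("jdbc:mysql://db.example.com/shop",
                               "orders", "/data/orders",
                               32_000_000, 1000)))
```

Under this model, raising the sleep time or lowering the checkpoint size both
reduce the average rate, which matches the patch's advice to tune the two
settings together.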
