[
https://issues.apache.org/jira/browse/SQOOP-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490745#comment-13490745
]
Zoltan Toth-Czifra edited comment on SQOOP-683 at 11/5/12 5:17 PM:
-------------------------------------------------------------------
{code}
diff --git a/src/docs/user/compatibility.txt b/src/docs/user/compatibility.txt
index 3576fd7..e8218d6 100644
--- a/src/docs/user/compatibility.txt
+++ b/src/docs/user/compatibility.txt
@@ -138,9 +138,30 @@ bytes. Set _size_ to 0 to disable intermediate checkpoints,
but individual files being exported will continue to be committed
independently of one another.
+Sometimes you need to export a large amount of data with Sqoop to a live MySQL
+cluster that is under high load serving random queries from the users of your
+product. While data consistency issues during the export can be easily solved
+with a staging table, there is still a problem: the performance impact caused
+by the heavy export.
+
+First, the MySQL resources consumed by the import can degrade the performance
+of the live product, both on the master and on the slaves.
+Second, even if the servers can handle the import with no significant
+performance impact (mysqlimport should be relatively "cheap"), importing big
+tables can cause serious replication lag in the cluster, risking data
+inconsistency.
+
+With +-D sqoop.mysql.export.sleep.ms=time+, where _time_ is a value in
+milliseconds, you can let the server relax between checkpoints and the replicas
+catch up by pausing the export process after transferring the number of bytes
+specified in +sqoop.mysql.export.checkpoint.bytes+. Experiment with different
+settings of these two parameters to achieve an export pace that doesn't
+endanger the stability of your MySQL cluster.
+
IMPORTANT: Note that any arguments to Sqoop that are of the form +-D
parameter=value+ are Hadoop _generic arguments_ and must appear before
any tool-specific arguments (for example, +\--connect+, +\--table+, etc).
+Don't forget that these parameters only work with the +\--direct+ flag set.
PostgreSQL
~~~~~~~~~~
{code}
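To make the tuning advice in the patch concrete, here is a rough sketch of how the two parameters interact. The export size, checkpoint size, sleep time, JDBC URL, table name and export directory are all made-up values for illustration, and the script only prints the command instead of running it. The estimate assumes one pause per checkpoint and sums the pauses over all map tasks; since the tasks run in parallel, the wall-clock delay should be lower.

```shell
#!/bin/sh
# Illustrative values only (assumptions, not recommendations).
total_bytes=$((10 * 1024 * 1024 * 1024))   # assumed export size: 10 GiB
checkpoint_bytes=$((32 * 1024 * 1024))     # value for sqoop.mysql.export.checkpoint.bytes
sleep_ms=2000                              # value for sqoop.mysql.export.sleep.ms

# Assuming one pause after each checkpoint, the number of pauses is
# roughly the export size divided by the checkpoint size.
pauses=$((total_bytes / checkpoint_bytes))
added_seconds=$((pauses * sleep_ms / 1000))
echo "pauses: $pauses, total added sleep: ${added_seconds}s"

# The -D generic arguments come before the tool-specific ones, and
# --direct is required for these parameters to take effect.
# Host, database, table and directory below are hypothetical.
cmd="sqoop export"
cmd="$cmd -D sqoop.mysql.export.checkpoint.bytes=$checkpoint_bytes"
cmd="$cmd -D sqoop.mysql.export.sleep.ms=$sleep_ms"
cmd="$cmd --connect jdbc:mysql://db.example.com/shop"
cmd="$cmd --table orders --export-dir /user/etl/orders --direct"

# Print the command rather than executing it.
echo "$cmd"
```

With these numbers the export would pause about 320 times for a combined 640 seconds of sleep across all tasks. Halving the checkpoint size doubles the number of pauses, so the two settings are best adjusted together while watching replication lag.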
> Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct
> MySQL exports
> ------------------------------------------------------------------------------------------
>
> Key: SQOOP-683
> URL: https://issues.apache.org/jira/browse/SQOOP-683
> Project: Sqoop
> Issue Type: Sub-task
> Components: connectors/mysql, docs
> Affects Versions: 1.4.2
> Reporter: Zoltan Toth-Czifra
> Assignee: Zoltan Toth-Czifra
> Priority: Trivial
> Fix For: 1.4.3
>
>
> Documenting feature added in parent task.