[ 
https://issues.apache.org/jira/browse/SQOOP-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490745#comment-13490745
 ] 

Zoltan Toth-Czifra edited comment on SQOOP-683 at 11/5/12 5:17 PM:
-------------------------------------------------------------------

{code}
diff --git a/src/docs/user/compatibility.txt b/src/docs/user/compatibility.txt
index 3576fd7..e8218d6 100644
--- a/src/docs/user/compatibility.txt
+++ b/src/docs/user/compatibility.txt
@@ -138,9 +138,30 @@ bytes. Set _size_ to 0 to disable intermediate checkpoints,
 but individual files being exported will continue to be committed
 independently of one another.
 
+Sometimes you need to export a large amount of data with Sqoop to a live MySQL
+cluster that is under high load serving random queries from the users of your
+product. While data consistency issues during the export can be easily solved
+with a staging table, there is still a problem: the performance impact caused
+by the heavy export.
+
+First off, the MySQL resources dedicated to the import process can affect
+the performance of the live product, both on the master and on the slaves.
+Second, even if the servers can handle the import with no significant
+performance impact (mysqlimport should be relatively "cheap"), importing big
+tables can cause serious replication lag in the cluster, risking data
+inconsistency.
+
+With +-D sqoop.mysql.export.sleep.ms=time+, where _time_ is a value in
+milliseconds, you can let the server relax between checkpoints and let the
+replicas catch up by pausing the export process after transferring the number
+of bytes specified in +sqoop.mysql.export.checkpoint.bytes+. Experiment with
+different settings of these two parameters to achieve an export pace that
+doesn't endanger the stability of your MySQL cluster.
+
 IMPORTANT: Note that any arguments to Sqoop that are of the form +-D
 parameter=value+ are Hadoop _generic arguments_ and must appear before
 any tool-specific arguments (for example, +\--connect+, +\--table+, etc).
+Don't forget that these parameters only work with the +\--direct+ flag set.
 
 PostgreSQL
 ~~~~~~~~~~
{code}
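
To get a feel for how the two parameters in the patch interact, here is a rough back-of-the-envelope sketch. It is not part of Sqoop. It assumes a single export task that transfers +sqoop.mysql.export.checkpoint.bytes+ at some unthrottled rate and then pauses for +sqoop.mysql.export.sleep.ms+. The function name, the unthrottled rate and the sample values are hypothetical, and the case of _size_ set to 0 is deliberately not modeled.

```python
def throttled_rate(checkpoint_bytes: int, sleep_ms: int,
                   raw_bytes_per_sec: float) -> float:
    """Approximate sustained bytes/second for one export task.

    Model (an assumption, not Sqoop's actual code): the task sends
    checkpoint_bytes at raw_bytes_per_sec, then sleeps for sleep_ms.
    """
    if checkpoint_bytes <= 0 or raw_bytes_per_sec <= 0:
        raise ValueError("checkpoint_bytes and raw_bytes_per_sec must be positive")
    if sleep_ms < 0:
        raise ValueError("sleep_ms must not be negative")
    transfer_seconds = checkpoint_bytes / raw_bytes_per_sec
    pause_seconds = sleep_ms / 1000.0
    # One cycle = one transfer plus one pause.
    return checkpoint_bytes / (transfer_seconds + pause_seconds)


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical values: 32 MiB between checkpoints, 1 s pause,
    # 64 MiB/s when not throttled.
    rate = throttled_rate(32 * MIB, 1000, 64 * MIB)
    print(f"{rate / MIB:.1f} MiB/s per task")
```

Under this model, a longer sleep or a smaller checkpoint size both lower the sustained rate, so either knob can be used to slow the export down. Measure replication lag on the real cluster rather than trusting the estimate.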
                
> Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct 
> MySQL exports
> ------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-683
>                 URL: https://issues.apache.org/jira/browse/SQOOP-683
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: connectors/mysql, docs
>    Affects Versions: 1.4.2
>            Reporter: Zoltan Toth-Czifra
>            Assignee: Zoltan Toth-Czifra
>            Priority: Trivial
>             Fix For: 1.4.3
>
>
> Documenting feature added in parent task.
