[jira] [Commented] (SQOOP-604) Easy throttling feature for MySQL exports

2012-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489952#comment-13489952
 ] 

Hudson commented on SQOOP-604:
--

Integrated in Sqoop-ant-jdk-1.6-hadoop200 #269 (See 
[https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/269/])
SQOOP-604: Easy throttling feature for MySQL exports (Revision 
c499f49097ebf04f9fac34f1df768a319e679cea)

 Result = SUCCESS
abhijeet : 
https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
Files : 
* src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java


 Easy throttling feature for MySQL exports
 -

 Key: SQOOP-604
 URL: https://issues.apache.org/jira/browse/SQOOP-604
 Project: Sqoop
  Issue Type: Improvement
  Components: connectors/mysql
Affects Versions: 1.4.2
Reporter: Zoltan Toth-Czifra
Priority: Minor
 Fix For: 1.4.3

 Attachments: SQOOP-604_v6.patch


 Sqoop always tries to achieve the best possible throughput with exports, 
 which might not be desirable in all cases. Sometimes we need to export large 
 data with Sqoop to a live relational database (MySQL in our case), that is, a 
 database that is under a high load serving random queries from the users of 
 our product.
 While data consistency issues during the export can be easily solved with a 
 staging table, there is still a problem: the performance impact caused by the 
 heavy export. 
 First off, the resources of MySQL dedicated to the import process can affect 
 the performance of the live product, both on the master and on the slaves. 
 Second, even if the servers can handle the import with no significant 
 performance impact (mysqlimport should be relatively cheap), importing big 
 tables (GB+) can cause serious replication lag in the cluster, risking data 
 consistency.
 My suggestion is quite simple: use the already existing checkpoint 
 feature of the MySQL exports (the export process is restarted every X bytes 
 written) and extend it with a new config value that simply makes the 
 thread sleep for Y milliseconds at the checkpoints. With a low enough byte 
 count limit, this can be a simple yet powerful throttling mechanism.
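
The proposal can be sketched in Java roughly as follows. This is a minimal illustration, not the code from the attached patch: the class `CheckpointThrottle` and its method names are hypothetical, and the restart of the export process at each checkpoint is left out so that only the counting and sleeping logic remains.

```java
/** Hypothetical sketch of checkpoint-based throttling; all names are illustrative. */
final class CheckpointThrottle {
  private final long checkpointBytes; // bytes to write between two checkpoints
  private final long sleepMs;         // pause at each checkpoint; 0 disables throttling
  private long bytesSinceCheckpoint = 0;
  private long checkpointsReached = 0;

  CheckpointThrottle(long checkpointBytes, long sleepMs) {
    if (checkpointBytes <= 0 || sleepMs < 0) {
      throw new IllegalArgumentException(
          "checkpointBytes must be > 0 and sleepMs must be >= 0");
    }
    this.checkpointBytes = checkpointBytes;
    this.sleepMs = sleepMs;
  }

  /**
   * Call after each write. Returns true when a checkpoint was reached,
   * in which case the calling thread has already slept for sleepMs.
   */
  boolean recordWrite(long bytesWritten) throws InterruptedException {
    bytesSinceCheckpoint += bytesWritten;
    if (bytesSinceCheckpoint < checkpointBytes) {
      return false;
    }
    bytesSinceCheckpoint = 0;
    checkpointsReached++;
    if (sleepMs > 0) {
      Thread.sleep(sleepMs);
    }
    return true;
  }

  long getCheckpointsReached() {
    return checkpointsReached;
  }

  public static void main(String[] args) throws InterruptedException {
    CheckpointThrottle throttle = new CheckpointThrottle(1024, 5);
    for (int i = 0; i < 10; i++) {
      throttle.recordWrite(512); // every second write reaches the 1024-byte limit
    }
    System.out.println("checkpoints: " + throttle.getCheckpointsReached());
  }
}
```

In this sketch, a smaller checkpoint size means more frequent pauses, and a larger sleep value means longer ones, so the two settings together bound the average write rate.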

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (SQOOP-604) Easy throttling feature for MySQL exports

2012-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489954#comment-13489954
 ] 

Hudson commented on SQOOP-604:
--

Integrated in Sqoop-ant-jdk-1.6-hadoop100 #259 (See 
[https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/259/])
SQOOP-604: Easy throttling feature for MySQL exports (Revision 
c499f49097ebf04f9fac34f1df768a319e679cea)

 Result = SUCCESS
abhijeet : 
https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
Files : 
* src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java




[jira] [Commented] (SQOOP-604) Easy throttling feature for MySQL exports

2012-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489955#comment-13489955
 ] 

Hudson commented on SQOOP-604:
--

Integrated in Sqoop-ant-jdk-1.6-hadoop23 #414 (See 
[https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/414/])
SQOOP-604: Easy throttling feature for MySQL exports (Revision 
c499f49097ebf04f9fac34f1df768a319e679cea)

 Result = SUCCESS
abhijeet : 
https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
Files : 
* src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java




[jira] [Commented] (SQOOP-604) Easy throttling feature for MySQL exports

2012-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489957#comment-13489957
 ] 

Hudson commented on SQOOP-604:
--

Integrated in Sqoop-ant-jdk-1.6-hadoop20 #262 (See 
[https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/262/])
SQOOP-604: Easy throttling feature for MySQL exports (Revision 
c499f49097ebf04f9fac34f1df768a319e679cea)

 Result = SUCCESS
abhijeet : 
https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=c499f49097ebf04f9fac34f1df768a319e679cea
Files : 
* src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java




[jira] [Commented] (SQOOP-604) Easy throttling feature for MySQL exports

2012-11-03 Thread Zoltan Toth-Czifra (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489984#comment-13489984
 ] 

Zoltan Toth-Czifra commented on SQOOP-604:
--

Thank you guys! I promise next time it will go smoother :)



Re: Review Request: SQOOP-604 Easy throttling feature for MySQL exports

2012-11-03 Thread Zoltán Tóth-Czifra


 On Nov. 3, 2012, 5:18 a.m., Abhijeet Gaikwad wrote:
  Looks good :)
  ant checkstyle - no errors
  ant test - success

Thank you for your help Abhijeet!


- Zoltán


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7135/#review13075
---


On Nov. 2, 2012, 12:32 p.m., Zoltán Tóth-Czifra wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/7135/
 ---
 
 (Updated Nov. 2, 2012, 12:32 p.m.)
 
 
 Review request for Sqoop.
 
 
 Description
 ---
 
 Code review for SQOOP-604, see https://issues.apache.org/jira/browse/SQOOP-604
 
 The solution in short: use the already existing checkpoint feature of the 
 direct (--direct) MySQL exports (the export process is restarted every X 
 bytes written) and extend it with a new config value that simply makes 
 the thread sleep for Y milliseconds at the checkpoints. With a low enough byte 
 count limit, this can be a simple yet powerful throttling mechanism.
 
 
 Diffs
 -
 
   src/java/org/apache/sqoop/mapreduce/MySQLExportMapper.java a4e8b88 
 
 Diff: https://reviews.apache.org/r/7135/diff/
 
 
 Testing
 ---
 
 Executing with different settings of sqoop.mysql.export.checkpoint.bytes and 
 sqoop.mysql.export.sleep.ms:
 
 33554432B / 0ms: Transferred 4.7579 MB in 8.7175 seconds (558.8826 KB/sec)
 102400B / 500ms: Transferred 4.7579 MB in 35.7794 seconds (136.1698 KB/sec)
 51200B / 500ms: Transferred 4.758 MB in 57.8675 seconds (84.1959 KB/sec)
 51200B / 250ms: Transferred 4.7579 MB in 35.0293 seconds (139.0854 KB/sec)
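
The measurements above can be compared against a simple additive model: total time is roughly the unthrottled time plus one sleep per completed checkpoint. The sketch below is only a back-of-the-envelope check under stated assumptions: that all bytes pass through a single writer thread, that the 33554432B / 0ms run is the unthrottled baseline, and that MB means 1024 * 1024 bytes. The class and method names are hypothetical.

```java
/** Hypothetical estimate of throttled export time; not part of Sqoop. */
final class ThrottleEstimate {

  /**
   * Additive model (an assumption): baseline time plus one sleep
   * for every checkpoint that is fully reached.
   */
  static double estimateSeconds(double totalBytes, long checkpointBytes,
                                long sleepMs, double baseSeconds) {
    long checkpoints = (long) Math.floor(totalBytes / checkpointBytes);
    return baseSeconds + checkpoints * (sleepMs / 1000.0);
  }

  public static void main(String[] args) {
    double totalBytes = 4.7579 * 1024 * 1024; // reported transfer size
    double baseSeconds = 8.7175;              // reported run with 0 ms sleep
    long[][] settings = {{102400, 500}, {51200, 500}, {51200, 250}};
    for (long[] s : settings) {
      System.out.printf("%dB / %dms: ~%.1f s%n",
          s[0], s[1], estimateSeconds(totalBytes, s[0], s[1], baseSeconds));
    }
  }
}
```

Under these assumptions the model gives about 32.7 s, 57.2 s and 33.0 s, against measured 35.8 s, 57.9 s and 35.0 s. The remaining gap may come from the cost of restarting the export process at each checkpoint, which the model ignores.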
 
 I have not added unit tests yet; because the change involves calling 
 Thread.sleep, I find it difficult to test. Unfortunately, there is no machine 
 or environment object that could be injected into these classes as a mock to 
 take care of time-related fixtures.
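
One possible way around the testing difficulty is to hide the sleep behind a small interface that a test can replace. The sketch below is an assumption about how that could look, not something taken from the patch: `Sleeper`, `RecordingSleeper` and `InjectableThrottle` are hypothetical names and do not exist in Sqoop.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical seam for time-related behaviour; not part of Sqoop. */
interface Sleeper {
  void sleep(long millis) throws InterruptedException;
}

/** Test double that records the requested sleeps instead of blocking. */
final class RecordingSleeper implements Sleeper {
  final List<Long> requested = new ArrayList<>();

  @Override
  public void sleep(long millis) {
    requested.add(millis);
  }
}

/** Hypothetical throttle that delegates its pause to an injected Sleeper. */
final class InjectableThrottle {
  private final long checkpointBytes;
  private final long sleepMs;
  private final Sleeper sleeper;
  private long bytesSinceCheckpoint = 0;

  InjectableThrottle(long checkpointBytes, long sleepMs, Sleeper sleeper) {
    this.checkpointBytes = checkpointBytes;
    this.sleepMs = sleepMs;
    this.sleeper = sleeper;
  }

  /** Counts written bytes and pauses through the Sleeper at each checkpoint. */
  void recordWrite(long bytesWritten) throws InterruptedException {
    bytesSinceCheckpoint += bytesWritten;
    if (bytesSinceCheckpoint >= checkpointBytes) {
      bytesSinceCheckpoint = 0;
      if (sleepMs > 0) {
        sleeper.sleep(sleepMs);
      }
    }
  }
}
```

Production code could pass `Thread::sleep` as the `Sleeper`, while a unit test passes a `RecordingSleeper` and asserts on `requested` without waiting for real time to elapse.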
 
 
 Thanks,
 
 Zoltán Tóth-Czifra