[ https://issues.apache.org/jira/browse/HADOOP-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713157#action_12713157 ]
Aaron Kimball commented on HADOOP-5844:
---------------------------------------

As a side note, this patch fixes a bug in HADOOP-5815 wherein sqoop did not add sqoop.jar itself to the classpath passed to javac. As a result, compilation of generated code only worked in unit test mode (which made direct references to the .class files in the build directory), or when sqoop.jar was present in $HADOOP_HOME/lib/ (the contents of which were passed to javac). With this patch, generated code compilation works regardless of the location of sqoop.jar.

> Use mysqldump when connecting to local mysql instance in Sqoop
> --------------------------------------------------------------
>
>                 Key: HADOOP-5844
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5844
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: mysqldump.patch
>
>
> Sqoop uses MapReduce + DBInputFormat to read the contents of a table into
> HDFS. On many databases, this implementation is O(N^2) in the number of rows.
> Also, the use of multiple mappers has low value in terms of throughput,
> because the database itself is inherently single-threaded. While
> DBInputFormat/JDBC provides a useful fallback mechanism for importing from
> databases, db-specific dump utilities will nearly always provide faster
> throughput, and should be selected when available. This patch allows users to
> use mysqldump to read from local mysql instances instead of the
> MapReduce-based input.
> If you provide sqoop with arguments of the form "--connect
> jdbc:mysql://localhost/somedatabase --local", it will use the mysqldump fast
> path to perform the import.
> This patch, naturally, requires that MySQL be installed on a machine to test
> it. Thus the test that this adds is called LocalMySQLTest (instead of the
> Hadoop-preferred file naming, TestLocalMySQL) so that Hudson doesn't
> automatically run it.
> You can run this test yourself by using "ant -Dtestcase=LocalMySQLTest test".
> See the notes in the javadoc for the LocalMySQLTest class on how to set up
> the MySQL test environment for this.
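
For illustration only, the fast-path selection described in the issue above could be sketched roughly as follows. This is not the code in mysqldump.patch: the class name FastPathSketch, the method useMysqldump, and the choice to treat 127.0.0.1 as equivalent to localhost are all assumptions made for this example. It only shows the decision "MySQL connect string, local host, and --local given", not the invocation of mysqldump itself.

```java
import java.net.URI;

/** Hypothetical sketch; not part of Sqoop or of mysqldump.patch. */
public class FastPathSketch {

    /**
     * Returns true when the import could use the mysqldump fast path:
     * the user passed --local and the connect string names a MySQL
     * database on the local machine. Otherwise the caller would fall
     * back to the MapReduce + DBInputFormat import.
     */
    public static boolean useMysqldump(String connectString, boolean localFlag) {
        if (!localFlag || connectString == null) {
            return false;
        }
        if (!connectString.startsWith("jdbc:mysql://")) {
            return false; // not a MySQL connect string
        }
        try {
            // Drop the "jdbc:" prefix so java.net.URI can parse host and path.
            URI uri = URI.create(connectString.substring("jdbc:".length()));
            String host = uri.getHost();
            // Assumption: 127.0.0.1 counts as local alongside "localhost".
            return "localhost".equals(host) || "127.0.0.1".equals(host);
        } catch (IllegalArgumentException e) {
            return false; // malformed connect string: use the fallback path
        }
    }

    public static void main(String[] args) {
        System.out.println(useMysqldump("jdbc:mysql://localhost/somedatabase", true));
        System.out.println(useMysqldump("jdbc:mysql://localhost/somedatabase", false));
        System.out.println(useMysqldump("jdbc:mysql://db.example.com/somedatabase", true));
    }
}
```

Under these assumptions, only the first call in main selects the fast path; the other two fall back to the JDBC-based import.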