[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1168:
-------------------------------------

    Attachment: MAPREDUCE-1168.patch

This patch provides Sqoop with the ability to export tables from HDFS to an 
external RDBMS. Sqoop runs a MapReduce job over the contents of a directory 
(identified by {{\-\-export-dir}}), parsing the records contained within based 
on the auto-generated class definition for a table. DBOutputFormat is used to 
inject the records back into the database table (specified by {{\-\-table}}). 
The table must already exist in the target database.

Sqoop can auto-generate the appropriate ORM class for parsing the input files 
by examining the target table (much as is done during importing); the existing 
command-line options that govern delimiters are used to specify which 
delimiters are used in the files to be exported.
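To illustrate what parsing a delimited record involves, here is a minimal sketch in Java. It is not the code Sqoop generates: the class name {{DelimitedLineSplitter}} and the method {{splitFields}} are hypothetical, and the sketch assumes a single-character field delimiter with no enclosing or escape characters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the patch or of Sqoop's generated code. */
public class DelimitedLineSplitter {

    /**
     * Splits one record into its fields on a single-character delimiter.
     * Assumption: fields are neither quoted nor escaped.
     * Empty fields, including a trailing one, are preserved.
     */
    public static List<String> splitFields(String line, char fieldDelim) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == fieldDelim) {
                fields.add(line.substring(start, i));
                start = i + 1;
            }
        }
        // Whatever follows the last delimiter is the final field.
        fields.add(line.substring(start));
        return fields;
    }

    public static void main(String[] args) {
        // A comma-delimited record with an empty last column.
        System.out.println(splitFields("1,alice,", ','));
    }
}
```

A real parser would also have to honor whatever enclosing and escape characters were in effect when the files were written; this sketch leaves that out.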

If an ORM class has already been generated for the table, this can now be 
specified with the {{\-\-jar-file}} and {{\-\-class-name}} options; code 
auto-generation is bypassed in this case. (This applies to imports as well.)

Export supports both delimited text files and SequenceFiles containing 
{{SqoopRecords}} as values (i.e., SequenceFiles created via a Sqoop import with 
{{\-\-as-sequencefile}}). Users do not need to identify the file type; it is 
automatically inferred. Gzipped text files are handled transparently.
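One way such inference could work is to look at the first few bytes of each file. The sketch below is an assumption about the general technique, not a description of what the patch does: the class {{ExportFileTypeSniffer}} and its {{Kind}} enum are hypothetical names. It relies on two well-known facts: SequenceFiles begin with the bytes {{SEQ}}, and gzip streams begin with {{0x1f 0x8b}}.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical illustration only; the patch may infer the file type differently. */
public class ExportFileTypeSniffer {

    public enum Kind { SEQUENCE_FILE, GZIP_TEXT, PLAIN_TEXT }

    /** Classifies a stream by its leading magic bytes; consumes up to three bytes. */
    public static Kind detect(InputStream in) throws IOException {
        byte[] header = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (n < header.length) {
            int r = in.read(header, n, header.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        if (n >= 3 && header[0] == 'S' && header[1] == 'E' && header[2] == 'Q') {
            return Kind.SEQUENCE_FILE;
        }
        if (n >= 2 && (header[0] & 0xff) == 0x1f && (header[1] & 0xff) == 0x8b) {
            return Kind.GZIP_TEXT;
        }
        // Anything else is treated as uncompressed delimited text.
        return Kind.PLAIN_TEXT;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {'S', 'E', 'Q', 6};
        System.out.println(detect(new ByteArrayInputStream(sample)));
    }
}
```

Note the limitation: a text file whose first record happens to start with {{SEQ}} would be misclassified by this check, so a production implementation would need something more robust.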

Testing has been performed via unit tests (included) against HSQLDB with 
several column datatypes. I also performed larger-scale manual testing by 
exporting 100MB and 500MB datasets, containing 1 million and 5 million rows 
respectively, to tables in MySQL.

> Export data to databases via Sqoop
> ----------------------------------
>
>                 Key: MAPREDUCE-1168
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1168
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/sqoop
>            Reporter: Aaron Kimball
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1168.patch
>
>
> Sqoop can import from a database into HDFS. It's high time it works in 
> reverse too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
