-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24223/
-----------------------------------------------------------

(Updated Aug. 5, 2014, 6:25 a.m.)


Review request for Sqoop.


Changes
-------

Provided an updated patch with unit tests.


Repository: sqoop-trunk


Description
-------

The patch adds the ability to import an individual table from an RDBMS into HDFS as a set of Parquet files. It also adds a new command-line argument, `--as-parquetfile`.

Example invocation: `sqoop import --connect JDBC_URI --table TABLE --as-parquetfile --target-dir /path/to/files`

The major items are as follows:
* Implement `ParquetImportMapper`
* Hook up the `ParquetOutputFormat` and `ParquetImportMapper` in the import job.

As Parquet is a columnar storage format, it doesn't make sense to write to it directly from record-based tools. We've considered using the Kite SDK to simplify the Parquet-specific handling. The main idea is to convert each `SqoopRecord` to a `GenericRecord` and write it into a Kite dataset. The Kite SDK will then write these records out as a set of Parquet files.


Diffs (updated)
-----

  ivy.xml abc12a1 
  ivy/libraries.properties a59471e 
  pom-old.xml a8f4361 
  src/docs/man/import-args.txt a4ce4ec 
  src/docs/man/sqoop-import-all-tables.txt 6b639f5 
  src/docs/user/hcatalog.txt cd1dde3 
  src/docs/user/help.txt a9e1e89 
  src/docs/user/import-all-tables.txt 60645f1 
  src/docs/user/import.txt 192e97e 
  src/java/com/cloudera/sqoop/SqoopOptions.java ffec2dc 
  src/java/com/cloudera/sqoop/mapreduce/ParquetImportMapper.java PRE-CREATION 
  src/java/com/cloudera/sqoop/mapreduce/ParquetOutputFormat.java PRE-CREATION 
  src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java a5f72f7 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 6dcfebb 
  src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/ParquetJob.java PRE-CREATION 
  src/java/org/apache/sqoop/mapreduce/ParquetOutputFormat.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java b77b1ea 
  src/java/org/apache/sqoop/tool/ImportTool.java a3a2d0d 
  src/licenses/LICENSE-BIN.txt 4215d26 
  src/test/com/cloudera/sqoop/TestParquetImport.java PRE-CREATION 

Diff: https://reviews.apache.org/r/24223/diff/


Testing
-------

Manually tested with a MySQL database. Unit tests are still being developed.


Thanks,

Qian Xu
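

Illustrative note on the record conversion mentioned in the Description: the snippet below is only a rough, library-free sketch of the kind of per-field normalization that converting a `SqoopRecord` into a generic record might involve. It is not taken from the patch. The class name `RecordConversionSketch`, both method names, and the type mapping itself (decimals as strings, date/time values as epoch milliseconds) are assumptions made for illustration; a plain `Map` stands in for both the Sqoop field map and the generic record so that the example runs without Avro or Kite on the classpath.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not part of the patch: normalizes a field map
 * (column name -> JDBC-style value) into values a generic record could hold.
 */
public class RecordConversionSketch {

    /** Maps one value to a simpler representation (assumed mapping). */
    public static Object toGenericValue(Object value) {
        if (value instanceof BigDecimal) {
            // Assumption: decimals are kept as strings to avoid precision loss.
            return value.toString();
        }
        if (value instanceof java.util.Date) {
            // java.sql.Date, Time and Timestamp all extend java.util.Date.
            return ((java.util.Date) value).getTime();
        }
        // null, numbers, booleans and strings pass through unchanged.
        return value;
    }

    /** Converts every field, preserving column order. */
    public static Map<String, Object> toGenericFields(Map<String, Object> fieldMap) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fieldMap.entrySet()) {
            out.put(e.getKey(), toGenericValue(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 42);
        row.put("price", new BigDecimal("19.99"));
        row.put("created", new java.sql.Timestamp(0L));
        System.out.println(toGenericFields(row));
    }
}
```

In a real mapper the output would presumably be a schema-backed record rather than a `Map`, and the schema would have to agree with whatever mapping is chosen for each column type.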
