-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24223/#review49642
-----------------------------------------------------------


First pass... comments below!


ivy.xml
<https://reviews.apache.org/r/24223/#comment86918>

    Do we need to include kitesdk for both hadoop1 and hadoop2? If so, see the 
avro dependency for an example of how to do this.



pom-old.xml
<https://reviews.apache.org/r/24223/#comment86916>

    The dependencies can exist in ivy only. There's no need to include them in 
this pom file.



pom-old.xml
<https://reviews.apache.org/r/24223/#comment86917>

    Same as above.



src/java/com/cloudera/sqoop/mapreduce/ParquetImportMapper.java
<https://reviews.apache.org/r/24223/#comment86890>

    com.cloudera.x is deprecated. No need to provide.



src/java/com/cloudera/sqoop/mapreduce/ParquetOutputFormat.java
<https://reviews.apache.org/r/24223/#comment86891>

    com.cloudera.x is deprecated. No need to provide.



src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java
<https://reviews.apache.org/r/24223/#comment86889>

    You can get rid of this. The com.cloudera.x packages are not maintained any 
more.



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/24223/#comment86892>

    This is a bit confusing... could you add a few comments as to why an Avro 
schema would be used with the ParquetJob?



src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java
<https://reviews.apache.org/r/24223/#comment86898>

    I don't believe this is possible. Perhaps you were looking for "Boolean"?


- Abraham Elmahrek


On Aug. 5, 2014, 6:25 a.m., Qian Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24223/
> -----------------------------------------------------------
> 
> (Updated Aug. 5, 2014, 6:25 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> The patch proposes adding the ability to import an individual table from 
> an RDBMS into HDFS as a set of Parquet files. It also adds a new 
> command-line argument, `--as-parquetfile`.
> Example invocation: `sqoop import --connect JDBC_URI --table TABLE 
> --as-parquetfile --target-dir /path/to/files`
> 
> The major items are listed as follows:
> * Implement `ParquetImportMapper`.
> * Hook up the `ParquetOutputFormat` and `ParquetImportMapper` in the import 
> job.
> 
> As Parquet is a columnar storage format, it doesn't make sense to write to it 
> directly from record-based tools. We've considered using Kite SDK to 
> simplify the handling of Parquet-specific details. The main idea is to 
> convert each `SqoopRecord` to a `GenericRecord` and write it into a Kite 
> dataset. Kite SDK will then write these records out as a set of Parquet files.
> 
> 
> Diffs
> -----
> 
>   ivy.xml abc12a1 
>   ivy/libraries.properties a59471e 
>   pom-old.xml a8f4361 
>   src/docs/man/import-args.txt a4ce4ec 
>   src/docs/man/sqoop-import-all-tables.txt 6b639f5 
>   src/docs/user/hcatalog.txt cd1dde3 
>   src/docs/user/help.txt a9e1e89 
>   src/docs/user/import-all-tables.txt 60645f1 
>   src/docs/user/import.txt 192e97e 
>   src/java/com/cloudera/sqoop/SqoopOptions.java ffec2dc 
>   src/java/com/cloudera/sqoop/mapreduce/ParquetImportMapper.java PRE-CREATION 
>   src/java/com/cloudera/sqoop/mapreduce/ParquetOutputFormat.java PRE-CREATION 
>   src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java a5f72f7 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 6dcfebb 
>   src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/ParquetJob.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/ParquetOutputFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java b77b1ea 
>   src/java/org/apache/sqoop/tool/ImportTool.java a3a2d0d 
>   src/licenses/LICENSE-BIN.txt 4215d26 
>   src/test/com/cloudera/sqoop/TestParquetImport.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24223/diff/
> 
> 
> Testing
> -------
> 
> Manually tested with a MySQL database. Unit tests are still being developed.
> 
> 
> Thanks,
> 
> Qian Xu
> 
>
