-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11588/#review21503
-----------------------------------------------------------


Hi Vasanth,
thank you very much for working on this! I do have couple of additional notes:

I do have a small concerns about the test case TestHdfsExtract that seems to be 
written in a way that the context objects are not passed into the job conf. I 
think that this is artifact from early days that we should fix in a nice manner 
as is very likely that we will have to add more inputs in the future.


core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java
<https://reviews.apache.org/r/11588/#comment44503>

    It seems to me that using Enum for simple boolean input is unnecessary 
overhead. We are adding support for Boolen model type in SQOOP-1076, but the 
patch is not yet committed. Do you think that it would make sense to wait and 
use that instead of the Enum?



execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
<https://reviews.apache.org/r/11588/#comment44504>

    It seems that boolean is more appropriate data type for this variable.



execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
<https://reviews.apache.org/r/11588/#comment44694>

    It seems that the code here will firstly recursively traverse through the 
file system to find all directories, store them in memory and then iterate over 
the stored directories to get list of files. Wouldn't be faster and less memory 
intensive to do both passes at the same time? E.g. traverse though the 
filesystem and build directly the list of files?



execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
<https://reviews.apache.org/r/11588/#comment44695>

    Similarly as above, would be simple and more effective to calculate the 
size of all files directly rather than building list of all directories in 
memory first?


Jarcec

- Jarek Cecho


On June 4, 2013, 11:58 a.m., vasanthkumar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11588/
> -----------------------------------------------------------
> 
> (Updated June 4, 2013, 11:58 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> 
> Introducing new option for exporting the sub directories of the given input 
> directory.
> 
> Added new input 
> Input configuration
> 
> Input directory: /input
> Recursive: 
>   0 : NO
>   1 : YES
> Choose: 0
> 
> 
> This addresses bug SQOOP-1063.
>     https://issues.apache.org/jira/browse/SQOOP-1063
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/sqoop/framework/configuration/InputForm.java 
> d5cbeec 
>   core/src/main/resources/framework-resources.properties cebc90e 
>   docs/src/site/sphinx/ClientAPI.rst 4f3fda6 
>   
> execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
>  b3590dc 
>   execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java 
> b3e6050 
>   
> submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java
>  0e8c9f7 
> 
> Diff: https://reviews.apache.org/r/11588/diff/
> 
> 
> Testing
> -------
> 
> Done
> 
> 
> Thanks,
> 
> vasanthkumar
> 
>

Reply via email to