-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14085/
-----------------------------------------------------------

(Updated Oct. 14, 2013, 9:10 p.m.)


Review request for Sqoop.


Changes
-------

Updated the documentation for the new option --skip-dist-cache


Bugs: SQOOP-1192
    https://issues.apache.org/jira/browse/SQOOP-1192


Repository: sqoop-trunk


Description
-------

Currently, Sqoop copies the jar files in the %SQOOP_HOME%\lib folder to the job 
cache every time a Sqoop job is launched. When Oozie launches a Sqoop job, this 
behavior can be optimized by adding these jars to the Oozie Sqoop sharelib. In 
that case, the jar files in the sharelib only need to be localized to each 
worker node once and can then be reused by all Sqoop jobs launched by Oozie. 
This can avoid a large amount of disk I/O on the worker nodes when Sqoop is run 
through Oozie. To enable this, Sqoop needs an option that lets the job skip 
adding the lib jars to the job cache. For now, this option should only be used 
for Sqoop jobs started by Oozie. The attached patch introduces the 
"--skip-dist-cache" option to enable this feature.
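
The intended behavior can be illustrated with a minimal, self-contained sketch. 
It is not the code in the attached patch: the class SkipDistCacheSketch, the 
method jarsToCache, and the sample file names are all hypothetical, and the 
real change lives in the files listed under Diffs. The sketch only shows the 
decision described above: with the flag set, no lib jars are selected for the 
job cache, on the assumption that the Oozie sharelib already provides them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not taken from the Sqoop patch. */
class SkipDistCacheSketch {

  /**
   * Selects which entries of the lib folder would be added to the job cache.
   *
   * @param libEntries    file names found in the lib folder (assumed input)
   * @param skipDistCache true when --skip-dist-cache was passed
   * @return the jar names to cache; empty when the flag is set
   */
  static List<String> jarsToCache(List<String> libEntries, boolean skipDistCache) {
    if (skipDistCache) {
      // Assumption: the jars are already available to the job,
      // for example through the Oozie Sqoop sharelib.
      return Collections.emptyList();
    }
    List<String> jars = new ArrayList<>();
    for (String name : libEntries) {
      // Only jar files are candidates for the job cache.
      if (name.endsWith(".jar")) {
        jars.add(name);
      }
    }
    return jars;
  }

  public static void main(String[] args) {
    // Made-up folder contents, used purely for demonstration.
    List<String> lib = Arrays.asList("sqoop.jar", "README.txt", "mysql-connector.jar");
    System.out.println("default:           " + jarsToCache(lib, false));
    System.out.println("--skip-dist-cache: " + jarsToCache(lib, true));
  }
}
```

Keeping the default unchanged (jars are cached unless the flag is given) means 
that Sqoop jobs launched outside Oozie behave exactly as before.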


Diffs (updated)
-----

  src/docs/user/import.txt 71b50d8 
  src/java/org/apache/sqoop/SqoopOptions.java 01805f9 
  src/java/org/apache/sqoop/mapreduce/JobBase.java 322df1c 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java b05f587 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java ebb1857 
  src/test/com/cloudera/sqoop/TestSqoopOptions.java 03e2504 

Diff: https://reviews.apache.org/r/14085/diff/


Testing
-------

Tested the new option with an Oozie-Sqoop workflow to ensure it doesn't break 
Sqoop library dependencies when Sqoop is launched by Oozie.


Thanks,

Shuaishuai Nie
