> On March 29, 2013, 7:03 p.m., Venkat Ranganathan wrote:
> > Hi Ahmed
> > 
> > Thanks for the new patch.  It looks good.  I still have one issue and 
> > suggestion.  The powershell script to generate the jar file is very good!  
> > You are generating a jar file everytime and the jar file is generated under 
> > SQOOP_HOME.   There may be installations for the SQOOP_HOME may not be 
> > writable by user.   Also, I think the main motivation is to overcome the 
> > environment strings limitation.   Since JDK 1.6, Java has the ability to 
> > provide an option to provide a shortcut for all jars in a file (This 
> > probably should be done for the Unix classpaths also).   Please see 
> > http://docs.oracle.com/javase/6/docs/technotes/tools/windows/classpath.html 
> >  
> > 
> > I am thinking whether this should be a simpler change to just add all jars 
> > in SQOOP_LIB.  We have to say %SQOOP_HOME%\lib\*.   Of course, this 
> > introduces dependency on 1.6+ versions of JDK, but given that 1.5 is EOLed 
> > this should be OK
> > 
> > Thanks
> 
> Ahmed El Baz wrote:
>     Thank you a lot Venkat for the valuable comments,
>     
>     I have considered the wildcard option, however, there are some 
> limitations why it was not preferable to go this route, and using the 
> referencing jar would give more flexibility:
>     1) The need to specify particular jars to include, or exclude some jars 
> and not include all jars by default in a dorectory by using wildcard. For 
> example, in configure-sqoop a list of dependency jars for HBase are returned 
> by invoking "hbase classpath" which returns a list of jars. In this case 
> using a wrapper Jar releases us from worrying about the length of jars 
> returned, and it is not possible to use the * in this case, unless we do some 
> logic to get common dirs.
>     2) As you can see also in configure-jar, Sqoop has dependency on other 
> components rather than just SQOOP_HOME\lib, like HBase, SQOOP_CONF, ZOOCFGDIR.
>     3) Using the wrapper jar would scale regardless of how many directories 
> we include. I understand it is hard the number of folders increases to the 
> limit where we see the long command error, but even in this case the wrapper 
> jar would work just fine.
>     
>     I would like to unederstand more about scenarios where we anticipate 
> SQOOP_HOME would not be writable on Windows systems.
>     
>     Thank you again,
>     Ahmed

Thanks Ahmed for the explanation.

I thought we are primarily limited by the 8K limit in the command line so if we 
can potentially limit the large jar file dirs in this format, then it would be 
fit within the limit.
Good point of hbase -classpath option.  May be we can have improvement on Hbase 
to return the hbase classpath with jar dirs properly added

For example, when people install Hadoop on Windows and decide that Hadoop stack 
will be installed under a terminal server and this is shared across multiple 
users - or it  may be installed in a common location and mapped based on logon 
scripts.   And the directory can become inaccessible for people running sqoop 
jobs.   This is a scheme used by some  Hadoop distributions today.

Thanks


- Venkat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10055/#review18523
-----------------------------------------------------------


On March 29, 2013, 3:01 a.m., Ahmed El Baz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10055/
> -----------------------------------------------------------
> 
> (Updated March 29, 2013, 3:01 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> A patch implementing the Windows version of Sqoop run scripts. The scripts 
> follow the same logic as there .sh counterparts.
> One difference is to create a Jar which references all classpath elements in 
> its Manifest, and provide that jar as the single jar needed for Sqoop. The 
> reason here is that in some cases if the number of classpath elements is 
> large, HADOOP_CLASSPATH gets very long which causes failures in Windows since 
> there is a limit to command lines.
> As a workaround, I added a step to wrap all jars in the classpath in a single 
> jar, and then use that generated jar (this is also done in hadoop for Windows 
> to handle similar issues)
> I did this in a utility script "BuildJar" which can be used for other 
> components as well.
> This change is specific to Windows scripts, Linux scripts are not affected.
> 
> 
> This addresses bug SQOOP-954.
>     https://issues.apache.org/jira/browse/SQOOP-954
> 
> 
> Diffs
> -----
> 
>   bin/BuildJar.ps1 PRE-CREATION 
>   bin/configure-sqoop.cmd PRE-CREATION 
>   bin/sqoop.cmd PRE-CREATION 
>   conf/sqoop-env-template.cmd PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10055/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ahmed El Baz
> 
>

Reply via email to