> On March 29, 2013, 7:03 p.m., Venkat Ranganathan wrote:
> > Hi Ahmed
> > 
> > Thanks for the new patch.  It looks good.  I still have one issue and 
> > one suggestion.  The powershell script to generate the jar file is very 
> > good!  However, you are generating a jar file every time, and it is 
> > generated under SQOOP_HOME.   There may be installations where SQOOP_HOME 
> > is not writable by the user.   Also, I think the main motivation is to 
> > overcome the environment strings limitation.   Since JDK 1.6, Java has 
> > supported a wildcard shortcut for all jars in a directory (this probably 
> > should be done for the Unix classpaths also).   Please see 
> > http://docs.oracle.com/javase/6/docs/technotes/tools/windows/classpath.html 
> >  
> > 
> > I am wondering whether this could be a simpler change that just adds all 
> > jars in SQOOP_LIB.  We would only have to say %SQOOP_HOME%\lib\*.   Of 
> > course, this introduces a dependency on 1.6+ versions of the JDK, but 
> > given that 1.5 is EOLed this should be OK.
> > 
> > Thanks
> 
> Ahmed El Baz wrote:
>     Thank you a lot Venkat for the valuable comments,
>     
>     I have considered the wildcard option. However, it has some limitations 
> that made it less preferable, and a referencing jar gives more flexibility:
>     1) We need to include or exclude particular jars rather than include 
> every jar in a directory by default via a wildcard. For example, in 
> configure-sqoop the list of dependency jars for HBase is obtained by 
> invoking "hbase classpath". In this case a wrapper jar frees us from 
> worrying about the length of the returned jar list, and it is not possible 
> to use the * unless we add some logic to find the common dirs.
>     2) As you can also see in configure-jar, Sqoop depends on other 
> components besides SQOOP_HOME\lib, such as HBase, SQOOP_CONF, and ZOOCFGDIR.
>     3) The wrapper jar scales regardless of how many directories we include. 
> I understand it is unlikely that the number of folders grows to the point 
> where we hit the long command error, but even in that case the wrapper jar 
> would work just fine.
>     
>     I would like to understand more about the scenarios where we anticipate 
> that SQOOP_HOME would not be writable on Windows systems.
>     
>     Thank you again,
>     Ahmed
> 
> Venkat Ranganathan wrote:
>     Thanks Ahmed for the explanation.
>     
>     I thought we were primarily constrained by the 8K limit on the command 
> line, so if we express the large jar directories in wildcard form, the 
> classpath would fit within the limit.
>     Good point about the hbase -classpath option.  Maybe we can improve 
> HBase so that it returns the hbase classpath with the jar dirs properly added.
>     
>     As for SQOOP_HOME not being writable: for example, people may install the 
> Hadoop stack on Windows under a terminal server that is shared across 
> multiple users, or install it in a common location that is mapped by logon 
> scripts.   The directory can then become inaccessible for people running 
> sqoop jobs.   This is a scheme used by some Hadoop distributions today.
>     
>     Thanks
>
> 
> Venkat Ranganathan wrote:
>     I had written this comment before, but it got stuck in my saved reviews 
> instead of being published.  Sorry about that.   Can you check my comments, 
> and can we simplify this?

Thank you Venkatesh,

I have updated the patch to use jar dirs for the classpath locations, rather 
than the powershell script that generates a single jar encapsulating the 
classpath in its manifest. As discussed, we will need a corresponding change 
for the HBASE case, where hbase.cmd -classpath is invoked to return a list of 
jar files. For now we use HBASE_HOME and HBASE_HOME\lib in the case of Windows.
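
To illustrate the "logic to get common dirs" mentioned in the discussion, here 
is a minimal Java sketch, not code from the patch (the patch itself consists of 
.cmd scripts). It collapses an explicit list of jar paths, such as the output 
of "hbase classpath", into one dir\* entry per directory, which keeps the 
classpath string short. The class name WildcardClasspath, the method collapse, 
and the sample paths are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of the Sqoop patch.
class WildcardClasspath {

    /**
     * Replaces every *.jar entry with "<parent dir><dirSep>*" and keeps all
     * other entries (for example conf directories) unchanged. Duplicates are
     * dropped while the order of first appearance is preserved.
     */
    static String collapse(List<String> entries, char dirSep, char pathSep) {
        Set<String> out = new LinkedHashSet<>();
        for (String entry : entries) {
            int cut = entry.lastIndexOf(dirSep);
            boolean isJar = entry.toLowerCase(Locale.ROOT).endsWith(".jar");
            if (isJar && cut >= 0) {
                out.add(entry.substring(0, cut) + dirSep + "*");
            } else {
                out.add(entry);
            }
        }
        return String.join(String.valueOf(pathSep), out);
    }

    public static void main(String[] args) {
        // Made-up sample data: one conf dir plus 200 jars in one lib dir.
        List<String> entries = new ArrayList<>();
        entries.add("C:\\sqoop\\conf");
        for (int i = 0; i < 200; i++) {
            entries.add("C:\\hbase\\lib\\dep-" + i + ".jar");
        }
        String explicit = String.join(";", entries);
        String collapsed = collapse(entries, '\\', ';');
        System.out.println("explicit length:  " + explicit.length());
        System.out.println("collapsed length: " + collapsed.length());
        System.out.println(collapsed); // C:\sqoop\conf;C:\hbase\lib\*
    }
}
```

Note the trade-off raised earlier in the thread: a wildcard entry pulls in 
every jar in the directory, so this only fits when no jar has to be excluded.
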

Thanks,
Ahmed




-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10055/#review18523
-----------------------------------------------------------


On April 22, 2013, 3:26 a.m., Ahmed El Baz wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10055/
> -----------------------------------------------------------
> 
> (Updated April 22, 2013, 3:26 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Description
> -------
> 
> A patch implementing the Windows version of the Sqoop run scripts. The 
> scripts follow the same logic as their .sh counterparts.
> One difference is that we create a jar which references all classpath 
> elements in its manifest, and provide that jar as the single jar needed for 
> Sqoop. The reason is that when the number of classpath elements is large, 
> HADOOP_CLASSPATH gets very long, which causes failures on Windows since there 
> is a limit on command line length.
> As a workaround, I added a step to wrap all jars in the classpath in a single 
> jar, and then use that generated jar (this is also done in Hadoop for Windows 
> to handle similar issues).
> I did this in a utility script "BuildJar" which can be used for other 
> components as well.
> This change is specific to the Windows scripts; the Linux scripts are not 
> affected.
> 
> 
> This addresses bug SQOOP-954.
>     https://issues.apache.org/jira/browse/SQOOP-954
> 
> 
> Diffs
> -----
> 
>   bin/configure-sqoop.cmd PRE-CREATION 
>   bin/sqoop.cmd PRE-CREATION 
>   conf/sqoop-env-template.cmd PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10055/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ahmed El Baz
> 
>
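
The wrapper-jar workaround described in the quoted review request can be 
sketched in Java as follows. This is an illustration under stated assumptions, 
not the "BuildJar" powershell script from the patch: the class name WrapperJar 
and the method write are hypothetical. The sketch writes a jar that contains 
only a manifest whose Class-Path attribute lists the real classpath elements 
as space-separated URLs, so the command line only has to name one jar.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only; not the patch's BuildJar script.
class WrapperJar {

    /** Writes a manifest-only jar whose Class-Path references the entries. */
    static void write(Path jar, List<Path> entries) throws IOException {
        // Class-Path is a space-separated list of URLs. Path.toUri() yields an
        // absolute file: URL and appends "/" for existing directories, which
        // the jar specification requires for directory entries.
        // Assumption: the JVM in use accepts absolute file: URLs here.
        String classPath = entries.stream()
                .map(p -> p.toUri().toString())
                .collect(Collectors.joining(" "));

        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Without Manifest-Version the manifest is written out empty.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.CLASS_PATH, classPath);

        try (OutputStream os = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(os, manifest)) {
            // No class files are added: the manifest is the only content.
        }
    }

    public static void main(String[] args) throws IOException {
        Path libDir = Files.createTempDirectory("lib");
        Path jar = Files.createTempFile("classpath-wrapper", ".jar");
        write(jar, Collections.singletonList(libDir));
        try (JarFile jf = new JarFile(jar.toFile())) {
            System.out.println(
                jf.getManifest().getMainAttributes().getValue("Class-Path"));
        }
    }
}
```

The generated jar has to be written somewhere, which is the concern raised in 
the review about SQOOP_HOME not being writable; the sketch sidesteps it by 
using a temporary file.
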
