> On March 29, 2013, 7:03 p.m., Venkat Ranganathan wrote:
> > Hi Ahmed
> >
> > Thanks for the new patch. It looks good. I still have one issue and
> > suggestion. The powershell script to generate the jar file is very good!
> > You are generating a jar file every time and the jar file is generated under
> > SQOOP_HOME. There may be installations where SQOOP_HOME is not
> > writable by the user. Also, I think the main motivation is to overcome the
> > environment strings limitation. Since JDK 1.6, Java has the ability to
> > provide a wildcard shortcut for all jars in a directory (this
> > probably should be done for the Unix classpaths also). Please see
> > http://docs.oracle.com/javase/6/docs/technotes/tools/windows/classpath.html
> >
> >
> > I am thinking whether this should be a simpler change to just add all jars
> > in SQOOP_LIB. We have to say %SQOOP_HOME%\lib\*. Of course, this
> > introduces a dependency on 1.6+ versions of the JDK, but given that 1.5 is
> > EOLed this should be OK
> >
> > Thanks
>
> Ahmed El Baz wrote:
>     Thank you a lot Venkat for the valuable comments,
>
>     I have considered the wildcard option, however, there are some
>     limitations that made it less preferable to go this route, and using the
>     referencing jar would give more flexibility:
>     1) The need to specify particular jars to include, or exclude some jars
>     and not include all jars by default in a directory by using a wildcard. For
>     example, in configure-sqoop a list of dependency jars for HBase is returned
>     by invoking "hbase classpath" which returns a list of jars. In this case
>     using a wrapper Jar releases us from worrying about the length of the jar
>     list returned, and it is not possible to use the * in this case, unless we
>     add some logic to get the common dirs.
>     2) As you can see also in configure-jar, Sqoop has dependencies on other
>     components, not just SQOOP_HOME\lib, like HBase, SQOOP_CONF, ZOOCFGDIR.
>     3) Using the wrapper jar would scale regardless of how many directories
>     we include. I understand it is hard for the number of folders to increase
>     to the limit where we see the long command error, but even in this case the
>     wrapper jar would work just fine.
>
>     I would like to understand more about scenarios where we anticipate
>     SQOOP_HOME would not be writable on Windows systems.
>
>     Thank you again,
>     Ahmed
>
> Venkat Ranganathan wrote:
>     Thanks Ahmed for the explanation.
>
>     I thought we are primarily limited by the 8K limit on the command line, so
>     if we can potentially limit the large jar file dirs to this format, then it
>     would fit within the limit.
>     Good point about the hbase -classpath option. Maybe we can have an
>     improvement in HBase to return the hbase classpath with jar dirs properly
>     added
>
>     For example, people may install Hadoop on Windows and decide that the Hadoop
>     stack will be installed under a terminal server and shared across
>     multiple users, or it may be installed in a common location and mapped
>     based on logon scripts. The directory can then become inaccessible for
>     people running sqoop jobs. This is a scheme used by some Hadoop
>     distributions today.
>
>     Thanks
>
>
> Venkat Ranganathan wrote:
>     I had this comment written before, but got caught up in the saved reviews
>     instead of publishing. Sorry about that. Can you check my comments and can
>     we simplify this

Thank you Venkatesh,

I have updated the patch to use the jar dirs for classpath locations, rather
than the powershell script to generate a single jar encapsulating the
classpath in its manifest.
As discussed, we will need to have a corresponding change for the HBASE case
where hbase.cmd -classpath is invoked to return a list of jar files. For now
we use HBASE_HOME and HBASE_HOME\lib in the case of Windows.

Thanks,
Ahmed


- Ahmed


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10055/#review18523
-----------------------------------------------------------


On April 22, 2013, 3:26 a.m., Ahmed El Baz wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10055/
> -----------------------------------------------------------
>
> (Updated April 22, 2013, 3:26 a.m.)
>
>
> Review request for Sqoop.
>
>
> Description
> -------
>
> A patch implementing the Windows version of Sqoop run scripts. The scripts
> follow the same logic as their .sh counterparts.
> One difference is to create a Jar which references all classpath elements in
> its Manifest, and provide that jar as the single jar needed for Sqoop. The
> reason here is that in some cases if the number of classpath elements is
> large, HADOOP_CLASSPATH gets very long which causes failures in Windows since
> there is a limit on command-line length.
> As a workaround, I added a step to wrap all jars in the classpath in a single
> jar, and then use that generated jar (this is also done in hadoop for Windows
> to handle similar issues)
> I did this in a utility script "BuildJar" which can be used for other
> components as well.
> This change is specific to Windows scripts, Linux scripts are not affected.
>
>
> This addresses bug SQOOP-954.
>     https://issues.apache.org/jira/browse/SQOOP-954
>
>
> Diffs
> -----
>
>   bin/configure-sqoop.cmd PRE-CREATION
>   bin/sqoop.cmd PRE-CREATION
>   conf/sqoop-env-template.cmd PRE-CREATION
>
> Diff: https://reviews.apache.org/r/10055/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Ahmed El Baz
>
>
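
For readers following the thread, the wrapper-jar idea from the description above can be sketched roughly as follows. This is only an illustration in Python, not the BuildJar script from the patch: the function names, the example paths, and the 70-byte fold width are assumptions made for the sketch. It builds a jar that contains nothing but a manifest whose Class-Path attribute lists each classpath element as a file URL, so that only the path of this one jar needs to appear on the command line.

```python
import io
import zipfile
from urllib.parse import quote

# Manifest lines may not exceed 72 bytes; folding at 70 leaves room for CRLF.
MAX_LINE_BYTES = 70


def manifest_attribute(name, value):
    """Return one 'Name: value' attribute as bytes, folded onto continuation lines."""
    # The values built below are plain ASCII, so slicing bytes never splits a character.
    data = f"{name}: {value}".encode("utf-8")
    lines = [data[:MAX_LINE_BYTES]]
    rest = data[MAX_LINE_BYTES:]
    while rest:
        # A continuation line starts with one space, which counts toward the limit.
        lines.append(b" " + rest[:MAX_LINE_BYTES - 1])
        rest = rest[MAX_LINE_BYTES - 1:]
    return b"\r\n".join(lines) + b"\r\n"


def to_class_path_url(entry):
    """Turn an absolute drive-letter path into a file URL (no validation is done).

    A trailing backslash is kept as a trailing '/', which marks a directory.
    """
    return "file:///" + quote(entry.replace("\\", "/"), safe="/:")


def build_wrapper_jar(entries):
    """Build an in-memory jar whose only content is a manifest referencing entries."""
    class_path = " ".join(to_class_path_url(e) for e in entries)
    manifest = (
        manifest_attribute("Manifest-Version", "1.0")
        + manifest_attribute("Class-Path", class_path)
        + b"\r\n"  # a blank line ends the main section
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", manifest)
    return buffer.getvalue()


if __name__ == "__main__":
    jar_bytes = build_wrapper_jar([
        "C:\\sqoop\\lib\\example.jar",  # hypothetical jar file
        "C:\\sqoop\\conf\\",            # hypothetical directory (trailing backslash)
    ])
    print(len(jar_bytes), "bytes")
```

The wildcard form discussed in the thread (%SQOOP_HOME%\lib\*) avoids the generated file altogether, which is why it sidesteps the question of whether SQOOP_HOME is writable. The trade-off is that it pulls in every jar in the directory, whereas a manifest like the one above can list individual jars.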
