-libjars with multiple jars broken when client and cluster reside on different OSs
-----------------------------------------------------------------------------------

                 Key: HADOOP-4864
                 URL: https://issues.apache.org/jira/browse/HADOOP-4864
             Project: Hadoop Core
          Issue Type: Bug
          Components: filecache
    Affects Versions: 0.19.0
         Environment: Client and cluster on different OSs (e.g., Windows/Cygwin client, Linux cluster)
            Reporter: Stuart White
            Priority: Minor


When submitting a Hadoop job from Windows (Cygwin) to a Linux Hadoop cluster 
(or vice versa), and when you specify multiple additional jar files via the 
-libjars flag, Hadoop throws a ClassNotFoundException for any classes located 
in those additional jars.

This is caused by the fact that Hadoop uses 
System.getProperty("path.separator") as the delimiter in the internal list of 
jar files passed via -libjars.

path.separator is ";" on Windows and ":" on Linux, so if your job spans 
platforms, the client joins the jar list with one delimiter and the cluster 
splits it with the other.
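
To make the mismatch concrete, here is a minimal standalone Java sketch 
(illustration only, not the actual filecache code) of the join/split 
disagreement:

 import java.util.Arrays;

 public class SeparatorDemo {
     public static void main(String[] args) {
         String clientSep = ";";   // path.separator on Windows/Cygwin
         String clusterSep = ":";  // path.separator on Linux

         // Client side: the jar list is joined with the client's delimiter.
         String joined = "Foo.jar" + clientSep + "Bar.jar";

         // Cluster side: splitting with the cluster's delimiter yields the
         // single bogus entry "Foo.jar;Bar.jar", so classes in either jar
         // can never be loaded.
         System.out.println(Arrays.toString(joined.split(clusterSep)));
     }
 }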

My suggested solution is to use a comma as the delimiter rather than 
path.separator.

I realize a comma is perhaps a poor choice of delimiter, because commas are 
valid in filenames on both Windows and Linux, but the -libjars flag already 
uses a comma as its delimiter when listing the additional required jars.  So, 
if it's already being used as a delimiter at the interface, it seems 
reasonable to use it internally as well.
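
Conceptually the change is tiny: join and split the internal jar list on a 
fixed "," instead of the platform-dependent path.separator. A sketch of the 
intended behavior (identifiers here are illustrative, not the actual 
filecache names):

 public class CommaDelimDemo {
     public static void main(String[] args) {
         // Client side: always join with ",".
         String joined = "Foo.jar" + "," + "Bar.jar";

         // Cluster side: always split on ",", regardless of OS.
         for (String jar : joined.split(",")) {
             System.out.println(jar);   // Foo.jar, then Bar.jar
         }
     }
 }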

I have a patch that applies my suggested change, but I don't see anywhere to 
upload it.  So, I'll go ahead and create this JIRA and hope that I will have 
the opportunity to attach the patch later.

Now, with this change, I can submit Hadoop jobs (requiring multiple
supporting jars) from my Windows laptop (via Cygwin) to my 10-node
Linux Hadoop cluster.

Any chance this change could be applied to the Hadoop codebase?

To recreate the problem I'm seeing, do the following:

- Set up a Hadoop cluster on Linux.

- Perform the remaining steps on Cygwin, with a Hadoop installation
configured to point to the Linux cluster (set fs.default.name and
mapred.job.tracker), for example as shown below.
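
 The two client-side settings, with placeholder hostname and port values
 (substitute your cluster's actual addresses):

  fs.default.name    = hdfs://namenode-host:9000
  mapred.job.tracker = jobtracker-host:9001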

- Extract the tarball, then change into the created directory.
 tar xvfz Example.tar.gz
 cd Example

- Edit build.properties, set your hadoop.home appropriately, then
build the example.
 ant

- Load the file Example.in into your DFS.
 hadoop dfs -copyFromLocal Example.in Example.in

- Execute the provided shell script, passing it testID 1.
 ./Example.sh 1
 This test does not use -libjars, and it completes successfully.

- Next, execute testID 2.
 ./Example.sh 2
 This test uses -libjars with 1 jarfile (Foo.jar), and it completes
successfully.

- Next, execute testID 3.
 ./Example.sh 3
 This test uses -libjars with 1 jarfile (Bar.jar), and it completes
successfully.

- Next, execute testID 4.
 ./Example.sh 4
 This test uses -libjars with 2 jarfiles (Foo.jar and Bar.jar), and
it fails with a ClassNotFoundException.
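
 For context, each testID expands to a hadoop jar invocation; testID 4
 amounts to something of this shape (the driver class and output path are
 placeholders, not the script's exact contents):
  hadoop jar Example.jar ExampleDriver -libjars Foo.jar,Bar.jar Example.in Example.out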

This behavior only occurs when submitting from Cygwin to Linux or vice
versa.  If both the cluster and the client reside on either Linux or
Cygwin, the problem does not occur.

I'm continuing to dig to see what I can figure out, but since I'm very
new to Hadoop (started using it this week), I thought I'd go ahead and
throw this out there to see if anyone can help.

Thanks!
