[ https://issues.apache.org/jira/browse/HADOOP-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stuart White updated HADOOP-4864:
---------------------------------

    Attachment: patch.txt

Patch that changes Hadoop's internal delimiter for the list of jars specified via -libjars from System.getProperty("path.separator") to a comma. This is because path.separator is platform-specific and therefore does not serve as an appropriate delimiter across platforms.

> -libjars with multiple jars broken when client and cluster reside on different OSs
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-4864
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4864
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: filecache
>    Affects Versions: 0.19.0
>         Environment: When your Hadoop job spans OSs.
>            Reporter: Stuart White
>            Priority: Minor
>         Attachments: patch.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When submitting a Hadoop job from Windows (Cygwin) to a Linux Hadoop cluster (or vice versa), and when you specify multiple additional jar files via the -libjars flag, Hadoop throws a ClassNotFoundException for any classes located in the additional jars specified via the -libjars flag.
> This is caused by the fact that Hadoop uses System.getProperty("path.separator") as the delimiter in the list of jar files passed via -libjars. If your job spans platforms, System.getProperty("path.separator") returns a different delimiter on each platform (";" on Windows, ":" on Linux).
> My suggested solution is to use a comma as the delimiter, rather than the path.separator.
> I realize comma is, perhaps, a poor choice for a delimiter because it is valid in filenames on both Windows and Linux, but the -libjars flag already uses it as the delimiter when listing the additional required jars. So, I figured if it's already being used as a delimiter, it's reasonable to use it internally as well.
> I have a patch that applies my suggested change, but I didn't see anywhere to upload it. So, I'll go ahead and create this JIRA and hope that I will have the opportunity to add a patch later.
> Now, with this change, I can submit Hadoop jobs (requiring multiple supporting jars) from my Windows laptop (via Cygwin) to my 10-node Linux Hadoop cluster.
> Any chance this change could be applied to the Hadoop codebase?
> To recreate the problem I'm seeing, do the following:
> - Set up a Hadoop cluster on Linux.
> - Perform the remaining steps on Cygwin, with a Hadoop installation configured to point to the Linux cluster (set fs.default.name and mapred.job.tracker).
> - Extract the tarball, then change into the created directory:
>     tar xvfz Example.tar.gz
>     cd Example
> - Edit build.properties, set your hadoop.home appropriately, then build the example:
>     ant
> - Load the file Example.in into your dfs:
>     hadoop dfs -copyFromLocal Example.in Example.in
> - Execute the provided shell script, passing it testID 1:
>     ./Example.sh 1
>   This test does not use -libjars, and it completes successfully.
> - Next, execute testID 2:
>     ./Example.sh 2
>   This test uses -libjars with 1 jarfile (Foo.jar), and it completes successfully.
> - Next, execute testID 3:
>     ./Example.sh 3
>   This test uses -libjars with 1 jarfile (Bar.jar), and it completes successfully.
> - Next, execute testID 4:
>     ./Example.sh 4
>   This test uses -libjars with 2 jarfiles (Foo.jar and Bar.jar), and it fails with a ClassNotFoundException.
> This behavior only occurs when calling from Cygwin to Linux or vice versa. If both the cluster and the client reside on either Linux or Cygwin, the problem does not occur.
> I'm continuing to dig to see what I can figure out, but since I'm very new to Hadoop (I started using it this week), I thought I'd go ahead and throw this out there to see if anyone can help.
> Thanks!
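To make the failure mode concrete, here is a minimal, self-contained Java sketch of the delimiter mismatch. The class and helper names below are invented for illustration and are not Hadoop APIs; the actual fix is in the attached patch.txt.

    import java.util.Arrays;

    public class DelimiterSketch {

        // Roughly what the client side does: serialize the -libjars
        // list into a single string using some delimiter.
        static String join(String[] jars, String delimiter) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < jars.length; i++) {
                if (i > 0) sb.append(delimiter);
                sb.append(jars[i]);
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            String[] jars = { "Foo.jar", "Bar.jar" };

            // Broken cross-platform case: a Windows client joins with its
            // path.separator (";"), but a Linux cluster splits with its own
            // path.separator (":"), so the list is never split apart and
            // "Foo.jar;Bar.jar" is treated as one (nonexistent) jar.
            String fromWindows = join(jars, ";");
            System.out.println(Arrays.toString(fromWindows.split(":")));
            // prints: [Foo.jar;Bar.jar]

            // With a fixed, platform-independent comma (the same delimiter
            // the -libjars flag itself uses), both sides agree:
            String portable = join(jars, ",");
            System.out.println(Arrays.toString(portable.split(",")));
            // prints: [Foo.jar, Bar.jar]
        }
    }

A comma can still collide with a filename that contains a comma, but as the report notes, the -libjars syntax already imposes that restriction on the user-facing side, so using it internally adds no new limitation.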
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.