addArchiveToClassPath doesn't work in 0.18.x branch
---------------------------------------------------

                 Key: HADOOP-6140
                 URL: https://issues.apache.org/jira/browse/HADOOP-6140
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 0.18.3
            Reporter: Vladimir Klimontovich
            Priority: Minor


addArchiveToClassPath is a method of the DistributedCache class. It should be 
called before running a task. It accepts the path to a jar file on DFS. The
method should then put this jar file in the distributed cache and add it to
the classpath of each map/reduce process on the job tracker.

This method does not work.

Bug 1:

addArchiveToClassPath adds the DFS path of the archive to the 
mapred.job.classpath.archives property. It uses 
System.getProperty("path.separator") as the delimiter between multiple paths.

getFileClassPaths, which is called from TaskRunner, splits 
mapred.job.classpath.archives using System.getProperty("path.separator").

On Unix systems System.getProperty("path.separator") is ":". DFS path URLs 
have the form hdfs://host:port/path, so the result of the split is
[hdfs, //host, port/path].

Suggested solution: use "," as the delimiter instead of the path separator.
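
The effect described in Bug 1 can be reproduced without Hadoop. The following 
is a minimal sketch, not the DistributedCache code itself: the class name 
ClasspathDelimiterDemo, the helpers join and split, and the two jar paths are 
hypothetical, and ":" is hard-coded to stand in for the Unix value of 
path.separator.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class ClasspathDelimiterDemo {

    // Joins archive paths into one property value (hypothetical helper,
    // mirroring what the issue says addArchiveToClassPath does).
    static String join(String[] paths, String delimiter) {
        return String.join(delimiter, paths);
    }

    // Splits the property value back into paths. The delimiter is quoted
    // so that it is matched literally rather than as a regex.
    static String[] split(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Hypothetical archive paths in the hdfs://host:port/path form.
        String[] paths = {"hdfs://host:port/a.jar", "hdfs://host:port/b.jar"};

        // ":" stands in for System.getProperty("path.separator") on Unix.
        String[] broken = split(join(paths, ":"), ":");
        System.out.println(Arrays.toString(broken));
        // prints [hdfs, //host, port/a.jar, hdfs, //host, port/b.jar]

        // "," does not occur inside these URLs, so the round trip is lossless.
        String[] intact = split(join(paths, ","), ",");
        System.out.println(Arrays.toString(intact));
        // prints [hdfs://host:port/a.jar, hdfs://host:port/b.jar]
    }
}
```

With ":" every URL is cut at the scheme and at the port, so none of the 
resulting fragments is a usable path. With "," both paths come back unchanged.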

Bug 2:

In TaskRunner there is an algorithm that looks for the correspondence between 
DFS paths and local paths in the distributed cache. 
It compares

    if (archives[i].getPath().equals(archiveClasspaths[j].toString())) {

instead of

    if (archives[i].toString().equals(archiveClasspaths[j].toString())) {
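
The difference between the two comparisons in Bug 2 can be illustrated with 
java.net.URI alone. This is a sketch under assumptions: it assumes 
archives[i] is a java.net.URI and that the classpath entry's string form is 
the full DFS URL, as described in Bug 1. The class name ArchiveMatchDemo, both 
helper methods, and the namenode:9000 address are hypothetical.

```java
import java.net.URI;

public class ArchiveMatchDemo {

    // Comparison as quoted from TaskRunner: only the path component of the
    // archive URI is compared with the classpath entry.
    static boolean matchesByPath(URI archive, String classpathEntry) {
        return archive.getPath().equals(classpathEntry);
    }

    // Comparison as proposed in the issue: the full URI string is compared.
    static boolean matchesByFullUri(URI archive, String classpathEntry) {
        return archive.toString().equals(classpathEntry);
    }

    public static void main(String[] args) {
        // Hypothetical archive location and the matching classpath entry.
        URI archive = URI.create("hdfs://namenode:9000/libs/a.jar");
        String classpathEntry = "hdfs://namenode:9000/libs/a.jar";

        System.out.println(archive.getPath());
        // prints /libs/a.jar

        System.out.println(matchesByPath(archive, classpathEntry));
        // prints false

        System.out.println(matchesByFullUri(archive, classpathEntry));
        // prints true
    }
}
```

getPath() drops the scheme, host and port, so it can never equal an entry that 
still carries them. Comparing the full strings matches as intended.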

