[
https://issues.apache.org/jira/browse/HADOOP-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729866#action_12729866
]
Vladimir Klimontovich commented on HADOOP-6140:
-----------------------------------------------
Philip,
I tried to create a JUnit test for (2), but I ran into some difficulties. It seems
that I need to start a whole cluster (jobtracker, tasktracker) from the JUnit test
to test running a job with an additional classpath in the distributed cache. I'm
not familiar with the Hadoop code, but maybe you or someone else could point me to
a test that I could use as an example.
There is another option: I can extract the buggy piece of code into a separate
method and test it separately. But I'm not sure that is reasonable to do in the
0.18 branch.
Also, I noticed that the comma bug is already fixed in trunk.
> DistributedCache.addArchiveToClassPath doesn't work in 0.18.x branch
> --------------------------------------------------------------------
>
> Key: HADOOP-6140
> URL: https://issues.apache.org/jira/browse/HADOOP-6140
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 0.18.3
> Reporter: Vladimir Klimontovich
> Attachments: HADOOP-6140.patch
>
>
> addArchiveToClassPath is a method of the DistributedCache class. It should be
> called before running a task. It accepts the path to a jar file on a DFS. The
> method should then put this jar file in the distributed cache and add the
> file to the classpath of each map/reduce process on the job tracker.
> This method does not work.
> Bug 1:
> addArchiveToClassPath adds the DFS path of the archive to the
> mapred.job.classpath.archives property. It uses
> System.getProperty("path.separator") as the delimiter between multiple paths.
> getFileClassPaths, which is called from TaskRunner, splits
> mapred.job.classpath.archives using System.getProperty("path.separator").
> On Unix systems System.getProperty("path.separator") equals ":". DFS path
> URLs have the form hdfs://host:port/path, so the result of the split will be
> [hdfs, //host, port/path].
> Suggested solution: use "," instead of
> Bug 2:
> In TaskRunner there is an algorithm that looks for the correspondence between
> DFS paths and local paths in the distributed cache.
> It compares
> if (archives[i].getPath().equals(archiveClasspaths[j].toString())){
> instead of
> if (archives[i].toString().equals(archiveClasspaths[j].toString()))
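
For illustration, here is a minimal sketch of both bugs in plain Java, with no
Hadoop classes involved. The class name ClasspathBugDemo, its helper methods,
and the sample URL hdfs://host:9000/lib/a.jar are hypothetical; java.net.URI and
String stand in for the cache archive and classpath entry types, and the
classpath entry is assumed to be stored as a full DFS URL, as described in Bug 1.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.regex.Pattern;

public class ClasspathBugDemo {

    // Bug 1: split a stored property value on a literal delimiter.
    static String[] split(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter));
    }

    // Bug 2 (as reported): compares only the path component of the URI.
    static boolean matchesByPath(URI archive, String classpathEntry) {
        return archive.getPath().equals(classpathEntry);
    }

    // Bug 2 (suggested): compares the full string form of the URI.
    static boolean matchesByString(URI archive, String classpathEntry) {
        return archive.toString().equals(classpathEntry);
    }

    public static void main(String[] args) {
        String first = "hdfs://host:9000/lib/a.jar";   // hypothetical DFS URL
        String second = "hdfs://host:9000/lib/b.jar";  // hypothetical DFS URL

        // Splitting on ":" tears a single URL into three pieces.
        System.out.println(Arrays.toString(split(first, ":")));

        // Splitting on "," keeps each URL intact.
        System.out.println(Arrays.toString(split(first + "," + second, ",")));

        URI archive = URI.create(first);
        System.out.println(matchesByPath(archive, first));    // path is "/lib/a.jar"
        System.out.println(matchesByString(archive, first));
    }
}
```

With ":" as the delimiter, the single URL is split into hdfs, //host and
9000/lib/a.jar, which matches the [hdfs, //host, port/path] pattern above. The
path-only comparison fails because getPath() drops the scheme and authority,
while the full-string comparison succeeds.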