Hi Victor,

Thanks for the detailed examination. I will make sure to remove the URI prefix in my code for now.
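
A minimal sketch of stripping the prefix, assuming the jars live on the job's default filesystem so that the bare path still resolves. The class and method names here are hypothetical; only java.net.URI is used, and the result would then be wrapped in a Path and passed to addFileToClassPath:

```java
import java.net.URI;

// Hypothetical helper, not part of Hadoop.
class ClasspathPathOnly {
    // Drops the scheme and authority ("hdfs://example.com:9000") and keeps
    // only the path component, so the classpath entry contains no colons.
    static String pathOnly(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // Example URIs taken from the discussion in this thread.
        System.out.println(pathOnly("hdfs://example.com:9000/a.jar"));
        System.out.println(pathOnly("hdfs://example.com:9000/b.jar"));
    }
}
```

This only helps if the path means the same thing without the host and port, which is an assumption about the cluster configuration.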

Regards,
Eric

On 1/20/10 5:36 AM, "Victor Hsieh" <victorhs...@gmail.com> wrote:

> BTW, this issue has been reported:
> http://issues.apache.org/jira/browse/MAPREDUCE-752
>
> On Wed, Jan 20, 2010 at 7:59 PM, Victor Hsieh <victorhs...@gmail.com> wrote:
>> Hi Eric,
>>
>> (I am new to this mailing list, so I don't have the original email to
>> reply to directly.)
>>
>> I had exactly the same problem today and finally found the cause.
>>
>> In our case, we add some URIs to DistributedCache like you do.
>> Unfortunately, the URI itself was the problem. When we added several
>> jars by calling addFileToClassPath, the entries were joined by
>> colons, which is the default path separator in the Java classpath,
>> and that is what causes the failure.
>>
>> For example, if you add hdfs://example.com:9000/a.jar and
>> hdfs://example.com:9000/b.jar to the classpath, your
>> mapred.job.classpath.files will look like this (note the colons!):
>>
>> hdfs://example.com:9000/a.jar:hdfs://example.com:9000/b.jar
>>
>> Then, when a worker tries to add them to the classpath (search for
>> getFileClassPaths in org.apache.hadoop.mapred.TaskRunner.java), it
>> actually adds "hdfs", "//example.com", "9000/a.jar", and so on, which
>> is not what we want.
>>
>> Our solution is to remove the "hdfs://example.com:9000" part when
>> calling addFileToClassPath. Hope it helps!
>>
>> Victor
>>
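
The splitting problem described in the quoted message can be reproduced in isolation. The sketch below is not the TaskRunner code; it only assumes that the joined property value is split on ":" (the path separator on Unix-like systems), and the class and method names are hypothetical:

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Hadoop.
class ClasspathSplitDemo {
    // Splits a joined classpath value on ":", as assumed above.
    static String[] splitOnColon(String joined) {
        return joined.split(":");
    }

    public static void main(String[] args) {
        // Two full HDFS URIs joined by a colon, as in the example above.
        String joined =
            "hdfs://example.com:9000/a.jar:hdfs://example.com:9000/b.jar";
        // Each URI contributes two colons of its own, so the split yields
        // fragments of the URIs rather than the two jar locations.
        System.out.println(Arrays.toString(splitOnColon(joined)));
    }
}
```

Under that assumption the two entries come back as six fragments, none of which is a usable jar location.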