[ 
https://issues.apache.org/jira/browse/OOZIE-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883821#comment-13883821
 ] 

Satish Mittal commented on OOZIE-1381:
--------------------------------------

Hi Ryota, I am seeing this issue on a 0.20.2-CDH3u5 hadoop cluster with 
oozie-4.0.0. The implementation of DistributedCache.addFileToClassPath() 
differs between that hadoop version and the 1.x releases.

In 0.20.2-cdh3u5:
{code}
public static void addFileToClassPath
           (Path file, Configuration conf, FileSystem fs)
        throws IOException {
    String classpath = conf.get("mapred.job.classpath.files");
    conf.set("mapred.job.classpath.files", classpath == null ? file.toString()
             : classpath
                 + System.getProperty("path.separator") + file.toString());
    URI uri = fs.makeQualified(file).toUri();
    addCacheFile(uri, conf);
  }
{code}

Whereas, in hadoop 1.0/1.1.1/1.2.1, the logic is:
{code}
public static void addFileToClassPath
           (Path file, Configuration conf, FileSystem fs)
        throws IOException {
    String filepath = file.toUri().getPath();
    String classpath = conf.get("mapred.job.classpath.files");
    conf.set("mapred.job.classpath.files", classpath == null
        ? filepath
        : classpath + System.getProperty("path.separator") + filepath);
    URI uri = fs.makeQualified(file).toUri();
    addCacheFile(uri, conf);
  }
{code}

As you can see, the first line calls file.toUri().getPath(), which reduces 
any absolute URI to just its path component before it is appended to 
mapred.job.classpath.files. That is what makes oozie work in your case.
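
To illustrate why the path-only form behaves differently, here is a minimal 
sketch in plain Java with no Hadoop dependency. It assumes a Unix-style 
path.separator of ':' and assumes the classpath property is later split on 
that same separator. The class name, the host nn2.example.com and the entry 
/lib/a.jar are hypothetical and only serve the illustration.

```java
import java.net.URI;

public class ClasspathSeparatorDemo {
    // Assumption: ':' stands in for System.getProperty("path.separator") on Unix.
    static final String SEP = ":";

    // Same append shape as in both addFileToClassPath() versions quoted above.
    static String append(String classpath, String entry) {
        return classpath == null ? entry : classpath + SEP + entry;
    }

    // 0.20.2-cdh3u5 style: the full string form of the file is stored.
    static String fullEntry(URI file) {
        return file.toString();
    }

    // hadoop 1.x style: only the path component is stored.
    static String pathEntry(URI file) {
        return file.getPath();
    }

    // Assumed reader side: split the property value on the separator.
    static String[] split(String classpath) {
        return classpath.split(SEP);
    }

    public static void main(String[] args) {
        URI file = URI.create("hdfs://nn2.example.com:8020/target_file");

        String full = append("/lib/a.jar", fullEntry(file));
        String path = append("/lib/a.jar", pathEntry(file));

        // The colons inside the URI are mistaken for separators.
        System.out.println(full + " -> " + split(full).length + " pieces");
        // The path-only entry contains no colon, so it survives the split.
        System.out.println(path + " -> " + split(path).length + " pieces");
    }
}
```

With a full hdfs:// URI, the scheme and port colons are indistinguishable from 
the separator, so the entry cannot be recovered intact. The path-only entry 
avoids that, at the cost of dropping the name node address.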

As per svn log, the above change comes from:
{quote}
------------------------------------------------------------------------
r1077790 | omalley | 2011-03-04 10:25:44 +0530 (Fri, 04 Mar 2011) | 7 lines

commit da37613ea207129453cb7c2ba4e761ed50b020d7
Author: Krishna Ramachandran <[email protected]>
Date:   Fri Feb 11 05:41:19 2011 -0800

    Fix 4274823 "Distributed Cache is not adding files to class paths"
    Contributed by Chris Douglas
{quote}

This issue was later resolved in 
https://issues.apache.org/jira/browse/HADOOP-4864, where the fix is basically 
to use ',' instead of "path.separator" as the classpath separator. As per the 
JIRA, this fix is available in 0.21.0.
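
As a rough sketch of why a ',' separator helps, the plain-Java snippet below 
joins and splits entries on a comma. It is not the HADOOP-4864 patch itself. 
The class name, the host nn2.example.com and the entry /lib/a.jar are 
hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class CommaSeparatedClasspath {
    // Append an entry using ',' rather than the platform path separator.
    static String append(String classpath, String entry) {
        return classpath == null ? entry : classpath + "," + entry;
    }

    // Split the stored value back into its individual entries.
    static List<String> entries(String classpath) {
        return Arrays.asList(classpath.split(","));
    }

    public static void main(String[] args) {
        String cp = append(null, "/lib/a.jar");
        cp = append(cp, "hdfs://nn2.example.com:8020/target_file");
        // The full URI comes back as one entry, colons included.
        System.out.println(entries(cp));
    }
}
```

Because a comma does not occur in the scheme or port of a URI, a fully 
qualified entry that points at a different name node can be stored and read 
back unchanged.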

> Oozie does not support access to the distributed cache file under different 
> name node 
> --------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1381
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1381
>             Project: Oozie
>          Issue Type: Bug
>    Affects Versions: trunk
>            Reporter: Ryota Egashira
>            Assignee: Ryota Egashira
>             Fix For: 4.0.0
>
>         Attachments: OOZIE-1381-v2.patch, OOZIE-1381-v5.patch
>
>
> suppose app path on name node NN1, and user wants to dist-cache file located 
> on different name node, NN2. when user specify something like 
> "<file>hdfs://nn2_address:8020/target_file</file>", it doesn't work due to 
> current JavaActionExecutor.addToCache logic, which extracts file path and 
> prepends NN1.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
