It looks like the home directory does not exist, but the copy went through.
Can you try logging the key fields in destStatus, including the path? The file
might be ending up in an unexpected place.
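
Something along these lines, as a rough sketch rather than a definitive fix. The helper name describe() and the sample path in main() are made up for illustration, and it takes plain values so it compiles without Hadoop on the classpath. In the client the arguments would come from destStatus.getPath().toString(), destStatus.getLen(), destStatus.getModificationTime() and destStatus.isDirectory().

```java
public class DestStatusLog {
    // Builds one log line from values read off a FileStatus.
    // Hypothetical helper, not part of the Hadoop API.
    static String describe(String path, long len, long modTime, boolean isDir) {
        return "dest path=" + path
            + " len=" + len
            + " mtime=" + modTime
            + " isDir=" + isDir;
    }

    public static void main(String[] args) {
        // Made-up sample values; real ones would come from destStatus.
        System.out.println(describe(
            "hdfs://namenode.example:8020/user/someuser/myapp/1",
            1024L, 1333993500000L, false));
    }
}
```

In the client this would be called right after fs.getFileStatus(dst), for example LOG.info(describe(...)). Logging fs.getHomeDirectory() alongside it should show which directory the relative path was resolved against.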

Kihwal



On 4/9/12 12:45 PM, "Ralph Castain" <r...@open-mpi.org> wrote:

Hi Bobby

On Apr 9, 2012, at 11:40 AM, Robert Evans wrote:

> What do you mean by relocated some supporting files to HDFS?  How do you 
> relocate them?  What API do you use?

I use the LocalResource and FileSystem classes to do the relocation, per the 
Hadoop example:

        // set local resources for the application master
        // local files or archives as needed
        // In this scenario, the jar file for the application master is part of
        // the local resources
        Map<String, LocalResource> localResources = new HashMap<String, LocalResource>();

        LOG.info("Copy openmpi tarball from local filesystem and add to local environment");
        // Copy the application master jar to the filesystem
        // Create a local resource to point to the destination jar path
        FileSystem fs;
        FileStatus destStatus;
        try {
            fs = FileSystem.get(conf);
            Path src = new Path(pathOMPItarball);
            String pathSuffix = appName + "/" + appId.getId();
            Path dst = new Path(fs.getHomeDirectory(), pathSuffix);
            try {
                fs.copyFromLocalFile(false, true, src, dst);
                try {
                    destStatus = fs.getFileStatus(dst);
                    LocalResource amJarRsrc = Records.newRecord(LocalResource.class);

                    // Set the type of resource - file or archive
                    // archives are untarred at destination
                    amJarRsrc.setType(LocalResourceType.ARCHIVE);
                    // Set visibility of the resource
                    // Setting to most private option
                    amJarRsrc.setVisibility(LocalResourceVisibility.APPLICATION);
                    // Set the resource to be copied over
                    amJarRsrc.setResource(ConverterUtils.getYarnUrlFromPath(dst));
                    // Set timestamp and length of file so that the framework
                    // can do basic sanity checks for the local resource
                    // after it has been copied over to ensure it is the same
                    // resource the client intended to use with the application
                    amJarRsrc.setTimestamp(destStatus.getModificationTime());
                    amJarRsrc.setSize(destStatus.getLen());
                    localResources.put("openmpi",  amJarRsrc);
                } catch (Throwable t) {
                    LOG.fatal("Error on file status", t);
                    System.exit(1);
                }
            } catch (Throwable t) {
                LOG.fatal("Error on copy from local file", t);
                System.exit(1);
            }
        } catch (Throwable t) {
            LOG.fatal("Error getting filesystem configuration", t);
            System.exit(1);
        }

Note that this appeared to work fine when the local resource type was "file" -
at least, I was able to make a simple program work that way. The problem I'm
having is when I move an archive, which is why I was hoping to look at the HDFS
end to see what files are present, and in what locations, so I can set the
paths accordingly.

Thanks
Ralph


>
> --Bobby Evans
>
>
> On 4/9/12 11:10 AM, "Ralph Castain" <r...@open-mpi.org> wrote:
>
> Hi folks
>
> I'm trying to develop an AM for the 0.23 branch and running into a problem 
> that I'm having difficulty debugging. My client relocates some supporting 
> files to HDFS, creates the application object for the AM, and submits it to 
> the RM.
>
> The file relocation request doesn't generate an error, so I must assume it 
> succeeded. It would be nice if there was some obvious way to verify that, but 
> I haven't discovered it. Can anyone give me a hint? I tried asking hdfs to 
> -ls, but all I get is that "." doesn't exist. I have no idea where the file 
> would be placed, if it would persist once the job fails, etc.
>
> When the job is submitted, all I get is an "Error 500", which tells me 
> nothing. Reminds me of the old days of 40 years ago when you'd get the 
> dreaded "error 11", which meant anything from a divide by zero to a memory 
> violation. Are there any debug flags I could set that might provide more info?
>
> Thanks
> Ralph
>
>

