Angus Davis created YARN-6725:
---------------------------------

             Summary: NodeManager cannot localize files when URL contains 
opaque authority
                 Key: YARN-6725
                 URL: https://issues.apache.org/jira/browse/YARN-6725
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 2.8.1
            Reporter: Angus Davis
            Priority: Minor


When presented with a URI or URL that has a valid structure, but whose 
authority cannot be interpreted as a valid hostname, Java's URI#getHost 
method returns null. In the YARN API records, the URL message is decomposed 
into hostname, username, port, etc., but it provides no means to transport 
an opaque authority. This becomes problematic when using object stores or 
other file systems that use something other than a hostname as an 
authority.
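
The getHost behavior described above can be reproduced with java.net.URI 
alone. The following is a minimal sketch, not YARN code; the class name 
OpaqueAuthorityDemo and the bucket name my_bucket are made up for 
illustration.

```java
import java.net.URI;

final class OpaqueAuthorityDemo {
    // Returns what URI#getHost reports for the given URI string.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // Returns the raw authority component, which is still available
    // when the authority does not parse as a hostname.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        String plain = "gs://bucket/path/to/object";
        // Hypothetical bucket name containing an underscore.
        String underscored = "gs://my_bucket/path/to/object";

        System.out.println("host(plain)            = " + hostOf(plain));
        System.out.println("host(underscored)      = " + hostOf(underscored));
        System.out.println("authority(underscored) = " + authorityOf(underscored));
    }
}
```

For the underscored bucket, getHost is null while getAuthority still 
carries the bucket name, so any record built only from the host loses it.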

In our particular case, Google Cloud Storage allows bucket names to contain 
underscores, which Java's URI parser does not accept in a hostname. As a 
result, the NodeManagers attempt to localize resources of the form 
'gs:///path/to/object' when they should be attempting to localize 
'gs://bucket/path/to/object'. Components that are written in terms of 
o.a.h.fs.Path instances seem to handle this case properly, as do most Path 
utility methods (toUri, etc.).

It seems that by transporting the authority along with the host, user info, 
etc., a YARN API URL can properly represent o.a.h.fs.Path instances of this 
particular form.
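
As a rough illustration of that idea, the sketch below decomposes a URI 
into fields, then rebuilds it once from the host alone and once from the 
authority. UrlRecord and AuthorityRoundTrip are hypothetical stand-ins 
written for this example; they are not the actual YARN URL record or its 
conversion code, and the real fix may look different.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class AuthorityRoundTrip {
    // Hypothetical stand-in for a decomposed URL record; not the YARN class.
    static final class UrlRecord {
        final String scheme, userInfo, host, authority, file;
        final int port;

        UrlRecord(URI uri) {
            this.scheme = uri.getScheme();
            this.userInfo = uri.getUserInfo();
            this.host = uri.getHost();           // null for an opaque authority
            this.port = uri.getPort();
            this.authority = uri.getAuthority(); // the extra field proposed here
            this.file = uri.getPath();
        }
    }

    // Rebuilds from host, user info and port only: the authority is lost
    // whenever the host is null.
    static URI rebuildFromHost(UrlRecord r) {
        try {
            return new URI(r.scheme, r.userInfo, r.host, r.port, r.file, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Rebuilds from the transported authority, which keeps the bucket name.
    static URI rebuildWithAuthority(UrlRecord r) {
        try {
            return new URI(r.scheme, r.authority, r.file, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical bucket name containing an underscore.
        UrlRecord r = new UrlRecord(URI.create("gs://my_bucket/path/to/object"));
        System.out.println("from host:      " + rebuildFromHost(r));
        System.out.println("with authority: " + rebuildWithAuthority(r));
    }
}
```

Only the second rebuild returns the original URI; the first has no 
authority at all, which matches the symptom described above.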




