Ethan Setnik created SAMZA-171:
----------------------------------

             Summary: Http Yarn package path failing
                 Key: SAMZA-171
                 URL: https://issues.apache.org/jira/browse/SAMZA-171
             Project: Samza
          Issue Type: Bug
            Reporter: Ethan Setnik


When an HTTP package path is specified for a YARN job, the job fails with the
following error:

14/03/05 16:28:40 WARN security.UserGroupInformation: PriviledgedActionException as:samza (auth:SIMPLE) cause:java.io.IOException: No FileSystem for scheme: http
14/03/05 16:28:40 INFO localizer.ResourceLocalizationService: DEBUG: FAILED { http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz, 0, ARCHIVE, null }, No FileSystem for scheme: http
14/03/05 16:28:40 INFO localizer.LocalizedResource: Resource http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz transitioned from DOWNLOADING to FAILED
14/03/05 16:28:40 INFO container.Container: Container container_1394035672475_0003_02_000001 transitioned from LOCALIZING to LOCALIZATION_FAILED
14/03/05 16:28:40 INFO localizer.LocalResourcesTrackerImpl: Container container_1394035672475_0003_02_000001 sent RELEASE event on a resource request { http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz, 0, ARCHIVE, null } not present in cache.

The job is configured with:

yarn.package.path=http://s3.amazonaws.com/samza_packages/wikipedia-job-package.tar.gz

It looks like some work has already been done to support this feature by 
configuring "fs.http.impl".  I also noticed that this configuration was 
updated in SAMZA-63.

hConfig.set("fs.http.impl", classOf[HttpFileSystem].getName)

However, my understanding is that the job package itself contains the 
HttpFileSystem class needed to load http packages, and YARN does not support 
this configuration out of the box. It seems the NodeManager would need that 
class before it has downloaded the package that provides it, so I'm at a loss 
as to how to load a remote package over http.
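
To illustrate what I think is happening, here is a toy sketch of the 
"fs.<scheme>.impl" lookup. It does not use Hadoop at all: ToyConf, 
SchemeLookup and Demo are hypothetical names I made up, and the only things 
taken from the real system are the key convention and the error text from the 
log above. The assumption is that the client and the NodeManager each resolve 
the scheme against their own configuration.

```scala
import java.io.IOException

// Toy stand-in for a Hadoop-style configuration: plain key/value strings.
final class ToyConf {
  private val props = scala.collection.mutable.Map.empty[String, String]
  def set(key: String, value: String): Unit = props(key) = value
  def get(key: String): Option[String] = props.get(key)
}

object SchemeLookup {
  // Return the implementation class name registered under "fs.<scheme>.impl",
  // or fail with the same message that appears in the NodeManager log.
  def implFor(scheme: String, conf: ToyConf): String =
    conf.get(s"fs.$scheme.impl").getOrElse(
      throw new IOException(s"No FileSystem for scheme: $scheme"))
}

object Demo extends App {
  // Configuration built on the job client side, where the setting is applied.
  val clientConf = new ToyConf
  clientConf.set("fs.http.impl", "HttpFileSystem")

  // Assumed: the NodeManager is a separate process with its own configuration
  // that never sees the client-side setting.
  val nodeManagerConf = new ToyConf

  println(SchemeLookup.implFor("http", clientConf))
  try SchemeLookup.implFor("http", nodeManagerConf)
  catch { case e: IOException => println(e.getMessage) }
}
```

If that model is right, setting the key on the client side alone cannot help 
the localizer; both the setting and the class would have to be visible to the 
NodeManager.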




