[ 
https://issues.apache.org/jira/browse/YARN-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev reassigned YARN-2426:
-----------------------------------

    Assignee: Varun Vasudev

> NodeManager is not able to use WebHDFS token properly to talk to WebHDFS while 
> localizing 
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-2426
>                 URL: https://issues.apache.org/jira/browse/YARN-2426
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager, resourcemanager, webapp
>    Affects Versions: 2.6.0
>         Environment: Hadoop Kerberos (secure) cluster with 
> LinuxContainerExecutor enabled.
> SPNEGO is on for the new YARN RM web services for application submission.
> kinit is run with -C (to specify the cache path), and
> KRB5CCNAME is then exported as <path provided with -C option>.
> There is no Kerberos ticket in the default KRB5 cache path, which is /tmp.
>            Reporter: Karam Singh
>            Assignee: Varun Vasudev
>
> Encountered this issue while using the new YARN RM WS (web services) for 
> application submission, on a single-node cluster, while submitting a 
> Distributed Shell application.
> For this we need to pass a custom script and the AppMaster jar, along with a 
> webhdfs token, to the NodeManager for localization.
> The Distributed Shell application was failing because the node was failing 
> to localize the AppMaster jar.
> Following is the NM log while localizing AppMaster jar:
> {code}
> 2014-08-18 01:53:52,434 INFO  authorize.ServiceAuthorizationManager 
> (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful 
> for testing (auth:TOKEN) for protocol=interface 
> org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> 2014-08-18 01:53:52,757 INFO  localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:update(1011)) - DEBUG: FAILED { 
> webhdfs://<NAMENODEHOST>:<NAMENODEHTTPPORT>/user/<JARPATH>, 1408352019488, 
> FILE, null }, Authentication required
> 2014-08-18 01:53:52,758 INFO  localizer.LocalizedResource 
> (LocalizedResource.java:handle(203)) - Resource 
> webhdfs://<NAMENODEHOST>:<NAMENODEHTTPPORT>/user/<JARPATH>(-><NM_LOCAL_DIR>/usercache/<APP_USER>/appcache/application_1408351986532_0001/filecache/10/DshellAppMaster.jar)
>  transitioned from DOWNLOADING to FAILED
> 2014-08-18 01:53:52,758 INFO  container.Container 
> (ContainerImpl.java:handle(999)) - Container 
> container_1408351986532_0001_01_000001 transitioned from LOCALIZING to 
> LOCALIZATION_FAILED
> {code}  
> This is similar to what we get when we try to access webhdfs in a secure 
> (Kerberos) cluster without doing kinit.
> Whereas if we do curl -i -k -s 
> 'http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>?op=listStatus&delegation=<same
>  webhdfs token used in app submission structure>'
> it works properly.
> I also tried using 
> http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/hadoopqa/<JAR_PATH> 
> in the app submission object instead of the webhdfs:// URI format.
> Then the NodeManager fails to localize because there is no FileSystem for 
> the http scheme:
> {code}
> 14-08-18 02:03:31,343 INFO  authorize.ServiceAuthorizationManager 
> (ServiceAuthorizationManager.java:authorize(114)) - Authorization successful 
> for testing (auth:TOKEN) for protocol=interface org.apache.
> hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB
> 2014-08-18 02:03:31,583 INFO  localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:update(1011)) - DEBUG: FAILED { 
> http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH> 
> 1408352576841, FILE, null }, No FileSystem for scheme: http
> 2014-08-18 02:03:31,583 INFO  localizer.LocalizedResource 
> (LocalizedResource.java:handle(203)) - Resource 
> http://<NAMENODEHOST>:<NAMENODEHTTPPORT>/webhdfs/v1/user/<JAR_PATH>(-><NM_LOCAL_DIR>/usercache/<APP_USER>/appcache/application_1408352544163_0002/filecache/11/DshellAppMaster.jar)
>  transitioned from DOWNLOADING to FAILED
> {code}
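
For reference, the two URI forms tried above differ only in scheme and path prefix. The following is a minimal sketch of that mapping, assuming the standard WebHDFS REST layout (`/webhdfs/v1/<path>?op=...`). The helper name `webhdfs_to_rest_url`, the host `nn.example.com`, the port, the path, and the token string are all placeholders for illustration and are not taken from this report.

```python
from urllib.parse import urlsplit, urlencode


def webhdfs_to_rest_url(uri, op, delegation=None):
    """Map a webhdfs:// URI to the matching WebHDFS REST URL.

    Hypothetical helper for illustration; this is not Hadoop code.
    """
    parts = urlsplit(uri)
    if parts.scheme != "webhdfs":
        raise ValueError("expected a webhdfs:// URI, got scheme %r" % parts.scheme)
    query = {"op": op}
    if delegation is not None:
        # The delegation token travels as a query parameter, as in the curl test.
        query["delegation"] = delegation
    return "http://%s/webhdfs/v1%s?%s" % (parts.netloc, parts.path, urlencode(query))


# Placeholder values only (assumptions, not from the report).
print(webhdfs_to_rest_url(
    "webhdfs://nn.example.com:50070/user/alice/app.jar", "OPEN", "TOKEN"))
```

The webhdfs:// form is the one to use in the submission object, because the NodeManager selects a FileSystem by URI scheme; the http:// form is only what a client such as curl sends on the wire.
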
> Now do kinit without providing the -C option for the KRB5 cache path, so the 
> ticket goes to the default KRB5 cache in /tmp.
> Again submit the same application object to the YARN WS, with webhdfs:// URI 
> format paths and the webhdfs token.
> This time the NM is able to download the jar and custom shell script, and the 
> application runs fine.
> It looks like the following is happening:
> webhdfs is trying to look for a krb ticket in the NM while localizing.
> 1. In the first case there was no krb ticket in the default cache, so the 
> application failed while localizing the AppMaster jar.
> 2. In the second case kinit had already been done and the krb ticket was 
> present in /tmp (default KRB5 cache), so the AppMaster was localized 
> successfully.
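
The two cases described above are consistent with the ticket cache being resolved separately by each process. The following is a minimal sketch of that idea, assuming the common convention that `KRB5CCNAME` overrides a default file cache at `/tmp/krb5cc_<uid>` (actual defaults depend on `krb5.conf` and the platform). `resolve_ccache` and the example paths are hypothetical and do not come from Hadoop or from this report.

```python
def resolve_ccache(env, uid):
    """Return the credential cache path a process would consult.

    Illustrative only: handles a plain path or a "FILE:" prefixed value and
    returns any other cache type (for example "DIR:...") unchanged.
    """
    name = env.get("KRB5CCNAME")
    if name:
        # Strip the optional "FILE:" type prefix.
        return name[len("FILE:"):] if name.startswith("FILE:") else name
    # Assumed default location when the variable is not set.
    return "/tmp/krb5cc_%d" % uid


# Shell where kinit used a custom cache and KRB5CCNAME was exported.
user_env = {"KRB5CCNAME": "/home/alice/custom_cc"}
# A daemon started without that variable in its environment.
daemon_env = {}

print(resolve_ccache(user_env, 1000))
print(resolve_ccache(daemon_env, 1000))
```

If the hypothesis in the report holds, the localizer behaves like the second call: it only finds a ticket when one happens to exist in the default cache, rather than relying on the webhdfs token alone.
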



--
This message was sent by Atlassian JIRA
(v6.2#6252)
