[ 
https://issues.apache.org/jira/browse/HADOOP-10523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998555#comment-13998555
 ] 

Mohammad Kamrul Islam commented on HADOOP-10523:
------------------------------------------------

Very good explanations!
I mostly agree. A few comments below:

> I think the better solution is for users to not cancel tokens. Tokens are 
> supposed to be an "invisible" implementation detail of job submission and 
> thus not require user manipulation.

I was told that every delegation token adds overhead to the process memory, 
so we try to cancel it early. If that thinking has changed, I'm open to any 
option. By the way, long-running processes such as Azkaban explicitly cancel 
their delegation tokens.

> I'd suggest modifying the RM to either swallow the cancel error on job 
> completion, or to simply emit a single line in the log instead of a backtrace.

So this will be added as a WARN message in the callers of cancelToken(), 
which includes RM, JHS and NN, right? Can you please give a little more detail 
about "on job completion"?
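
To make the question concrete, here is a minimal sketch of what "a single WARN 
line instead of a backtrace" could look like in a caller of cancelToken(). It 
is not the attached patch and does not use the real Hadoop classes: TokenStore, 
InvalidTokenException and cancelQuietly are hypothetical stand-ins for the 
secret manager, SecretManager.InvalidToken and the RM/JHS/NN call site, and 
java.util.logging is used only to keep the example self-contained.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

class CancelTokenSketch {
    /** Hypothetical stand-in for SecretManager.InvalidToken. */
    static class InvalidTokenException extends Exception {
        InvalidTokenException(String msg) { super(msg); }
    }

    /** Hypothetical in-memory stand-in for a delegation token secret manager. */
    static class TokenStore {
        private final Set<String> live = new HashSet<>();

        void issue(String id) { live.add(id); }

        /** Throws if the token is unknown, e.g. because it was already cancelled. */
        void cancelToken(String id) throws InvalidTokenException {
            if (!live.remove(id)) {
                throw new InvalidTokenException("Token not found");
            }
        }
    }

    private static final Logger LOG = Logger.getLogger("CancelTokenSketch");

    /**
     * Caller-side wrapper: returns true if this call cancelled the token,
     * false if it was already gone. The failure is logged as one WARN line
     * and the stack trace is deliberately not printed.
     */
    static boolean cancelQuietly(TokenStore store, String id) {
        try {
            store.cancelToken(id);
            return true;
        } catch (InvalidTokenException e) {
            LOG.warning("Token " + id + " already cancelled or expired: "
                + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        TokenStore store = new TokenStore();
        store.issue("token-1");
        System.out.println(cancelQuietly(store, "token-1")); // explicit user cancel
        System.out.println(cancelQuietly(store, "token-1")); // later automatic cancel
    }
}
```

In this sketch the second call only logs a warning and returns false. Whether 
the real fix should live in the caller or inside cancelToken() itself is 
exactly the open question above.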



> Hadoop services (such as RM, NN and JHS) throw confusing exception during 
> token auto-cancelation 
> -------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10523
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10523
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.3.0
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>             Fix For: 2.5.0
>
>         Attachments: HADOOP-10523.1.patch
>
>
> When a user explicitly cancels the token, the system (such as RM, NN and JHS) 
> also periodically tries to cancel the same token. During the second cancel 
> (originated by RM/NN/JHS), Hadoop processes throw the following 
> error/exception in the log file. Although the exception is harmless, it 
> creates a lot of confusion and causes developers to spend a lot of time 
> investigating.
> This JIRA is to check that the token is still available (not yet cancelled) 
> before attempting to cancel it, and to replace this exception with a 
> proper warning message.
> {noformat}
> 2014-04-15 01:41:14,686 INFO 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  Token cancelation requested for identifier:: 
> owner=<FULL_PRINCIPAL>.linkedin.com@REALM, renewer=yarn, realUser=, 
> issueDate=1397525405921, maxDate=1398130205921, sequenceNumber=1, 
> masterKeyId=2
> 2014-04-15 01:41:14,688 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:yarn/HOST@<REALM> (auth:KERBEROS) 
> cause:org.apache.hadoop.security.token.SecretManager$InvalidToken: Token not 
> found
> 2014-04-15 01:41:14,689 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 7 on 10020, call 
> org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB.cancelDelegationToken 
> from 172.20.128.42:2783 Call#37759 Retry#0: error: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Token not found
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Token not found
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.cancelToken(AbstractDelegationTokenSecretManager.java:436)
>         at 
> org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.cancelDelegationToken(HistoryClientService.java:400)
>         at 
> org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.cancelDelegationToken(MRClientProtocolPBServiceImpl.java:286)
>         at 
> org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:301)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1958)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1956)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
