[jira] [Commented] (YARN-2565) ResourceManager is fails to start when GenericHistoryService is enabled in secure mode without doing manual kinit as yarn

2014-09-17 Thread Karam Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14137204#comment-14137204
 ] 

Karam Singh commented on YARN-2565:
---

Observed that RM fails to start in secure mode when GenericHistoryService is 
enabled and ResourceManager is set to use the Timeline Store:
{code}
yarn.resourcemanager.keytab=RM_KEYTAB
yarn.resourcemanager.principal=RM_PRINCIPAL
yarn.timeline-service.enabled=true
yarn.timeline-service.hostname=ATS_HOST
yarn.timeline-service.address=ATS_HOST:10200
yarn.timeline-service.webapp.address=ATS_HOST:8188
yarn.timeline-service.handler-thread-count=10
yarn.timeline-service.ttl-enable=true
yarn.timeline-service.ttl-ms=60480
yarn.timeline-service.leveldb-timeline-store.path=/tm/timeline
yarn.timeline-service.keytab=ATS_KEYTAB
yarn.timeline-service.principal=ATS_PRINCIPAL
yarn.timeline-service.webapp.spnego-principal=ATS_SPNEGO_PRINCIPAL
yarn.timeline-service.webapp.spnego-keytab-file=ATS_SPNEGO_KEYTAB
yarn.timeline-service.http-authentication.type=kerberos
yarn.timeline-service.http-authentication.kerberos.principal=ATS_SPNEGO_PRINCIPAL
yarn.timeline-service.http-authentication.kerberos.keytab=ATS_SPNEGO_KEYTAB
yarn.timeline-service.generic-application-history.enabled=true
yarn.timeline-service.generic-application-history.store-class=''
yarn.resourcemanager.system-metrics-publisher.enabled=true
yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size=10
{code}

Stop ResourceManager and the Timeline Server.
Start the Timeline Server and wait until ATS has restarted successfully.
Start ResourceManager.
RM fails to start with the following exception:
{code}
2014-09-15 10:58:57,735 WARN  ipc.Client (Client.java:run(675)) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2014-09-15 10:58:57,740 ERROR applicationhistoryservice.FileSystemApplicationHistoryStore (FileSystemApplicationHistoryStore.java:serviceInit(132)) - Error when initializing FileSystemHistoryStorage
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: RM_HOST; destination host is: NN_HOST:8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1423)
at org.apache.hadoop.ipc.Client.call(Client.java:1372)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:219)
at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:748)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1918)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1105)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1101)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1101)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1413)
at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.serviceInit(FileSystemApplicationHistoryStore.java:126)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.serviceInit(RMApplicationHistoryWriter.java:99)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:490)
at 

[jira] [Commented] (YARN-2565) ResourceManager is fails to start when GenericHistoryService is enabled in secure mode without doing manual kinit as yarn

2014-09-17 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138184#comment-14138184
 ] 

Zhijie Shen commented on YARN-2565:
---

[~karams], I think you neglected to mention the config 
yarn.timeline-service.generic-application-history.enabled. It must be true 
so that FileSystemApplicationHistoryStore is picked by 
RMApplicationHistoryWriter, which cannot access HDFS correctly in secure mode.

After YARN-2033, enabling the generic history service should by default 
pick the new storage stack based on TimelineStore. The problem seems to be that 
the configurations that determine which store is chosen by 
ApplicationHistoryServer and by RMApplicationHistoryWriter are not consistent. 
On the RMApplicationHistoryWriter side, we should also use 
FileSystemApplicationHistoryStore only when users have explicitly put it in the 
config file.
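
The selection rule proposed here could be sketched roughly as follows. This is 
only an illustration, not the actual RMApplicationHistoryWriter code or the 
eventual patch: the class StoreSelectionSketch, the enum WriterStore and the 
method chooseWriterStore are hypothetical names. The sketch also assumes that a 
missing "enabled" flag means false, and it does not model store classes other 
than the file system one. The property names and the store class name are the 
ones that appear in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreSelectionSketch {
    // Property names as listed in the configuration reported in this issue.
    static final String AHS_ENABLED =
        "yarn.timeline-service.generic-application-history.enabled";
    static final String AHS_STORE_CLASS =
        "yarn.timeline-service.generic-application-history.store-class";

    // Class name as it appears in the stack trace in this issue.
    static final String FS_STORE_CLASS =
        "org.apache.hadoop.yarn.server.applicationhistoryservice."
            + "FileSystemApplicationHistoryStore";

    // Hypothetical labels for the outcome of the selection.
    enum WriterStore { NONE, TIMELINE_STORE, FILE_SYSTEM_STORE }

    // Pick the file system store only when it is named explicitly;
    // otherwise fall back to the TimelineStore-based stack.
    static WriterStore chooseWriterStore(Map<String, String> conf) {
        // Assumption of this sketch: an absent flag is treated as "false".
        boolean enabled =
            Boolean.parseBoolean(conf.getOrDefault(AHS_ENABLED, "false"));
        if (!enabled) {
            return WriterStore.NONE;
        }
        String storeClass = conf.getOrDefault(AHS_STORE_CLASS, "").trim();
        if (FS_STORE_CLASS.equals(storeClass)) {
            return WriterStore.FILE_SYSTEM_STORE;
        }
        // Empty or unmodeled store class: use the default stack.
        return WriterStore.TIMELINE_STORE;
    }

    public static void main(String[] args) {
        // Mirrors the reported setup: history enabled, store-class left empty.
        Map<String, String> conf = new HashMap<>();
        conf.put(AHS_ENABLED, "true");
        conf.put(AHS_STORE_CLASS, "");
        System.out.println(chooseWriterStore(conf)); // prints TIMELINE_STORE
    }
}
```

Under this rule the reported configuration, with store-class left empty, would 
not initialize FileSystemApplicationHistoryStore in the RM at all, so the HDFS 
access seen in the stack trace would not be attempted during RM startup.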

 ResourceManager is fails to start when GenericHistoryService is enabled in 
 secure mode without doing manual kinit as yarn
 -

 Key: YARN-2565
 URL: https://issues.apache.org/jira/browse/YARN-2565
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, timelineserver
Affects Versions: 2.6.0
 Environment: Secure cluster with ATS (timeline server enabled) and 
 yarn.resourcemanager.system-metrics-publisher.enabled=true
 so that RM can send Application history to Timeline Store
Reporter: Karam Singh
Assignee: Zhijie Shen

 Observed that RM fails to start in secure mode when GenericHistoryService is 
 enabled and ResourceManager is set to use the Timeline Store



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)