[jira] [Commented] (YARN-2565) ResourceManager is fails to start when GenericHistoryService is enabled in secure mode without doing manual kinit as yarn
[ https://issues.apache.org/jira/browse/YARN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137204#comment-14137204 ]

Karam Singh commented on YARN-2565:
---
Observed that the RM fails to start in secure mode when GenericHistoryService is enabled and the ResourceManager is set to use the Timeline Store:
{code}
yarn.resourcemanager.keytab=RM_HOST
yarn.resourcemanager.principal=RM_PRINCIPAL
yarn.timeline-service.enabled=true
yarn.timeline-service.hostname=ATS_HOST
yarn.timeline-service.address=ATS_HOST:10200
yarn.timeline-service.webapp.address=ATS_HOST:8188
yarn.timeline-service.handler-thread-count=10
yarn.timeline-service.ttl-enable=true
yarn.timeline-service.ttl-ms=60480
yarn.timeline-service.leveldb-timeline-store.path=/tm/timeline
yarn.timeline-service.keytab=ATS_KEYTAB
yarn.timeline-service.principal=ATS_PRINCIPAL
yarn.timeline-service.webapp.spnego-principal=ATS_SPNEGO_PRINICPAL
yarn.timeline-service.webapp.spnego-keytab-file=ATS_SPNEGO_KETAB
yarn.timeline-service.http-authentication.type=kerberos
yarn.timeline-service.http-authentication.kerberos.principal=ATS_SPNEGO_PRINICPAL
yarn.timeline-service.http-authentication.kerberos.keytab=ATS_SPNEGO_KETAB
yarn.timeline-service.generic-application-history.enabled=true
yarn.timeline-service.generic-application-history.store-class=''
yarn.resourcemanager.system-metrics-publisher.enabled=true
yarn.resourcemanager.system-metrics-publisher.dispatcher.pool-size=10
{code}
Stop the ResourceManager and the Timeline Server. Start the Timeline Server. After ATS has restarted successfully, start the ResourceManager.
The RM fails to start with the following exception:
{code}
2014-09-15 10:58:57,735 WARN ipc.Client (Client.java:run(675)) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2014-09-15 10:58:57,740 ERROR applicationhistoryservice.FileSystemApplicationHistoryStore (FileSystemApplicationHistoryStore.java:serviceInit(132)) - Error when initializing FileSystemHistoryStorage
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: RM_HOST; destination host is: NN_HOST:8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
    at org.apache.hadoop.ipc.Client.call(Client.java:1423)
    at org.apache.hadoop.ipc.Client.call(Client.java:1372)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:219)
    at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:748)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1918)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1105)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1101)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1101)
    at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1413)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.serviceInit(FileSystemApplicationHistoryStore.java:126)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.serviceInit(RMApplicationHistoryWriter.java:99)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:490)
    at
[jira] [Commented] (YARN-2565) ResourceManager is fails to start when GenericHistoryService is enabled in secure mode without doing manual kinit as yarn
[ https://issues.apache.org/jira/browse/YARN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138184#comment-14138184 ]

Zhijie Shen commented on YARN-2565:
---
[~karams], I think you've neglected to mention the config yarn.timeline-service.generic-application-history.enabled. It should be true, so that FileSystemApplicationHistoryStore is picked by RMApplicationHistoryWriter, which cannot access HDFS correctly in secure mode.

After YARN-2033, when you enable the generic history service, you should by default pick the new storage stack based on TimelineStore. The problem seems to be that the configurations which determine what store is chosen by ApplicationHistoryServer and by RMApplicationHistoryWriter are not consistent. On the RMApplicationHistoryWriter side, we should also use FileSystemApplicationHistoryStore only when users have explicitly put it in the config file.

ResourceManager is fails to start when GenericHistoryService is enabled in secure mode without doing manual kinit as yarn
---
Key: YARN-2565
URL: https://issues.apache.org/jira/browse/YARN-2565
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager, timelineserver
Affects Versions: 2.6.0
Environment: Secure cluster with ATS (timeline server enabled) and yarn.resourcemanager.system-metrics-publisher.enabled=true so that RM can send Application history to Timeline Store
Reporter: Karam Singh
Assignee: Zhijie Shen

Observed that RM fails to start in Secure mode when GenericHistoryService is enabled and ResourceManager is set to use Timeline Store

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
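
The selection rule proposed in the comment above (use FileSystemApplicationHistoryStore on the RM side only when it is explicitly configured, otherwise fall back to the TimelineStore-based stack) can be sketched roughly as follows. This is an illustration only, not the actual YARN code or the eventual patch: the class name HistoryStoreSelection, the method chooseStore, the labels "disabled" and "timeline-store", the use of a plain Map in place of a Hadoop Configuration, and the default of "false" for the enabled flag are all assumptions made for the example. Only the two property names come from the thread.

```java
import java.util.HashMap;
import java.util.Map;

public class HistoryStoreSelection {
    // Property names as they appear in the thread.
    static final String AHS_ENABLED =
        "yarn.timeline-service.generic-application-history.enabled";
    static final String AHS_STORE_CLASS =
        "yarn.timeline-service.generic-application-history.store-class";

    // Hypothetical outcome labels; these are not real YARN constants.
    static final String DISABLED = "disabled";
    static final String TIMELINE_STORE = "timeline-store";

    // Returns the explicitly configured store class, or a label for the
    // fallback. A missing or blank store-class counts as "not configured".
    static String chooseStore(Map<String, String> conf) {
        // Assumption for this sketch: the flag defaults to false when absent.
        boolean enabled =
            Boolean.parseBoolean(conf.getOrDefault(AHS_ENABLED, "false"));
        if (!enabled) {
            return DISABLED;
        }
        String storeClass = conf.get(AHS_STORE_CLASS);
        if (storeClass == null || storeClass.trim().isEmpty()) {
            // Nothing explicit: do not touch HDFS, use the timeline stack.
            return TIMELINE_STORE;
        }
        return storeClass.trim();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(AHS_ENABLED, "true");
        conf.put(AHS_STORE_CLASS, ""); // empty, as in the reported config
        System.out.println(chooseStore(conf)); // prints "timeline-store"
    }
}
```

Under this rule, the reported configuration (history enabled, empty store-class) would never initialize a file-system store, so the HDFS call seen in the stack trace would not be made during RM startup.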