[
https://issues.apache.org/jira/browse/AMBARI-21204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmytro Sen updated AMBARI-21204:
--------------------------------
Attachment: AMBARI-21204_3.patch
> Yarn stopped by itself after start. HA run
> ------------------------------------------
>
> Key: AMBARI-21204
> URL: https://issues.apache.org/jira/browse/AMBARI-21204
> Project: Ambari
> Issue Type: Bug
> Affects Versions: 2.5.1
> Reporter: Dmytro Sen
> Assignee: Dmytro Sen
> Priority: Critical
> Fix For: 2.5.2
>
> Attachments: AMBARI-21204_3.patch
>
>
> From the RM logs:
> {code}
> 2017-06-07 14:23:19,191 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1240)) - Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
>     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
> Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
>     at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
>     at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     ... 7 more
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
> {code}
> The problem is that disabling security changes the ZooKeeper ACL on the
> ResourceManager znode, as part of AMBARI-19331. After the recent change in
> HDFS-11403, the RM checks the znode version and fails if it differs from the
> expected one.
> The correct fix could be to remove the znode when security is disabled,
> rather than break the consistency of the election znode by manually changing
> its ACL to all. The RM should then create the znode itself with the proper
> ACL.
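
The failure mode and the proposed fix described above can be illustrated with a
small in-memory model. This is only a sketch under stated assumptions: it does
not use the ZooKeeper or Hadoop APIs, the class and method names
(ZnodeAclSketch, ensureParent, setAcl) are hypothetical, the ACL strings are
illustrative, and the "expected version" of 0 stands in for whatever version
the RM actually expects.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified in-memory model of a version-checked ACL update.
// NOT the ZooKeeper or Hadoop API; all names here are hypothetical.
class ZnodeAclSketch {
    /** Raised when the caller's expected ACL version does not match the stored one. */
    static class BadVersionException extends RuntimeException {
        BadVersionException(String path) {
            super("BadVersion for " + path);
        }
    }

    /** A znode reduced to the two fields that matter for this issue. */
    static class Znode {
        String acl;
        int aclVersion; // starts at 0, incremented on every ACL change
        Znode(String acl) { this.acl = acl; }
    }

    private final Map<String, Znode> nodes = new HashMap<>();

    void create(String path, String acl) { nodes.put(path, new Znode(acl)); }
    void delete(String path) { nodes.remove(path); }
    boolean exists(String path) { return nodes.containsKey(path); }
    int aclVersion(String path) { return nodes.get(path).aclVersion; }

    /** Conditional update: succeeds only if expectedVersion matches the stored version. */
    void setAcl(String path, String acl, int expectedVersion) {
        Znode node = nodes.get(path);
        if (node.aclVersion != expectedVersion) {
            throw new BadVersionException(path);
        }
        node.acl = acl;
        node.aclVersion++;
    }

    /** Assumed startup step: create the parent if absent, otherwise update its ACL. */
    void ensureParent(String path, String acl, int expectedVersion) {
        if (!exists(path)) {
            create(path, acl);
            return;
        }
        setAcl(path, acl, expectedVersion);
    }

    public static void main(String[] args) {
        String path = "/yarn-leader-election";
        ZnodeAclSketch zk = new ZnodeAclSketch();
        zk.create(path, "sasl:rm:cdrwa");

        // Disabling security rewrites the ACL out of band, bumping the version.
        zk.setAcl(path, "world:anyone:cdrwa", 0);
        try {
            zk.ensureParent(path, "world:anyone:cdrwa", 0);
        } catch (BadVersionException e) {
            System.out.println("startup fails: " + e.getMessage());
        }

        // Proposed fix: delete the znode instead, so startup recreates it.
        zk.delete(path);
        zk.ensureParent(path, "world:anyone:cdrwa", 0);
        System.out.println("recreated, ACL version " + zk.aclVersion(path));
    }
}
```

In this model, changing the ACL out of band leaves the znode at a version the
starting process does not expect, while deleting the znode lets startup take
the create path and avoids the version check entirely.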
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)