[ https://issues.apache.org/jira/browse/AMBARI-22060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Lysnichenko updated AMBARI-22060:
----------------------------------------
    Attachment: AMBARI-22060.patch

> Fail to restart Ranger Admin during HDP downgrade.
> ---------------------------------------------------
>
>                 Key: AMBARI-22060
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22060
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>            Priority: Critical
>         Attachments: AMBARI-22060.patch
>
>
> During the downgrade process, the following error occurs while Ranger Admin
> is being restarted:
> {code}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/RANGER/0.4.0/package/scripts/ranger_admin.py", line 216, in <module>
>     RangerAdmin().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 850, in restart
>     self.start(env, upgrade_type=upgrade_type)
>   File "/var/lib/ambari-agent/cache/common-services/RANGER/0.4.0/package/scripts/ranger_admin.py", line 93, in start
>     setup_ranger_audit_solr()
>   File "/var/lib/ambari-agent/cache/common-services/RANGER/0.4.0/package/scripts/setup_ranger_xml.py", line 705, in setup_ranger_audit_solr
>     new_service_principals = [params.ranger_admin_jaas_principal])
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/solr_cloud_util.py", line 329, in add_solr_roles
>     new_service_users.append(__remove_host_from_principal(new_service_user, kerberos_realm))
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/solr_cloud_util.py", line 266, in __remove_host_from_principal
>     if not realm:
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/config_dictionary.py", line 73, in __getattr__
>     raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
> resource_management.core.exceptions.Fail: Configuration parameter 'kerberos-env' was not found in configurations dictionary!
> {code}
> The cause was that the server did not have many configs selected, and so did
> not send them to the agent during the downgrade. There are a few issues here:
> - During the upgrade from 2.4 to 2.5, finalize did not update the current
> cluster version. As a result, the config helpers misbehaved.
> - As a result of the previous issue, some Configure tasks failed to execute.
> - During the downgrade from 2.6, it looks like the cluster entity DB state
> was not consistent after config selection, so in some cases configs were not
> selected. I managed to reproduce that only once; it is a race condition that
> is very hard to catch or trace in a debugger.
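
The failure mode in the traceback can be reproduced in isolation. The sketch below is a simplified stand-in, not Ambari's code and not what the attached patch does: `Fail`, `MissingConfig`, `get_config`, `is_missing` and `strip_host` are hypothetical names written for this illustration, and the sample configuration dictionary is made up. It only shows how a truth test on a placeholder for an unsent config type (here 'kerberos-env') raises the error seen above, and how an explicit check avoids it.

```python
class Fail(Exception):
    """Stand-in for resource_management.core.exceptions.Fail (assumption)."""


class MissingConfig(object):
    """Hypothetical placeholder for a config type the server did not send."""

    def __init__(self, name):
        self.name = name

    def _fail(self):
        raise Fail("Configuration parameter '" + self.name +
                   "' was not found in configurations dictionary!")

    def __bool__(self):          # truth test on Python 3
        self._fail()

    __nonzero__ = __bool__       # truth test on Python 2

    def __getattr__(self, attr):  # any other attribute access also fails
        self._fail()


def get_config(configurations, config_type, key):
    """Return the value, or a MissingConfig placeholder if it is absent."""
    section = configurations.get(config_type)
    if section is None:
        return MissingConfig(config_type)
    if key not in section:
        return MissingConfig(key)
    return section[key]


def is_missing(value):
    """True if the value is a placeholder rather than a real setting."""
    return isinstance(value, MissingConfig)


def strip_host(principal, realm):
    """Turn 'user/host@REALM' into 'user@REALM'; leave it alone without a realm."""
    if is_missing(realm) or not realm:
        return principal
    user = principal.split("@")[0].split("/")[0]
    return user + "@" + realm


# Made-up command payload in which 'kerberos-env' was never selected or sent.
sent_on_downgrade = {"ranger-env": {"ranger_user": "ranger"}}
realm = get_config(sent_on_downgrade, "kerberos-env", "realm")

error_message = None
try:
    if not realm:                # same kind of check as in the traceback
        pass
except Fail as e:
    error_message = str(e)

print(error_message)
print(strip_host("rangeradmin/host1.example.com@EXAMPLE.COM", realm))
```

Guarding on the agent side like this would only hide the symptom; per the description above, the underlying problem is that the server did not select and send the configs in the first place.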