Jonathan Hurley created AMBARI-13615:
----------------------------------------
Summary: Express Upgrade: ZKFC Cannot Stop Because Newer Binaries
Don't Exist
Key: AMBARI-13615
URL: https://issues.apache.org/jira/browse/AMBARI-13615
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.1.3
Reporter: Jonathan Hurley
Assignee: Jonathan Hurley
Priority: Critical
Fix For: 2.1.3
During an express upgrade, components are stopped ahead of time. Before
{{restart}} is invoked, the following task runs, updating all HDP pointers:
{code}
<group xsi:type="cluster" name="RESTORE_CONFIG_DIRS" title="Restore Configuration Directories">
  <direction>DOWNGRADE</direction>
  <execute-stage title="Restore configuration directories and remove HDP 2.3 symlinks">
    <task xsi:type="execute">
      <script>scripts/ru_set_all.py</script>
      <function>unlink_all_configs</function>
    </task>
  </execute-stage>
</group>
{code}
After this, all components begin to restart. However, a restart consists of a
{{stop}} command followed by a {{start}} command. The components are already
stopped, and most of them check the PID first: if the process is not running,
they skip the {{stop}} rather than stopping it a second time.
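The PID guard described above can be sketched roughly as follows. This is a minimal illustration and not Ambari's actual implementation: the names {{read_pid}}, {{is_running}} and {{stop_component}} are hypothetical, and it assumes a POSIX host where signal 0 can be used to probe a process.

```python
import errno
import os


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or unparseable."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return None


def is_running(pid):
    """Probe the process with signal 0 (POSIX only); nothing is actually delivered."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def stop_component(pid_file, stop_command):
    """Run stop_command only if the PID file points at a live process.

    Returns True if the stop command was invoked, False if it was skipped.
    stop_command is a hypothetical callable standing in for the real stop logic.
    """
    if not is_running(read_pid(pid_file)):
        return False
    stop_command()
    return True
```

With a guard like this, a {{stop}} issued against an already-stopped component returns without launching anything, so it never touches the component's binaries.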
However, some components, such as ZKFC and HBase Master, don't have this logic
and try to stop regardless. The problem arises when a JVM is spun up to stop
the process:
Initially it was thought that moving the {{hdp-select set all}} to after the
{{restart}} groups would solve the problem. As it turns out, that doesn't work,
since {{params.py}} always takes the new version and builds the conf/lib/bin
directories with it. Additionally, some components have upgrade bugs that
calling {{hdp-select set all}} corrects.
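To illustrate why version-derived paths are a problem here, the sketch below builds conf/lib/bin directories from a version string and reports which of them are absent on disk. It is only an assumption-laden illustration: {{build_component_dirs}} and {{missing_dirs}} are hypothetical helpers, and the {{<stack_root>/<version>/<component>}} layout is assumed for the example rather than taken from {{params.py}}.

```python
import os


def build_component_dirs(stack_root, version, component):
    """Build conf/lib/bin paths from a version string (assumed layout, for illustration)."""
    base = os.path.join(stack_root, version, component)
    return {name: os.path.join(base, name) for name in ("conf", "lib", "bin")}


def missing_dirs(dirs):
    """Return the sorted names of the directories that do not exist on disk."""
    return sorted(name for name, path in dirs.items() if not os.path.isdir(path))
```

If the paths are always derived from the new version, any command that relies on them fails whenever those directories are not present, regardless of where the symlinks currently point.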