[ https://issues.apache.org/jira/browse/AMBARI-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Myroslav Papirkovskyi updated AMBARI-15691:
-------------------------------------------
Attachment: (was: AMBARI-15691.patch)
> Express Upgrade hangs if ambari agent is restarted in the middle of EU
> ----------------------------------------------------------------------
>
> Key: AMBARI-15691
> URL: https://issues.apache.org/jira/browse/AMBARI-15691
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.2.2
> Reporter: Myroslav Papirkovskyi
> Assignee: Myroslav Papirkovskyi
> Priority: Blocker
> Fix For: 2.2.2
>
>
> *Steps*
> # Install HDP-2.4.0.0 with Ambari 2.2.2 (secure, non-HA cluster)
> # Start EU to 2.4.2.0-127 and proceed until the "Backup Knox data" prompt
> # Click Proceed at the "Backup Knox data" prompt
> # Stop the Ambari agent on two of the cluster hosts and wait for EU to fail with "HOLDING_TIMEDOUT" status (in my test, EU stopped at the "Snapshot HBase" task)
> # Start the agents on both hosts and wait 90 seconds for the agents to heartbeat
> # Retry the failed task
> *Result*
> EU hangs
> From ambari-server log:
> {code}
> 04 Apr 2016 08:20:14,729 WARN [ambari-action-scheduler] ActionScheduler:201 - Exception received
> java.lang.NullPointerException
> at org.apache.ambari.server.actionmanager.ActionScheduler.wasAgentRestartedDuringOperation(ActionScheduler.java:887)
> at org.apache.ambari.server.actionmanager.ActionScheduler.processInProgressStage(ActionScheduler.java:691)
> at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:289)
> at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:196)
> at java.lang.Thread.run(Thread.java:745)
> 04 Apr 2016 08:30:29,451 WARN [ambari-action-scheduler] ActionScheduler:695 - Detected ambari-agent restart during command execution.The command has been aborted.Execution command details: host: os-d7-ngzvlu-ambari-se-eu-10-2.novalocal, role: ru_execute_tasks, actionId: 19-27
> 04 Apr 2016 08:30:30,581 WARN [ambari-action-scheduler] ActionScheduler:695 - Detected ambari-agent restart during command execution.The command has been aborted.Execution command details: host: os-d7-ngzvlu-ambari-se-eu-10-2.novalocal, role: ru_execute_tasks, actionId: 19-27
> {code}