[ https://issues.apache.org/jira/browse/AMBARI-9197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitry Lysnichenko resolved AMBARI-9197.
----------------------------------------
Resolution: Not A Problem
I've checked the described scenario:
- Deploy multinode cluster
- Start long-running process
- Stop one agent.
- Wait until task on affected host is automatically aborted (5-10 minutes)
- Start agent
- Check component and request states
- Try to issue a new request
As expected, tasks on the host with the stopped agent timed out and were
aborted automatically within 5-10 minutes. If the agent on a host is not
running, CANCEL commands cannot be sent to that host.
Closing as "works as designed".
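To illustrate the behavior described above, here is a minimal Python sketch.
It is not Ambari's actual code: the names Task, cancel and
expire_if_timed_out, and the 600-second timeout, are assumptions made up for
this example. It only models the two observations: a CANCEL cannot be
delivered while the agent is down, and the task is aborted once the timeout
elapses.

```python
from dataclasses import dataclass

# Hypothetical model for illustration only, not Ambari's implementation.
# Assumed timeout: the upper end of the 5-10 minute window observed above.
TASK_TIMEOUT_SECONDS = 600


@dataclass
class Task:
    host: str
    started_at: float          # seconds, on whatever clock the caller uses
    status: str = "IN_PROGRESS"


def cancel(task: Task, agent_running: bool) -> bool:
    """Try to cancel a task; returns True only if the task was aborted."""
    if task.status != "IN_PROGRESS":
        return False
    if not agent_running:
        # Nothing on the host can receive the CANCEL command,
        # so the task is left untouched until it times out.
        return False
    task.status = "ABORTED"
    return True


def expire_if_timed_out(task: Task, now: float) -> bool:
    """Abort a task that has been in progress for at least the timeout."""
    if task.status == "IN_PROGRESS" and now - task.started_at >= TASK_TIMEOUT_SECONDS:
        task.status = "ABORTED"
        return True
    return False
```

With a task started at time 0 on a host whose agent is stopped,
cancel(task, agent_running=False) leaves the task IN_PROGRESS, and a later
expire_if_timed_out(task, now=600.0) marks it ABORTED.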
> Ambari gets stuck / not able to cancel timed out operation
> ----------------------------------------------------------
>
> Key: AMBARI-9197
> URL: https://issues.apache.org/jira/browse/AMBARI-9197
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server, ambari-web
> Affects Versions: 1.7.0
> Environment: HDP 2.2
> Reporter: Hari Sekhon
> Assignee: Dmitry Lysnichenko
> Fix For: 2.1.0
>
> Attachments: screenshot-1.png
>
>
> Ambari server recently gained the ability to cancel operations
> (AMBARI-1897), but it is not able to cancel operations that are timing out in
> yellow and gets stuck in this state for several minutes, blocking restarts of
> other components.
> I've attached a screenshot which shows there is no X next to the operations
> in yellow that are stalled.
> This is the result of a hang on an Ambari client (scenario documented in
> AMBARI-8768), but it highlights that Ambari server's ability to cancel
> operations needs hardening: it should be possible to cancel any operation in
> any state in order to recover the operations queue.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon