[
https://issues.apache.org/jira/browse/CLOUDSTACK-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025892#comment-14025892
]
edison su commented on CLOUDSTACK-5452:
---------------------------------------
If you restart the agent, it should connect back to the management server
immediately. If it doesn't, then something else must be wrong; for example, the
configuration file (/etc/cloudstack/agent/agent.properties) on the KVM host may
be out of date. In any case, if you have both the management server log and the
agent log from the time the problem occurred, I can take a look.
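One quick way to rule out a stale agent.properties is to print the management server endpoint the agent is configured to use and compare it with the real one. The following is a minimal sketch, not part of CloudStack: the helper names are hypothetical, and it assumes the file uses plain key=value lines with `host` and `port` keys, with 8250 used as the fallback port.

```python
def read_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def management_endpoint(props, default_port=8250):
    """Return (host, port) the agent would connect to, or None if host is unset.

    Assumption: 'host' and 'port' are the relevant keys; 8250 is only a fallback.
    """
    host = props.get("host", "").strip()
    if not host:
        return None
    return host, int(props.get("port", default_port))


# Example with inline text; on a KVM host you would read
# /etc/cloudstack/agent/agent.properties instead.
example = """
# sample values, not taken from the report
host=192.0.2.10
port=8250
"""
print(management_endpoint(read_properties(example)))
```

If the printed host no longer matches the management server's address, the agent will keep trying the old endpoint regardless of restarts.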
> KVM - Agent is not able to connect back if management server was restarted
> when there are pending tasks to this host.
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-5452
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5452
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.3.0
> Environment: Build from 4.3
> Reporter: Sangeetha Hariharan
> Assignee: edison su
> Priority: Critical
> Fix For: Future
>
>
> KVM - Agent is not able to connect back if management server was restarted
> when there are pending tasks to this host.
> Steps to reproduce the problem:
> Set up - Advanced zone with 2 KVM (RHEL 6.3) hosts.
> Deployed a few VMs.
> Started snapshots of the ROOT volumes of the VMs.
> While the snapshot processes are still in progress, restart the management
> server.
> When the management server has started, the KVM hosts remain in the
> disconnected state.
> Attempts to stop or start VMs fail because there is no connection to the
> host.
> The following is seen in the agent logs:
> 2013-12-10 20:56:46,640 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:56:46,640 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:56:51,641 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:56:51,642 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:56:56,642 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:56:56,643 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:01,644 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:01,644 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:06,644 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:06,645 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:11,645 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:11,646 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:16,647 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:16,647 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:21,648 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:21,648 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:26,649 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:26,675 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:31,676 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:31,677 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:36,678 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> 2013-12-10 20:57:36,678 INFO [cloud.agent.Agent] (Agent-Handler-2:null)
> Cannot connect because we still have 1 commands in progress.
> 2013-12-10 20:57:41,678 INFO [cloud.agent.Agent] (Agent-Handler-2:null) Lost
> connection to the server. Dealing with the remaining commands...
> :
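To confirm from an agent log that a host is stuck in the loop quoted above, the "Cannot connect because we still have N commands in progress" lines can be counted and timed. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it expects each log entry on a single unwrapped line in the format shown in the report (in the quoted excerpt the lines are wrapped by the mailer).

```python
import re
from datetime import datetime

# Matches the agent message quoted in the report, one entry per line.
STUCK = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+INFO\s+\[cloud\.agent\.Agent\].*"
    r"Cannot connect because we still have (\d+) commands in progress"
)


def summarize_stuck_reconnects(lines):
    """Summarize blocked reconnect attempts, or return None if there are none."""
    hits = []
    for line in lines:
        m = STUCK.search(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            hits.append((ts, int(m.group(2))))
    if not hits:
        return None
    return {
        "attempts": len(hits),                    # blocked reconnect attempts
        "first": hits[0][0],                      # first time it was blocked
        "last": hits[-1][0],                      # most recent time it was blocked
        "pending": sorted({n for _, n in hits}),  # distinct in-progress counts
    }


sample = [
    "2013-12-10 20:56:46,640 INFO [cloud.agent.Agent] (Agent-Handler-2:null) "
    "Cannot connect because we still have 1 commands in progress.",
    "2013-12-10 20:56:51,642 INFO [cloud.agent.Agent] (Agent-Handler-2:null) "
    "Cannot connect because we still have 1 commands in progress.",
]
print(summarize_stuck_reconnects(sample))
```

A pending count that never drops while the attempts keep accumulating matches the behaviour reported here: the agent refuses to reconnect because the in-progress command never completes.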