[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

prashant kumar mishra updated CLOUDSTACK-5055:
----------------------------------------------

    Summary: host went in Error in maintenance state ;unable to migrate vms  
(was: host went in Error in maintenance mode ;unable to migrate vms)

> host went in Error in maintenance state ;unable to migrate vms
> --------------------------------------------------------------
>
>                 Key: CLOUDSTACK-5055
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5055
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: KVM, Management Server
>    Affects Versions: 4.2.0
>            Reporter: prashant kumar mishra
>         Attachments: Agent_MS_Logs.rar
>
>
> Steps to reproduce
> -------------------------
> 1-Prepare a CloudStack setup with one KVM (RHEL 6.2) host, say host1
> 2-Set execute.in.sequence.hypervisor.commands and 
> execute.in.sequence.network.element.commands to false
> 3-Deploy 32 VMs
> 4-Add one more host, say host2, to the cluster
> 5-Try to put host1 in maintenance mode
> Expected
> ---------------
> Host1 should go into maintenance mode
> Actual
> ---------
> Host1 is stuck in the "Error In maintenance" state and only a few VMs were migrated to host2
> My observation 
> ---------------------
> 1-I tried the same with 3 VMs (user VMs and system VMs); enabling maintenance 
> worked properly.
> 2-I saw this issue only when there is a large number (32+) of VMs on a 
> host.
> Logs
> --------
> 2013-11-06 09:53:27,424 DEBUG [agent.manager.AgentAttache] 
> (AgentManager-Handler-8:null) Seq 4-2144927817: Unable to find listener.
> 2013-11-06 09:53:27,426 DEBUG [vm.dao.VMInstanceDaoImpl] 
> (HA-Worker-4:work-34) Unable to update 
> VM[User|f66d29c2-2cd2-4715-ae31-5e43cea707bf]: DB Data={Host=1; 
> State=Running; updated=7; time=Wed Nov 06 09:53:27 EST 2013} New Data: 
> {Host=1; State=Stopping; updated=6; time=Wed Nov 06 09:53:27 EST 2013} Stale 
> Data: {Host=1; State=Running; updated=5; time=Wed Nov 06 09:53:25 EST 2013}
> 2013-11-06 09:53:27,435 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (HA-Worker-4:work-34) Unable to stop VM due to VM is being operated on.
> 2013-11-06 09:53:27,435 WARN  [cloud.ha.HighAvailabilityManagerImpl] 
> (HA-Worker-4:work-34) Unable to migrate vm from 1
> 2013-11-06 09:53:27,432 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (HA-Worker-3:work-38) DeploymentPlanner allocation algorithm: 
> com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_e995abc3@d603051
> 2013-11-06 09:53:27,435 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (HA-Worker-3:work-38) Trying to allocate a host and storage pools from dc:1, 
> pod:1,cluster:1, requested cpu: 200, requested ram: 134217728
> 2013-11-06 09:53:27,435 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (HA-Worker-3:work-38) Is ROOT volume READY (pool already allocated)?: No
> 2013-11-06 09:53:27,435 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (HA-Worker-3:work-38) This VM has last host_id specified, trying to choose 
> the same host: 1
> 2013-11-06 09:53:27,437 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (HA-Worker-3:work-38) The last host of this VM is in avoid set
> 2013-11-06 09:53:27,437 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (HA-Worker-3:work-38) Cannot choose the last host to deploy this VM
> 2013-11-06 09:53:27,437 DEBUG [cloud.deploy.FirstFitPlanner] 
> (HA-Worker-3:work-38) Searching resources only under specified Cluster: 1
> 2013-11-06 09:53:27,440 DEBUG [cloud.resource.ResourceManagerImpl] 
> (HA-Worker-4:work-34) No next resource state for host 1 while current state 
> is ErrorInMaintenance with event UnableToMigrate
> com.cloud.utils.fsm.NoTransitionException: No next resource state found for 
> current state =ErrorInMaintenance event =UnableToMigrate
>         at 
> com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1178)
>         at 
> com.cloud.resource.ResourceManagerImpl.maintenanceFailed(ResourceManagerImpl.java:2313)
>         at 
> com.cloud.ha.HighAvailabilityManagerImpl.migrate(HighAvailabilityManagerImpl.java:602)
>         at 
> com.cloud.ha.HighAvailabilityManagerImpl$WorkerThread.run(HighAvailabilityManagerImpl.java:858)
> 2013-11-06 09:53:27,451 DEBUG [agent.transport.Request] 
> (AgentManager-Handler-10:null) Seq 1-1113784382: Processing:  { Ans: , 
> MgmtId: 6959054979131, via: 1, Ver: v1, Flags: 110, 
> [{"com.cloud.agent.api.MigrateAnswer":{"result":false,"details":"Cannot recv 
> data: Connection reset by peer","wait":0}}] }
> 2013-11-06 09:53:27,451 DEBUG [agent.manager.AgentAttache] 
> (AgentManager-Handler-10:null) Seq 1-1113784382: No more commands found
> 2013-11-06 09:53:27,451 DEBUG [agent.transport.Request] (HA-Worker-0:work-35) 
> Seq 1-1113784382: Received:  { Ans: , MgmtId: 6959054979131, via: 1, Ver: v1, 
> Flags: 110, { MigrateAnswer } }
> 2013-11-06 09:53:27,451 ERROR [cloud.vm.VirtualMachineManagerImpl] 
> (HA-Worker-0:work-35) Unable to migrate due to Cannot recv data: Connection 
> reset by peer
> 2013-11-06 09:53:27,452 INFO  [cloud.vm.VirtualMachineManagerImpl] 
> (HA-Worker-0:work-35) Migration was unsuccessful.  Cleaning up: 
> VM[User|b363903f-992c-412a-ab8d-a9bb15e23a51]
> 2013-11-06 09:53:27,449 DEBUG [agent.transport.Request] 
> (AgentManager-Handler-9:null) Seq 4-2144927816: Processing:  { Ans: , MgmtId: 
> 6959054979131, via: 4, Ver: v1, Flags: 110, 
> [{"com.cloud.agent.api.PrepareForMigrationAnswer":{"result":true,"wait":0}}] }
> 2013-11-06 09:53:27,452 DEBUG [agent.manager.AgentAttache] 
> (AgentManager-Handler-9:null) Seq 4-2144927816: No more commands found
> 2013-11-06 09:53:27,452 DEBUG [agent.transport.Request] (HA-Worker-2:work-36) 
> Seq 4-2144927816: Received:  { Ans: , MgmtId: 6959054979131, via: 4, Ver: v1, 
> Flags: 110, { PrepareForMigrationAnswer } }
> 2013-11-06 09:53:27,458 INFO  [cloud.ha.HighAvailabilityManagerImpl] 
> (HA-Worker-4:work-34) Completed HAWork[34-Migration-27-Running-Migrating]
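
One plausible reading of the NoTransitionException in the log above is that a second failed migration reports UnableToMigrate after an earlier failure has already moved the host to ErrorInMaintenance. The sketch below illustrates that pattern with a toy state machine. Only the names ErrorInMaintenance and UnableToMigrate come from the log; the other states, events, the transition table and the helper functions are assumptions made for illustration and are not copied from the CloudStack source.

```python
class NoTransitionError(Exception):
    """Raised when the table has no entry for a (state, event) pair."""


# Hypothetical transition table, for illustration only.
# It is NOT CloudStack's real resource-state table.
TRANSITIONS = {
    ("Enabled", "AdminAskMaintenance"): "PrepareForMaintenance",
    ("PrepareForMaintenance", "UnableToMigrate"): "ErrorInMaintenance",
    ("PrepareForMaintenance", "InternalEnterMaintenance"): "Maintenance",
}


def transit(state, event):
    """Return the next state, or raise if the pair has no transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise NoTransitionError(
            f"No next resource state found for current state ={state} "
            f"event ={event}"
        ) from None


def replay(state, events):
    """Apply events in order; collect the (state, event) pairs rejected."""
    rejected = []
    for event in events:
        try:
            state = transit(state, event)
        except NoTransitionError:
            rejected.append((state, event))
    return state, rejected


# Two workers each report a failed migration for the same host.
final_state, rejected = replay(
    "Enabled",
    ["AdminAskMaintenance", "UnableToMigrate", "UnableToMigrate"],
)
print(final_state)  # ErrorInMaintenance
print(rejected)     # [('ErrorInMaintenance', 'UnableToMigrate')]
```

In this toy model the first UnableToMigrate moves the host to ErrorInMaintenance and the second one is rejected, because the table has no entry for that pair. Whether the real table should accept the repeated event is a design question for the actual fix.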



--
This message was sent by Atlassian JIRA
(v6.1#6144)
