[
https://issues.apache.org/jira/browse/CLOUDSTACK-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sangeetha Hariharan updated CLOUDSTACK-3451:
--------------------------------------------
Attachment: xenparallel.rar
> Parallel deployment - Xenserver - When deploying 30 Vms in parallel, some of
> the Vm deployment fails when “applying dhcp entry/applying userdata and
> password entry on router” and retry eventually happens when they succeed.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-3451
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3451
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.2.0
> Environment: Build from 4.2
> Reporter: Sangeetha Hariharan
> Priority: Critical
> Fix For: 4.2.0
>
> Attachments: xenparallel.rar
>
>
> Steps to reproduce the problem:
> Advanced zone set up with Xenserver host.
> Deploy 30 Vms in parallel.
> 9 of the Vm deployments actually had a failure when trying to “apply userdata
> and password entry on router” / “apply dhcp entry”. But in all these cases I
> see that we stop the Vm that is in “Starting” state and immediately attempt
> to start the Vm which succeeds this time.
> Issues:
> 1. We should not be seeing any failures during the jobs for "applying
> userdata and password entry on router” / “applying dhcp entry”.
> 2. Why is there a logic to retry the Vm deployment again ? In my case I have
> only 1 host in the setup which seems to be put in avoid state as part of the
> initial failure. But again the Vm gets successfully deployed in this host.
> Jobs that went thru this scenario are:
> [root@asfmgmt management]# grep -i "stopcommand" management-server.log |
> grep Executing
> 2013-07-09 16:30:05,707 DEBUG [agent.transport.Request]
> (Job-Executor-18:job-18) Seq 1-729350370: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-9-VM","wait":0}}]
> }
> 2013-07-09 16:30:12,547 DEBUG [agent.transport.Request]
> (Job-Executor-15:job-15) Seq 1-729350377: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-7-VM","wait":0}}]
> }
> 2013-07-09 16:30:20,990 DEBUG [agent.transport.Request]
> (Job-Executor-38:job-38) Seq 1-729350381: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-26-VM","wait":0}}]
> }
> 2013-07-09 16:30:23,529 DEBUG [agent.transport.Request]
> (Job-Executor-25:job-25) Seq 1-729350383: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-18-VM","wait":0}}]
> }
> 2013-07-09 16:30:32,231 DEBUG [agent.transport.Request]
> (Job-Executor-35:job-35) Seq 1-729350390: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}]
> }
> 2013-07-09 16:30:45,744 DEBUG [agent.transport.Request]
> (Job-Executor-17:job-17) Seq 1-729350407: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-8-VM","wait":0}}]
> }
> 2013-07-09 16:30:46,511 DEBUG [agent.transport.Request]
> (Job-Executor-36:job-36) Seq 1-729350408: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-25-VM","wait":0}}]
> }
> 2013-07-09 16:31:04,826 DEBUG [agent.transport.Request]
> (Job-Executor-35:job-35) Seq 1-729350422: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}]
> }
> 2013-07-09 16:31:17,707 DEBUG [agent.transport.Request]
> (Job-Executor-26:job-26) Seq 1-729350432: Executing: { Cmd , MgmtId:
> 7200344900649, via: 1, Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-15-VM","wait":0}}]
> }
> Management server log snippet:
> 2013-07-09 16:30:22,214 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-38:job-38) Successfully cleanued up resources for the vm VM[U
> ser|hello-16] in Starting state
> 2013-07-09 16:30:22,220 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null,
> hosts: [1]
> 2013-07-09 16:30:22,220 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) DataCenter id = '1' provided is in avoid s
> et, DeploymentPlanner cannot allocate the VM, returning.
> 2013-07-09 16:30:22,236 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-38:job-38) VM state transitted from :Starting to Stopped with
> event: OperationFailedvm's original host id: null new host id: null host id
> before state transition: 1
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-38:job-38) Hosts's actual total CPU: 9044 and CPU after apply
> ing overprovisioning: 9044
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-38:job-38) Hosts's actual total RAM: 16190149248 and RAM afte
> r applying overprovisioning: 16190149632
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-38:job-38) release cpu from host: 1, old used: 4600,reserved:
> 0, actual total: 9044, total with overprovisioning: 9044; new used:
> 4500,reserved:0; movedfromreserved: false,moveToReserveredfalse
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-38:job-38) release mem from host: 1, old used: 9602859008,res
> erved: 0, total: 16190149632; new used: 9340715008,reserved:0;
> movedfromreserved: false,moveToReserveredfalse
> 2013-07-09 16:30:22,256 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-38:job-38) VM state transitted from :Stopped to Starting with
> event: StartRequestedvm's original host id: null new host id: null host id
> before state transition: null
> 2013-07-09 16:30:22,256 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-38:job-38) Successfully transitioned to start state for VM[Us
> er|hello-16] reservation id = 8b8d8303-0f78-4f03-9ee3-4cc3e129c746
> 2013-07-09 16:30:22,262 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-38:job-38) Trying to deploy VM, vm has dcId: 1 and podId: 1
> 2013-07-09 16:30:22,262 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null, hosts: n
> ull
> 2013-07-09 16:30:22,269 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null,
> hosts: null
> 2013-07-09 16:30:22,270 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) DeploymentPlanner allocation algorithm: co
> m.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_b2132c10@13aee390
> 2013-07-09 16:30:22,270 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) Trying to allocate a host and storage pool
> s from dc:1, pod:1,cluster:null, requested cpu: 100, requested ram: 262144000
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) Is ROOT volume READY (pool already allocat
> ed)?: No
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-38:job-38) Searching resources only under specified Pod: 1
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-38:job-38) Listing clusters in order of aggregate capacity,
> that ha
> ve (atleast one host with) enough CPU and RAM capacity under this Pod: 1
> 2013-07-09 16:30:22,280 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-38:job-38) Checking resources in Cluster: 1 under Pod
> : 1
> 2013-07-09 16:30:22,280 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-38:job-38 FirstFitRoutingAllocator) Looking for hosts in dc: 1
> pod:1 cluster:1
> 2013-07-09 16:30:22,282 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-38:job-38 FirstFitRoutingAllocator) FirstFitAllocator has 1 hos
> ts to check for allocation: [Host[-1-Routing]]
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira