[jira] [Updated] (CLOUDSTACK-3451) Parallel deployment - Xenserver - When deploying 30 Vms in parallel, some of the Vm deployment fails when “applying dhcp entry/applying userdata and password entry on router” and retry eventually happens when they succeed.

Sangeetha Hariharan (JIRA) Wed, 10 Jul 2013 09:34:28 -0700

     [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sangeetha Hariharan updated CLOUDSTACK-3451:
--------------------------------------------

    Attachment: xenparallel.rar
    
> Parallel deployment - Xenserver - When deploying 30 Vms in parallel, some of 
> the Vm deployment fails when “applying dhcp entry/applying userdata and 
> password entry on router”  and retry eventually happens when they succeed.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-3451
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3451
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.2.0
>         Environment: Build from 4.2
>            Reporter: Sangeetha Hariharan
>            Priority: Critical
>             Fix For: 4.2.0
>
>         Attachments: xenparallel.rar
>
>
> Steps to reproduce the problem:
> Set up an advanced zone with a XenServer host.
> Deploy 30 VMs in parallel.
> 9 of the VM deployments failed while trying to “apply userdata 
> and password entry on router” / “apply dhcp entry”. In all of these cases 
> the VM that is in the “Starting” state is stopped, and an immediate second 
> attempt to start it succeeds.
> Issues:
> 1. There should be no failures in the jobs for “applying 
> userdata and password entry on router” / “applying dhcp entry”.
> 2. Why is there logic to retry the VM deployment? In my case there is 
> only 1 host in the setup, and it appears to be put in the avoid set as part 
> of the initial failure. Yet the VM is then deployed successfully on this 
> same host.
> The jobs that went through this scenario are:
> [root@asfmgmt management]# grep -i "stopcommand" management-server.log  | 
> grep Executing
> 2013-07-09 16:30:05,707 DEBUG [agent.transport.Request] 
> (Job-Executor-18:job-18) Seq 1-729350370: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-9-VM","wait":0}}]
>  }
> 2013-07-09 16:30:12,547 DEBUG [agent.transport.Request] 
> (Job-Executor-15:job-15) Seq 1-729350377: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-7-VM","wait":0}}]
>  }
> 2013-07-09 16:30:20,990 DEBUG [agent.transport.Request] 
> (Job-Executor-38:job-38) Seq 1-729350381: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-26-VM","wait":0}}]
>  }
> 2013-07-09 16:30:23,529 DEBUG [agent.transport.Request] 
> (Job-Executor-25:job-25) Seq 1-729350383: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-18-VM","wait":0}}]
>  }
> 2013-07-09 16:30:32,231 DEBUG [agent.transport.Request] 
> (Job-Executor-35:job-35) Seq 1-729350390: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}]
>  }
> 2013-07-09 16:30:45,744 DEBUG [agent.transport.Request] 
> (Job-Executor-17:job-17) Seq 1-729350407: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-8-VM","wait":0}}]
>  }
> 2013-07-09 16:30:46,511 DEBUG [agent.transport.Request] 
> (Job-Executor-36:job-36) Seq 1-729350408: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-25-VM","wait":0}}]
>  }
> 2013-07-09 16:31:04,826 DEBUG [agent.transport.Request] 
> (Job-Executor-35:job-35) Seq 1-729350422: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}]
>  }
> 2013-07-09 16:31:17,707 DEBUG [agent.transport.Request] 
> (Job-Executor-26:job-26) Seq 1-729350432: Executing:  { Cmd , MgmtId: 
> 7200344900649, via: 1, Ver: v1, Flags: 100011, 
> [{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":false,"vmName":"i-3-15-VM","wait":0}}]
>  }
> Management server log snippet:
> 2013-07-09 16:30:22,214 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-38:job-38) Successfully cleanued up resources for the vm 
> VM[User|hello-16] in Starting state
> 2013-07-09 16:30:22,220 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null,
> hosts: [1]
> 2013-07-09 16:30:22,220 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) DataCenter id = '1' provided is in avoid set, 
> DeploymentPlanner cannot allocate the VM, returning.
> 2013-07-09 16:30:22,236 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-38:job-38) VM state transitted from :Starting to Stopped with
> event: OperationFailedvm's original host id: null new host id: null host id 
> before state transition: 1
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-38:job-38) Hosts's actual total CPU: 9044 and CPU after 
> applying overprovisioning: 9044
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-38:job-38) Hosts's actual total RAM: 16190149248 and RAM 
> after applying overprovisioning: 16190149632
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-38:job-38) release cpu from host: 1, old used: 4600,reserved:
> 0, actual total: 9044, total with overprovisioning: 9044; new used: 
> 4500,reserved:0; movedfromreserved: false,moveToReserveredfalse
> 2013-07-09 16:30:22,241 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-38:job-38) release mem from host: 1, old used: 
> 9602859008,reserved: 0, total: 16190149632; new used: 9340715008,reserved:0; 
> movedfromreserved: false,moveToReserveredfalse
> 2013-07-09 16:30:22,256 DEBUG [cloud.capacity.CapacityManagerImpl] 
> (Job-Executor-38:job-38) VM state transitted from :Stopped to Starting with
> event: StartRequestedvm's original host id: null new host id: null host id 
> before state transition: null
> 2013-07-09 16:30:22,256 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-38:job-38) Successfully transitioned to start state for 
> VM[User|hello-16] reservation id = 8b8d8303-0f78-4f03-9ee3-4cc3e129c746
> 2013-07-09 16:30:22,262 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-38:job-38) Trying to deploy VM, vm has dcId: 1 and podId: 1
> 2013-07-09 16:30:22,262 DEBUG [cloud.vm.VirtualMachineManagerImpl] 
> (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null, 
> hosts: null
> 2013-07-09 16:30:22,269 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) Deploy avoids pods: null, clusters: null,
> hosts: null
> 2013-07-09 16:30:22,270 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) DeploymentPlanner allocation algorithm: 
> com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_b2132c10@13aee390
> 2013-07-09 16:30:22,270 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) Trying to allocate a host and storage pools 
> from dc:1, pod:1,cluster:null, requested cpu: 100, requested ram: 262144000
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) Is ROOT volume READY (pool already allocated)?: No
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-38:job-38) Searching resources only under specified Pod: 1
> 2013-07-09 16:30:22,273 DEBUG [cloud.deploy.FirstFitPlanner] 
> (Job-Executor-38:job-38) Listing clusters in order of aggregate capacity, 
> that have (atleast one host with) enough CPU and RAM capacity under this 
> Pod: 1
> 2013-07-09 16:30:22,280 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl] 
> (Job-Executor-38:job-38) Checking resources in Cluster: 1 under Pod: 1
> 2013-07-09 16:30:22,280 DEBUG [allocator.impl.FirstFitAllocator] 
> (Job-Executor-38:job-38 FirstFitRoutingAllocator) Looking for hosts in dc: 1
> pod:1  cluster:1
> 2013-07-09 16:30:22,282 DEBUG [allocator.impl.FirstFitAllocator] 
> (Job-Executor-38:job-38 FirstFitRoutingAllocator) FirstFitAllocator has 1 
> hosts to check for allocation: [Host[-1-Routing]]
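
The grep over management-server.log in the report lists StopCommand entries one by one. As a rough sketch (not part of the original report), the same entries could be tallied per job and VM name to see which deployments were stopped and retried, and how often. The regular expression, the function name stop_counts and the SAMPLE list are assumptions made for illustration; the sample entries are copied from the report and joined onto single lines, since the mail wraps them.

```python
import re
from collections import Counter

# Assumed pattern: job id in "(Job-Executor-N:job-N)" followed by a
# StopCommand payload containing "vmName". Each entry must be on one line.
STOP_RE = re.compile(
    r'\(Job-Executor-\d+:(?P<job>job-\d+)\).*?'
    r'"com\.cloud\.agent\.api\.StopCommand":\{.*?"vmName":"(?P<vm>[^"]+)"'
)


def stop_counts(lines):
    """Count 'Executing' StopCommand entries per (job, VM name)."""
    counts = Counter()
    for line in lines:
        if "Executing" not in line:
            continue
        match = STOP_RE.search(line)
        if match:
            counts[(match.group("job"), match.group("vm"))] += 1
    return counts


# Three entries from the report, joined onto one line each.
SAMPLE = [
    '2013-07-09 16:30:05,707 DEBUG [agent.transport.Request] '
    '(Job-Executor-18:job-18) Seq 1-729350370: Executing:  { Cmd , MgmtId: '
    '7200344900649, via: 1, Ver: v1, Flags: 100011, '
    '[{"com.cloud.agent.api.StopCommand":{"isProxy":false,'
    '"executeInSequence":false,"vmName":"i-3-9-VM","wait":0}}] }',
    '2013-07-09 16:30:32,231 DEBUG [agent.transport.Request] '
    '(Job-Executor-35:job-35) Seq 1-729350390: Executing:  { Cmd , MgmtId: '
    '7200344900649, via: 1, Ver: v1, Flags: 100011, '
    '[{"com.cloud.agent.api.StopCommand":{"isProxy":false,'
    '"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}] }',
    '2013-07-09 16:31:04,826 DEBUG [agent.transport.Request] '
    '(Job-Executor-35:job-35) Seq 1-729350422: Executing:  { Cmd , MgmtId: '
    '7200344900649, via: 1, Ver: v1, Flags: 100011, '
    '[{"com.cloud.agent.api.StopCommand":{"isProxy":false,'
    '"executeInSequence":false,"vmName":"i-3-27-VM","wait":0}}] }',
]

if __name__ == "__main__":
    for (job, vm), n in sorted(stop_counts(SAMPLE).items()):
        print(job, vm, n)
```

To run it against a real log, pass an open file instead of SAMPLE, for example stop_counts(open("management-server.log", errors="replace")). In the entries quoted above, job-35 issues a StopCommand for i-3-27-VM twice, which a per-job tally like this makes easy to spot.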

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
