[
https://issues.apache.org/jira/browse/CLOUDSTACK-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733461#comment-13733461
]
Koushik Das commented on CLOUDSTACK-4179:
-----------------------------------------
Thanks Sowmya for the clarification.
Ok so given that there are 3 VM deployments happening (2 VRs + 1 user VM) the
actual increase per vm is = 75/3 - 15/3 = 25 - 5 = 20 secs. (Assuming 75 secs.
for deploying 3 VMs now and 15 secs. for deploying 3 VMs pre 4.x)
I looked further into the logs (for the attached job id) and see that there are
discrete breaks in time indicating delays. Here is my analysis based on that:
Delay 1: (~30 secs.)
2013-08-07 01:52:09,414 DEBUG [cloud.network.NetworkModelImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Service
SecurityGroup is not supported in the network id=304
2013-08-07 01:52:09,785 DEBUG [cloud.network.NetworkModelImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Service
SecurityGroup is not supported in the network id=304
2013-08-07 01:52:37,288 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Deploy
avoids pods: null, clusters: null, hosts: null
2013-08-07 01:52:37,502 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ])
DeploymentPlanner allocation algorithm:
com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_d3381e09@6b8db7bd
2013-08-07 01:52:37,502 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Trying to
allocate a host and storage pools from d
Delay 2: (~40 secs)
2013-08-07 01:52:41,199 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Trying to
deploy VM, vm has dcId: 1 and podId: null
2013-08-07 01:52:41,199 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ])
advanceStart: DeploymentPlan is provided, using dcId:1, podId: 8, clusterId:
15, hostId: 113, poolId: null
2013-08-07 01:52:41,199 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Deploy
avoids pods: null, clusters: null, hosts: null
2013-08-07 01:53:19,886 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Deploy
avoids pods: null, clusters: null, hosts: null
2013-08-07 01:53:19,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ])
DeploymentPlanner allocation algorithm:
com.cloud.deploy.FirstFitPlanner_EnhancerByCloudStack_d3381e09@6b8db7bd
2013-08-07 01:53:19,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Trying to
allocate a host and storage pools from dc:1, pod:8,cluster:15, requested cpu:
100, requested ram: 125829120
2013-08-07 01:53:19,890 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Is ROOT
volume READY (pool already allocated)?: No
Delay3: (~10 secs.)
2013-08-07 01:53:28,982 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) CPU STATS
after allocation: for host: 1649, old used: 0, old reserved: 0, actual total:
32000, total with overprovisioning: 52800; new used:500, reserved:0; requested
cpu:500,alloc_from_last:false
2013-08-07 01:53:28,982 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) RAM STATS
after allocation: for host: 1649, old used: 0, old reserved: 0, total:
8589934592; new used: 134217728, reserved: 0; requested mem:
134217728,alloc_from_last:false
2013-08-07 01:53:28,995 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) VM is
being created in podId: 104
2013-08-07 01:53:39,090 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Lock is
acquired for network id 200 as a part of network implement
2013-08-07 01:53:39,440 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Network
id=200 is already implemented
2013-08-07 01:53:39,448 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Lock is
released for network id 200 as a part of network implement
2013-08-07 01:53:40,453 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Asking
VirtualRouter to prepare for Nic[782-379-null-172.120.1.103]
Delay4: (~12 secs)
2013-08-07 01:53:56,965 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) CPU STATS
after allocation: for host: 570, old used: 0, old reserved: 0, actual total:
32000, total with overprovisioning: 52800; new used:500, reserved:0; requested
cpu:500,alloc_from_last:false
2013-08-07 01:53:56,965 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) RAM STATS
after allocation: for host: 570, old used: 0, old reserved: 0, total:
8589934592; new used: 134217728, reserved: 0; requested mem:
134217728,alloc_from_last:false
2013-08-07 01:53:56,979 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) VM is
being created in podId: 36
2013-08-07 01:54:07,135 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Lock is
acquired for network id 200 as a part of network implement
2013-08-07 01:54:07,149 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Network
id=200 is already implemented
2013-08-07 01:54:07,150 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Lock is
released for network id 200 as a part of network implement
2013-08-07 01:54:10,810 DEBUG [cloud.network.NetworkManagerImpl]
(Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Asking
VirtualRouter to prepare for Nic[786-381-null-172.120.1.103]
Looking at the code I see that the first 2 delays are due to the check for
avoiding dedicated resources in the planning stage. The method where all this
check happen is checkForNonDedicatedResources() method in
DeploymentPlanningManagerImpl.java. The first 2 delays of more than 30 secs
indicate that the check for dedicated resources is done twice.
> [Performance Testing] Time taken for Deploy VM async job to complete is
> considerably higher
> -------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-4179
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4179
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.2.0
> Environment: Simulated VMs and hosts, advanced zone, RVR
> Reporter: Sowmya Krishnan
> Assignee: Koushik Das
> Priority: Critical
> Fix For: 4.2.0
>
> Attachments: async_job_time.out, job-109.log
>
>
> Following setup is used to create performance test bed:
> Advanced zone, 2 Management Servers, 124 Pods [Each Pod having 2 Clusters]
> 248 Clusters [Each cluster having 8 hosts and one primary storage]
> 2000 Hosts
> 4000 User accounts [Each account having one network]
> ~8000 Virtual Routers
> Following config parameters were used in both the management servers
> - Java heap size = 5 GB
> - db.cloud.maxActive = 250
> -
> db.cloud.url.params=prepStmtCacheSize=517&cachePrepStmts=true&prepStmtCacheSqlLimit=4096&includeInnodbStatusInDeadlockExceptions=true&logSlowQueries=true
> Tried deploying around 4000 Simulator VMs across 4K accounts. Noticing
> considerably higher time for deploy VM async job to complete as compared to
> baseline results.
> Some of the Baseline results are documented here:
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Baseline+Reports+for+Pre-4.x+Performance+Runs
> On an average, it is now taking 70-80 seconds for deployVM to complete as
> compared to 10-15 seconds earlier.
> From the logs, it appears in some cases, DeploymentPlanningManager takes time
> to kick in.:
> 2013-08-07 01:52:09,785 DEBUG [cloud.network.NetworkModelImpl]
> (Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Service
> SecurityGroup is not supported in the network id=304
> 2013-08-07 01:52:37,288 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Deploy
> avoids pods: null, clusters: null, hosts: null
> ...
> ...
> 2013-08-07 01:52:41,199 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Deploy
> avoids pods:
> null, clusters: null, hosts: null
> 2013-08-07 01:53:19,886 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
> (Job-Executor-105:job-109 = [ 19ccfecb-c815-497a-9ff1-70d590025af3 ]) Deploy
> avoid
> s pods: null, clusters: null, hosts: null
> Both VRs are running at the end of deploy VM and also the VM.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira