Sailaja Mada created CLOUDSTACK-4593:
----------------------------------------
Summary: [VMWARE] [Upgrade]Livestorage Migration & VM Snapshot
features are not fully functional after upgrade
Key: CLOUDSTACK-4593
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4593
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: Storage Controller, Upgrade, VMware
Affects Versions: 4.2.1
Reporter: Sailaja Mada
Steps:
Upgraded setup :
307 Setup with 2 clusters ( Cluster1 – 2 ESXi 5.0 hosts , Cluster2 – Esxi 4.1
host)
1. 11 Vm’s deployed with 2 VM’s in stopped state
2. VM’s with ROOT volume, 2 DATA volumes, 3 DATA volumes
3. Snapshots , Template / volumes from this snapshot
4. Detached volumes
5. Empty volumes which are not attached to any instance
6. Cluster with 2 primary storage's
Upgraded to 4.2
I got into below failures with VM Snapshot if tried without Stop/Start of the
VM Post upgrade :
2013-09-02 21:35:36,733 WARN [vmware.resource.VmwareResource]
(DirectAgent-320:10.102.192.23) StartCommand failed due to Exception:
java.lang.RuntimeException
Message: File <unspecified filename> was not found
java.lang.RuntimeException: File <unspecified filename> was not found
at
com.cloud.hypervisor.vmware.util.VmwareClient.waitForTask(VmwareClient.java:378)
at
com.cloud.hypervisor.vmware.mo.VirtualMachineMO.powerOn(VirtualMachineMO.java:188)
at
com.cloud.hypervisor.vmware.resource.VmwareResource.execute(VmwareResource.java:2966)
at
com.cloud.hypervisor.vmware.resource.VmwareResource.executeRequest(VmwareResource.java:519)
at
com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:186)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2013-09-02 21:35:36,742 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-320:null) Seq 6-949618439: Response Received:
2013-09-02 21:35:43,972 INFO [storage.volume.VolumeServiceImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Volume 52
is not referred anywhere, remove it from volumes table
2013-09-02 21:35:43,981 ERROR [cloud.storage.VolumeManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) migrate
volume failed:copy volume from primary to secondary failed due to exception:
Exception: java.lang.Exception
Message: Unable to find related disk device for volume. volume path:
ROOT-14-30-000004
2013-09-02 21:35:43,990 INFO [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Unable to
contact resource.
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:203] is
unreachable: migrate volume failed: copy volume from primary to secondary
failed due to exception: Exception: java.lang.Exception
Message: Unable to find related disk device for volume. volume path:
ROOT-14-30-000004
at
com.cloud.storage.VolumeManagerImpl.migrateVolume(VolumeManagerImpl.java:2254)
at
com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at
com.cloud.storage.VolumeManagerImpl.prepare(VolumeManagerImpl.java:2590)
at
com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:888)
at
com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:578)
at
org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:237)
at
org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3406)
at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:1948)
at
com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at
org.apache.cloudstack.api.command.user.vm.StartVMCmd.execute(StartVMCmd.java:120)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
at
com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2013-09-02 21:35:44,008 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Cleaning
up resources for the vm VM[User|instance6withdefaulttempalte] in Starting state
2013-09-02 21:35:44,009 DEBUG [agent.transport.Request]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Seq
8-343475044: Sending { Cmd , MgmtId: 227594284004867, via: 8, Ver: v1, Flags:
100111,
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"i-3-14-VM","wait":0}}]
}
2013-09-02 21:35:44,009 DEBUG [agent.transport.Request]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Seq
8-343475044: Executing: { Cmd , MgmtId: 227594284004867, via: 8, Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"i-3-14-VM","wait":0}}]
}
2013-09-02 21:35:44,010 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-106:null) Seq 8-343475044: Executing request
2013-09-02 21:35:44,010 INFO [vmware.resource.VmwareResource]
(DirectAgent-106:10.102.192.14) Executing resource StopCommand:
{"isProxy":false,"executeInSequence":true,"vmName":"i-3-14-VM","wait":0}
But finally VM got started moving to other host in the cluser: Note : There is
no gurantee that this feature works fine without restart of the VM . After this
chain info got updated (chain_info:
{"diskDeviceBusName":"ide0:1","diskChain":["[61105200a56c3a5f9c83783d3d36a9d1]
i-3-14-VM/436fd32b103746d29771a2b10df35cf2.vmdk"]})
2013-09-02 21:35:44,165 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Trying to
find a potenial host and associated storage pools from the suitable host/pool
lists for this VM
2013-09-02 21:35:44,165 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Checking
if host: 1 can access any suitable storage pool for volume: ROOT
2013-09-02 21:35:44,166 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Host: 1
can access pool: 201
2013-09-02 21:35:44,168 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Found a
potential host id: 1 name: 10.102.192.20 and associated storage pools for this
VM
2013-09-02 21:35:44,169 DEBUG [cloud.deploy.DeploymentPlanningManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Returning
Deployment Destination:
Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
: Dest[Zone(1)-Pod(1)-Cluster(1)-Host(1)-Storage(Volume(30|ROOT-->Pool(201))]
2013-09-02 21:35:44,169 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) Deployment
found - P0=VM[User|instance6withdefaulttempalte],
P0=Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))]
: Dest[Zone(1)-Pod(1)-Cluster(1)-Host(1)-Storage(Volume(30|ROOT-->Pool(201))]
2013-09-02 21:35:44,176 DEBUG [cloud.server.StatsCollector]
(StatsCollector-3:null) HostStatsCollector is running...
2013-09-02 21:35:44,191 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-75:job-151 = [ 3e8e2327-7c56-45ea-a653-ee5b89d75d57 ]) VM state
transitted from :Starting to Starting with event: OperationRetryvm's original
host id: 6 new host id: 1 host id before state transition: 8
Second Scenario:
1. Create a VM in 307
2. Do not perform any stop and start after upgrade to 4.2
3. Live storage migrate to second cluster , then take a volume snapshot
4. Create the template from this volume snapshot after live storage migration
5. Try deployment of the VM using this template. Deployment works and VM is
up & running( I could login and create files also) .
6. But during deployment there were errors while creating the Volume. Later it
finds one more storage pool / host and creates volume on this new storage pool.
Finally deploy VM works fine,
ERROR:
2013-09-02 23:51:11,130 INFO [storage.volume.VolumeServiceImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) releasing
lock for VMTemplateStoragePool 19
2013-09-02 23:51:11,131 WARN [utils.db.Merovingian2] (Job-Executor-106:job-182
= [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Was unable to find lock for the key
template_spool_ref19 and thread id 1340523980
2013-09-02 23:51:11,131 DEBUG [cloud.storage.VolumeManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Unable to
create Vol[59|vm=25|ROOT]:Unable to copy template to primary storage due to
exception:Exception: javax.xml.ws.soap.SOAPFaultException
Message:
Required parameter spec is missing
while parsing call information for method ImportVApp
at line 1, column 110
while parsing SOAP body
at line 1, column 102
while parsing SOAP envelope
at line 1, column 38
while parsing HTTP request for method importVApp
on object of type vim.ResourcePool
at line 1, column 0
2013-09-02 23:51:11,131 INFO [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Unable to
contact resource.
com.cloud.exception.StorageUnavailableException: Resource [StoragePool:202] is
unreachable: Unable to create Vol[59|vm=25|ROOT]:Unable to copy template to
primary storage due to exception:Exception: javax.xml.ws.soap.SOAPFaultException
Message:
Required parameter spec is missing
while parsing call information for method ImportVApp
at line 1, column 110
while parsing SOAP body
at line 1, column 102
while parsing SOAP envelope
at line 1, column 38
while parsing HTTP request for method importVApp
on object of type vim.ResourcePool
at line 1, column 0
at
com.cloud.storage.VolumeManagerImpl.recreateVolume(VolumeManagerImpl.java:2544)
at
com.cloud.storage.VolumeManagerImpl.prepare(VolumeManagerImpl.java:2592)
at
com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:888)
at
com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:578)
at
org.apache.cloudstack.engine.cloud.entity.api.VMEntityManagerImpl.deployVirtualMachine(VMEntityManagerImpl.java:227)
at
org.apache.cloudstack.engine.cloud.entity.api.VirtualMachineEntityImpl.deploy(VirtualMachineEntityImpl.java:209)
at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:3406)
at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2966)
at
com.cloud.vm.UserVmManagerImpl.startVirtualMachine(UserVmManagerImpl.java:2952)
at
com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
at
org.apache.cloudstack.api.command.user.vm.DeployVMCmd.execute(DeployVMCmd.java:420)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
at
com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2013-09-02 23:51:11,139 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Cleaning
up resources for the vm VM[User|xxxx] in Starting state
2013-09-02 23:51:11,141 DEBUG [agent.transport.Request]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Seq
6-949618915: Sending { Cmd , MgmtId: 227594284004867, via: 6, Ver: v1, Flags:
100111,
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"i-3-25-VM","wait":0}}]
}
2013-09-02 23:51:11,141 DEBUG [agent.transport.Request]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Seq
6-949618915: Executing: { Cmd , MgmtId: 227594284004867, via: 6, Ver: v1,
Flags: 100111,
[{"com.cloud.agent.api.StopCommand":{"isProxy":false,"executeInSequence":true,"vmName":"i-3-25-VM","wait":0}}]
}
2013-09-02 23:51:11,141 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-91:null) Seq 6-949618915: Executing request
2013-09-02 23:51:11,141 INFO [vmware.resource.VmwareResource]
(DirectAgent-91:10.102.192.23) Executing resource StopCommand:
{"isProxy":false,"executeInSequence":true,"vmName":"i-3-25-VM","wait"
Deploy VM – PASS:
2013-09-02 23:51:29,848 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-82:null) Seq 2-618137852: Response Received:
2013-09-02 23:51:29,849 DEBUG [agent.transport.Request] (DirectAgent-82:null)
Seq 2-618137852: Processing: { Ans: , MgmtId: 227594284004867, via: 2, Ver:
v1, Flags: 110,
[{"com.cloud.agent.api.StartAnswer":{"vm":{"id":25,"name":"i-3-25-VM","bootloader":"HVM","type":"User","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":536870912,"maxRam":536870912,"hostName":"xxxx","arch":"x86_64","os":"CentOS
5.3
(64-bit)","bootArgs":"","rebootOnCrash":false,"enableHA":false,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"95b29c111ca96aee","params":{"nicAdapter":"E1000","vmware.reserve.cpu":"false","nestedVirtualizationFlag":"false","Message.ReservedCapacityFreed.Flag":"false","rootDiskController":"ide","vmware.reserve.mem":"false"},"uuid":"021f2108-2c83-485a-ab6a-e3eab3fc585c","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"b14d9ecc-0474-4aa7-9b7d-3f8620381e64","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"61105200-a56c-3a5f-9c83-783d3d36a9d1","id":201,"poolType":"NetworkFilesystem","host":"10.102.192.100","path":"/cpg_vol/sailaja/workps2","port":2049}},"name":"ROOT-25","size":2147483648,"path":"ROOT-25","volumeId":59,"vmName":"i-3-25-VM","accountId":3,"chainInfo":"{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[61105200a56c3a5f9c83783d3d36a9d1]
i-3-25-VM/ROOT-25.vmdk\",\"[61105200a56c3a5f9c83783d3d36a9d1]
6a9fa9a8a4af3849a58167751abc214f/6a9fa9a8a4af3849a58167751abc214f.vmdk\"]}","format":"OVA","id":59,"hypervisorType":"VMware"}},"diskSeq":0,"type":"ROOT"},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":0,"format":"ISO","accountId":0,"hvm":false}},"diskSeq":3,"type":"ISO"}],"nics":[{"deviceId":0,"networkRateMbps":200,"defaultNic":true,"uuid":"710aa816-6e4a-4f2e-b876-3bb901e881e4","ip":"10.1.1.12","netmask":"255.255.255.0","gateway":"10.1.1.1","mac":"02:00:4c:e2:00:0c","dns1":"10.103.128.15","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://786","isolationUri":"vlan://786","isSecurityGroupEnabled":false}]},"result":true,"wait":0}}]
}
2013-09-02 23:51:29,849 DEBUG [agent.transport.Request]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Seq
2-618137852: Received: { Ans: , MgmtId: 227594284004867, via: 2, Ver: v1,
Flags: 110, { StartAnswer } }
2013-09-02 23:51:29,852 DEBUG [agent.manager.AgentAttache]
(DirectAgent-82:null) Seq 2-618137852: No more commands found
2013-09-02 23:51:29,872 INFO [cloud.storage.VolumeManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Update
volume disk chain info. vol: 59, ROOT-25 -> ROOT-25, null ->
{"diskDeviceBusName":"ide0:1","diskChain":["[61105200a56c3a5f9c83783d3d36a9d1]
i-3-25-VM/ROOT-25.vmdk","[61105200a56c3a5f9c83783d3d36a9d1]
6a9fa9a8a4af3849a58167751abc214f/6a9fa9a8a4af3849a58167751abc214f.vmdk"]}
2013-09-02 23:51:29,919 DEBUG [cloud.network.NetworkModelImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Service
SecurityGroup is not supported in the network id=204
2013-09-02 23:51:29,924 DEBUG [cloud.network.NetworkModelImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Service
SecurityGroup is not supported in the network id=204
2013-09-02 23:51:29,931 DEBUG [cloud.capacity.CapacityManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) VM state
transitted from :Starting to Running with event: OperationSucceededvm's
original host id: 2 new host id: 2 host id before state transition: 2
2013-09-02 23:51:29,932 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Start
completed for VM VM[User|xxxx]
2013-09-02 23:51:29,951 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Complete
async job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ], jobStatus: 1,
resultCode: 0, result:
org.apache.cloudstack.api.response.UserVmResponse@3ac3beea
2013-09-02 23:51:29,963 DEBUG [cloud.async.AsyncJobManagerImpl]
(Job-Executor-106:job-182 = [ ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]) Done
executing org.apache.cloudstack.api.command.user.vm.DeployVMCmd for job-182 = [
ed8268f7-bb81-4f02-a94b-3500d4f1d5cf ]
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira