[
https://issues.apache.org/jira/browse/CLOUDSTACK-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805701#comment-13805701
]
Ivan Kozlov commented on CLOUDSTACK-4939:
-----------------------------------------
I think I know what's going on. The management server sends the commands to
create/backup/delete snapshots to a random host in the cluster. Depending on
whether the VM is running on that host, the host uses libvirt or qemu-img.
For example, host1 creates the snapshot using qemu-img. The snapshot backup
command is then sent to host2, where the VM is running. host2 tries to back up
the snapshot using libvirt, but no snapshot is visible for that domain (because
it was created with qemu-img), so the backup fails. The more hosts there are in
the cluster, the more likely a snapshot failure becomes.
So we need to:
1. Check whether the VM is running before taking the snapshot.
2. If the VM is running, send all commands (create/backup/delete snapshot) only
to the host where it is running.
I think this should solve the issue. Could someone take this on?
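To make the proposal concrete, here is a minimal sketch of the host selection
I have in mind. It is not CloudStack code: the class SnapshotHostSelector, the
method pickHost and the host names are made up for illustration, and the real
fix would have to live wherever the management server picks the endpoint for
snapshot commands.

```java
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical illustration only; none of these names exist in CloudStack.
public class SnapshotHostSelector {

    /**
     * Picks the host that should receive a snapshot command.
     * If the VM is running, every command (create/backup/delete) goes to the
     * host it runs on, so the same tool (libvirt) handles all of them.
     * If the VM is stopped, any host in the cluster may be used.
     */
    public static String pickHost(Optional<String> runningOnHost,
                                  List<String> clusterHosts,
                                  Random rng) {
        if (runningOnHost.isPresent()) {
            return runningOnHost.get();
        }
        if (clusterHosts.isEmpty()) {
            throw new IllegalStateException("no hosts available in cluster");
        }
        return clusterHosts.get(rng.nextInt(clusterHosts.size()));
    }

    public static void main(String[] args) {
        List<String> hosts = List.of("host1", "host2");
        Random rng = new Random();

        // Assumed state: the VM is currently running on host2.
        Optional<String> runningOn = Optional.of("host2");

        for (String cmd : List.of("create", "backup", "delete")) {
            System.out.println(cmd + " snapshot -> " + pickHost(runningOn, hosts, rng));
        }
    }
}
```

With the VM running on host2, all three commands are routed to host2, so the
backup step looks for the snapshot on the same host and with the same tool
that created it. The random choice is kept only for stopped VMs.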
> Failed to create snapshot (KVM, GFS2)
> -------------------------------------
>
> Key: CLOUDSTACK-4939
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4939
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: KVM, Snapshot
> Affects Versions: 4.2.0, 4.2.1
> Environment: CentOS 6.4, KVM, Shared mount point primary storage,
> GFS2, iSCSI
> Reporter: Ivan Kozlov
> Priority: Blocker
> Labels: kvm, sharedstorage, snapshot
> Fix For: 4.2.1
>
>
> With one host, snapshots are created fine. After adding a second host, some
> snapshots fail (Failed to create snapshot due to an internal error creating
> snapshot for volume 14) and get stuck in the state "CreatedOnPrimary". This
> happens even when all VMs are running on the same host.
> The libvirt debug log shows:
> 2013-10-23 17:31:21.634+0000: 20007: debug :
> virStorageFileGetMetadataInternal:673 :
> path=/mnt/48a148f6-3373-3af2-8667-2f240988163d/snapshots, fd=31, format=2
> 2013-10-23 17:32:57.189+0000: 20015: debug : qemuSnapObjFromName:233 : Domain
> snapshot not found: no domain snapshot with matching name
> '909848a0-b3ec-4657-a53a-c449dc24365b'
> 2013-10-23 17:32:57.474+0000: 20009: debug :
> virStorageFileGetMetadataInternal:673 :
> path=/mnt/48a148f6-3373-3af2-8667-2f240988163d/snapshots, fd=31, format=2
> 2013-10-23 17:34:28.264+0000: 20008: debug : qemuSnapObjFromName:233 : Domain
> snapshot not found: no domain snapshot with matching name
> 'f4e51b11-ac79-4a6a-b887-8926ffbd5cca'
> management server log:
> 2013-10-23 20:29:50,561 INFO [user.snapshot.CreateSnapshotCmd]
> (Job-Executor-52:job-94 = [ 42f8d6e0-762e-4f01-a7d5-daff2e31be13 ]) VOLSS:
> createSnapshotCmd starts:1382549390561
> 2013-10-23 20:29:52,053 DEBUG [agent.transport.Request]
> (Job-Executor-52:job-94 = [ 42f8d6e0-762e-4f01-a7d5-daff2e31be13 ]) Seq
> 6-1170407437: Waiting for Seq 1170407434 Scheduling: { Cmd , MgmtId:
> 161342718518, via: 6, Ver: v1, Flags: 100111,
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"/primary/d59c6574-8ff9-41e4-86e5-ce560f30d717/f4e51b11-ac79-4a6a-b887-8926ffbd5cca","volume":{"uuid":"02c07659-59d3-42f2-8928-1d899cef94e7","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"2c8e7b93-2d02-4c47-99ce-7bcd8670554a","id":2,"poolType":"SharedMountPoint","host":"localhost","path":"/primary","port":0}},"name":"ROOT-14","size":8589934592,"path":"d59c6574-8ff9-41e4-86e5-ce560f30d717","volumeId":14,"vmName":"i-2-14-VM","accountId":2,"format":"QCOW2","id":14,"hypervisorType":"KVM"},"parentSnapshotPath":"/primary/d59c6574-8ff9-41e4-86e5-ce560f30d717/ab317705-7368-4a40-9d1c-da2c8a7b1824","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"2c8e7b93-2d02-4c47-99ce-7bcd8670554a","id":2,"poolType":"SharedMountPoint","host":"localhost","path":"/primary","port":0}},"vmName":"i-2-14-VM","name":"t1_ROOT-14_20131023172950","hypervisorType":"KVM","id":33}},"destTO":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/2/14","volume":{"uuid":"02c07659-59d3-42f2-8928-1d899cef94e7","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"2c8e7b93-2d02-4c47-99ce-7bcd8670554a","id":2,"poolType":"SharedMountPoint","host":"localhost","path":"/primary","port":0}},"name":"ROOT-14","size":8589934592,"path":"d59c6574-8ff9-41e4-86e5-ce560f30d717","volumeId":14,"vmName":"i-2-14-VM","accountId":2,"format":"QCOW2","id":14,"hypervisorType":"KVM"},"parentSnapshotPath":"snapshots/2/14/ab317705-7368-4a40-9d1c-da2c8a7b1824","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://192.168.10.31/export/secondary","_role":"Image"}},"vmName":"i-2-14-VM","name":"t1_ROOT-14_20131023172950","hypervisorType":"KVM","id":33}},"executeInSequence":true,"wait":21600}}]
> }
> 2013-10-23 20:31:21,560 DEBUG [agent.transport.Request]
> (AgentManager-Handler-8:null) Seq 6-1170407434: Processing: { Ans: , MgmtId:
> 161342718518, via: 6, Ver: v1, Flags: 110,
> [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":false,"details":"org.libvirt.LibvirtException:
> Domain snapshot not found: no domain snapshot with matching name
> '65113136-dfb5-4cea-8e65-1065462ca2fe'","wait":0}}] }
> 2013-10-23 20:31:21,832 DEBUG [storage.snapshot.SnapshotManagerImpl]
> (Job-Executor-49:job-91 = [ e2bf2454-4273-4a89-bc38-35add8297eb1 ]) Failed to
> create snapshot
> com.cloud.utils.exception.CloudRuntimeException:
> org.libvirt.LibvirtException: Domain snapshot not found: no domain snapshot
> with matching name '65113136-dfb5-4cea-8e65-1065462ca2fe'
> at
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
> at
> org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
> 2013-10-23 20:31:21,999 DEBUG [storage.volume.VolumeServiceImpl]
> (Job-Executor-49:job-91 = [ e2bf2454-4273-4a89-bc38-35add8297eb1 ]) Take
> snapshot: 18 failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1040)
> at
> org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
> Caused by: com.cloud.utils.exception.CloudRuntimeException:
> org.libvirt.LibvirtException: Domain snapshot not found: no domain snapshot
> with matching name '65113136-dfb5-4cea-8e65-1065462ca2fe'
> at
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
> 2013-10-23 20:31:22,167 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-49:job-91 = [ e2bf2454-4273-4a89-bc38-35add8297eb1 ]) Complete
> async job-91 = [ e2bf2454-4273-4a89-bc38-35add8297eb1 ], jobStatus: 2,
> resultCode: 530, result: Error Code: 530 Error text: Failed to create
> snapshot due to an internal error creating snapshot for volume 18
> 2013-10-23 20:31:24,709 DEBUG [agent.transport.Request]
> (AgentManager-Handler-13:null) Seq 9-1437990929: Processing: { Ans: ,
> MgmtId: 161342718518, via: 9, Ver: v1, Flags: 110,
> [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"newData":{"org.apache.cloudstack.storage.to.SnapshotObjectTO":{"path":"snapshots/2/16/157016cb-5e57-428f-b747-5d9b628d2864","id":0}},"result":true,"wait":0}}]
> }
> 2013-10-23 20:31:25,760 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-51:job-93 = [ 25e157c0-f966-401e-9263-c42dac56e0c1 ]) Done
> executing org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd
> for job-93 = [ 25e157c0-f966-401e-9263-c42dac56e0c1 ]
> 2013-10-23 20:32:57,416 DEBUG [agent.transport.Request]
> (AgentManager-Handler-8:null) Seq 6-1170407435: Processing: { Ans: , MgmtId:
> 161342718518, via: 6, Ver: v1, Flags: 110,
> [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":false,"details":"org.libvirt.LibvirtException:
> Domain snapshot not found: no domain snapshot with matching name
> '909848a0-b3ec-4657-a53a-c449dc24365b'","wait":0}}] }
> 2013-10-23 20:32:57,680 DEBUG [storage.snapshot.SnapshotManagerImpl]
> (Job-Executor-50:job-92 = [ b8bbb5be-54ba-43df-b429-5b5fb61416ad ]) Failed to
> create snapshot
> com.cloud.utils.exception.CloudRuntimeException:
> org.libvirt.LibvirtException: Domain snapshot not found: no domain snapshot
> with matching name '909848a0-b3ec-4657-a53a-c449dc24365b'
> at
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
> at
> org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
> 2013-10-23 20:32:57,763 DEBUG [storage.volume.VolumeServiceImpl]
> (Job-Executor-50:job-92 = [ b8bbb5be-54ba-43df-b429-5b5fb61416ad ]) Take
> snapshot: 17 failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1040)
> at
> org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
> Caused by: com.cloud.utils.exception.CloudRuntimeException:
> org.libvirt.LibvirtException: Domain snapshot not found: no domain snapshot
> with matching name '909848a0-b3ec-4657-a53a-c449dc24365b'
> at
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
> 2013-10-23 20:32:57,849 DEBUG [cloud.async.AsyncJobManagerImpl]
> (Job-Executor-50:job-92 = [ b8bbb5be-54ba-43df-b429-5b5fb61416ad ]) Complete
> async job-92 = [ b8bbb5be-54ba-43df-b429-5b5fb61416ad ], jobStatus: 2,
> resultCode: 530, result: Error Code: 530 Error text: Failed to create
> snapshot due to an internal error creating snapshot for volume 17
> 2013-10-23 20:33:50,627 DEBUG [storage.snapshot.SnapshotSchedulerImpl]
> (SnapshotPollTask:null) Snapshot scheduler.poll is being called at 2013-10-23
> 17:33:50 GMT
> 2013-10-23 20:33:50,627 DEBUG [storage.snapshot.SnapshotSchedulerImpl]
> (SnapshotPollTask:null) Got 0 snapshots to be executed at 2013-10-23 17:33:50
> GMT
> 2013-10-23 20:34:28,514 DEBUG [agent.transport.Request]
> (AgentManager-Handler-3:null) Seq 6-1170407437: Processing: { Ans: , MgmtId:
> 161342718518, via: 6, Ver: v1, Flags: 110,
> [{"org.apache.cloudstack.storage.command.CopyCmdAnswer":{"result":false,"details":"org.libvirt.LibvirtException:
> Domain snapshot not found: no domain snapshot with matching name
> 'f4e51b11-ac79-4a6a-b887-8926ffbd5cca'","wait":0}}] }
> 2013-10-23 20:34:28,779 DEBUG [storage.snapshot.SnapshotManagerImpl]
> (Job-Executor-52:job-94 = [ 42f8d6e0-762e-4f01-a7d5-daff2e31be13 ]) Failed to
> create snapshot
> com.cloud.utils.exception.CloudRuntimeException:
> org.libvirt.LibvirtException: Domain snapshot not found: no domain snapshot
> with matching name 'f4e51b11-ac79-4a6a-b887-8926ffbd5cca'
> at
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)
> at
> org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
> 2013-10-23 20:34:28,870 DEBUG [storage.volume.VolumeServiceImpl]
> (Job-Executor-52:job-94 = [ 42f8d6e0-762e-4f01-a7d5-daff2e31be13 ]) Take
> snapshot: 14 failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to create snapshot
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1040)
> at
> org.apache.cloudstack.api.command.user.snapshot.CreateSnapshotCmd.execute(CreateSnapshotCmd.java:170)
> Caused by: com.cloud.utils.exception.CloudRuntimeException:
> org.libvirt.LibvirtException: Domain snapshot not found: no domain snapshot
> with matching name 'f4e51b11-ac79-4a6a-b887-8926ffbd5cca'
> at
> org.apache.cloudstack.storage.snapshot.SnapshotServiceImpl.backupSnapshot(SnapshotServiceImpl.java:280)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.backupSnapshot(XenserverSnapshotStrategy.java:138)
> at
> org.apache.cloudstack.storage.snapshot.XenserverSnapshotStrategy.takeSnapshot(XenserverSnapshotStrategy.java:264)
> at
> com.cloud.storage.snapshot.SnapshotManagerImpl.takeSnapshot(SnapshotManagerImpl.java:1013)