[ 
https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17604#comment-17604
 ] 

Evgheni Dereveanchin commented on OVIRT-609:
--------------------------------------------

Here are the host logs. As suspected, I did not find any errors: the VM was 
suspended at 09:06:47 MST and resumed at 09:17:26 MST, with Thread-7164775 
returning successfully after copying 33498670080 bytes of RAM to the storage 
domain.

{quote}Thread-7164775::DEBUG::2016-06-23 
09:06:45,911::BindingXMLRPC::1133::vds::(wrapper) client [66.187.230.60]::call 
vmSnapshot with ('e7a7b735-0310-4f88-9ed9-4fed85835a01', [{'baseVolumeID': 
'f37836c6-4bbe-4c8d-abf4-275cf461262e', 'domainID': 
'ba023ff2-4e0e-4a32-86f3-923414206667', 'volumeID': 
'3b105e9b-53fe-4452-be71-2ac2182ecfec', 'imageID': 
'140adf46-fce4-4dba-980d-37d91416b12b'}], 
'ba023ff2-4e0e-4a32-86f3-923414206667,00000002-0002-0002-0002-000000000150,2beb0ee6-b70b-4f48-bdd9-d89650383d61,daef68b9-5967-4047-9b17-1f55b68e5d8a,3580f2a1-a55a-47d0-9e67-627afbc0f2da,6c20093d-a5f3-407a-8986-ca26a488cb20')
 {}
...
Thread-7164775::DEBUG::2016-06-23 09:06:47,459::vm::4432::vm.Vm::(snapshot) 
vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`::<domainsnapshot>
        <disks>
                <disk name="vda" snapshot="external" type="file">
                        <source 
file="/rhev/data-center/00000002-0002-0002-0002-000000000150/ba023ff2-4e0e-4a32-86f3-923414206667/images/140adf46-fce4-4dba-980d-37d91416b12b/3b105e9b-53fe-4452-be71-2ac2182ecfec"
 type="file"/>
                </disk>
        </disks>
        <memory 
file="/rhev/data-center/00000002-0002-0002-0002-000000000150/ba023ff2-4e0e-4a32-86f3-923414206667/images/2beb0ee6-b70b-4f48-bdd9-d89650383d61/daef68b9-5967-4047-9b17-1f55b68e5d8a"
 snapshot="external"/>
</domainsnapshot>
...
libvirtEventLoop::DEBUG::2016-06-23 
09:06:47,645::vm::5571::vm.Vm::(_onLibvirtLifecycleEvent) 
vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`::event Suspended detail 0 opaque 
None
...
Thread-7164775::DEBUG::2016-06-23 
09:17:26,338::outOfProcess::169::Storage.oop::(padToBlockSize) Truncating file 
/rhev/data-center/00000002-0002-0002-0002-000000000150/ba023ff2-4e0e-4a32-86f3-923414206667/images/2beb0ee6-b70b-4f48-bdd9-d89650383d61/daef68b9-5967-4047-9b17-1f55b68e5d8a 
to 33498670080 bytes
...
libvirtEventLoop::DEBUG::2016-06-23 
09:17:26,317::vm::5571::vm.Vm::(_onLibvirtLifecycleEvent) 
vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`::event Resumed detail 0 opaque None
...
Thread-7164775::DEBUG::2016-06-23 
09:17:26,450::BindingXMLRPC::1140::vds::(wrapper) return vmSnapshot with 
{'status': {'message': 'Done', 'code': 0}, 'quiesce': False}
{quote}
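
For anyone who wants to pull the same lifecycle events out of a full vdsm.log, here is a minimal Python sketch. The line layout in the regex is only inferred from the excerpt above, not taken from any vdsm documentation, and the names LINE_RE, EVENT_RE and lifecycle_events are hypothetical helpers made up for this illustration. It also assumes each log entry sits on one line (the mail wrapping seen above is undone).

```python
import re
from datetime import datetime

# Assumed layout of a vdsm log line, inferred only from the excerpt above:
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
LINE_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<func>[^)]+)\) (?P<msg>.*)$"
)
# Lifecycle messages in the excerpt look like "...::event Suspended detail 0 ..."
EVENT_RE = re.compile(r"::event (?P<event>\w+) detail")


def lifecycle_events(lines):
    """Yield (timestamp, event_name) for libvirt lifecycle event lines."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or m.group("func") != "_onLibvirtLifecycleEvent":
            continue
        ev = EVENT_RE.search(m.group("msg"))
        if ev:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            yield ts, ev.group("event")


# The two lifecycle lines from the excerpt, joined back into single lines.
SAMPLE = [
    "libvirtEventLoop::DEBUG::2016-06-23 09:06:47,645::vm::5571::vm.Vm::"
    "(_onLibvirtLifecycleEvent) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`"
    "::event Suspended detail 0 opaque None",
    "libvirtEventLoop::DEBUG::2016-06-23 09:17:26,317::vm::5571::vm.Vm::"
    "(_onLibvirtLifecycleEvent) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`"
    "::event Resumed detail 0 opaque None",
]

events = list(lifecycle_events(SAMPLE))
for ts, name in events:
    print(ts.time(), name)
```

To run it against a real log, pass an open file object instead of SAMPLE; lines that do not fit the assumed layout are silently skipped.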

On the Engine side the operation timed out after 3 minutes, while on the host 
it actually took about 11 minutes. Since the host reports success, the snapshot 
is most likely healthy. I'll take a sosreport from the host in case we need to 
investigate further; maybe [~landgraf] can check the logs for more clues.
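
As a rough sanity check on those numbers, the sketch below recomputes the pause from the Suspended and Resumed timestamps in the host log and derives an average write rate. It rests on two assumptions: the 3-minute Engine timeout is taken from the event log at minute granularity, and the 33498670080 bytes is the padded size of the memory file from the padToBlockSize line, so the rate is only an approximation.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S,%f"

# Timestamps of the libvirt Suspended / Resumed events from the host log.
suspended = datetime.strptime("2016-06-23 09:06:47,645", FMT)
resumed = datetime.strptime("2016-06-23 09:17:26,317", FMT)

# Size of the memory dump as reported by the padToBlockSize line.
memory_bytes = 33498670080

# Assumption: Engine gave up after roughly 3 minutes (18:06 -> 18:09 events).
engine_timeout = timedelta(minutes=3)

paused = resumed - suspended
rate_mib_s = memory_bytes / paused.total_seconds() / 2**20

print(f"VM paused for {paused} ({paused.total_seconds() / 60:.1f} min)")
print(f"Memory dump size: {memory_bytes / 2**30:.1f} GiB")
print(f"Average write rate: {rate_mib_s:.1f} MiB/s")
print(f"Exceeded Engine timeout: {paused > engine_timeout}")
```

This works out to a pause of a bit under 11 minutes and an average of roughly 50 MiB/s to the storage domain, assuming the dump ran for the whole time the VM was paused.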

> Jenkins snapshot creation failed
> --------------------------------
>
>                 Key: OVIRT-609
>                 URL: https://ovirt-jira.atlassian.net/browse/OVIRT-609
>             Project: oVirt - virtualization made easy
>          Issue Type: Bug
>            Reporter: Evgheni Dereveanchin
>            Assignee: infra
>
> [[email protected]] issued a live snapshot creation on the Jenkins VM to 
> prepare it for cluster move. This failed and it's not really clear why. 
> Relevant event logs are below; they suggest that the hypervisor started 
> dumping VM memory to the snapshot, which caused a storage slowdown.
> {quote}2016-Jun-23, 18:06 Snapshot 'ngoldin_before_cluster_move' creation for 
> VM 'jenkins-phx-ovirt-org' was initiated by admin.
> 2016-Jun-23, 18:09 Failed to create live snapshot 
> 'ngoldin_before_cluster_move' for VM 'jenkins-phx-ovirt-org'. VM restart is 
> recommended. Note that using the created snapshot might cause data 
> inconsistency.
> 2016-Jun-23, 18:13 Host ovirt-srv02 has network interface which exceeded the 
> defined threshold [95%] (em1: transmit rate[100%], receive rate [0%])
> 2016-Jun-23, 18:13 Storage domain Production experienced a high latency of 
> 18.7802 seconds from host ovirt-srv11. This may cause performance and 
> functional issues. Please consult your Storage Administrator.{quote}



--
This message was sent by Atlassian JIRA
(v1000.98.4#100004)
_______________________________________________
Infra mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/infra
