The VMs are identical: same template, same CPU/memory/NIC. Server type, thin provisioned, on NFS (the backend is glusterfs 3.4).
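In case it helps, here is roughly how I would confirm on the host that the disks are still thin (template-backed) rather than cloned. This is only a sketch: the path is pieced together from the UUIDs in the vdsm trace plus the usual <domain>/images/<imgUUID>/<volUUID> layout, so treat it as illustrative:

# A thin, template-based disk should report a "backing file" line pointing at the template volume;
# a cloned disk should not. Path assembled from the vdsm log; the images/ layout is assumed.
qemu-img info /rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/images/97c9108f-a506-415f-ad2c-370d707cb130/61f82f7f-18e4-4ea8-9db3-71ddd9d4e836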
Does monitor = spice console? I don't believe either of them had a spice connection.

I don't see any warnings or errors in the ovirt001 sanlock.log:

2014-02-14 11:16:05-0500 255246 [5111]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:16:05-0500 255246 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:15-0500 255256 [5110]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:16:15-0500 255256 [5110]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:25-0500 255266 [5111]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:16:25-0500 255266 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:36-0500 255276 [5110]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:16:36-0500 255276 [5110]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:46-0500 255286 [5111]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:16:46-0500 255286 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:56-0500 255296 [5110]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:16:56-0500 255296 [5110]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:17:06-0500 255306 [5111]: cmd_inq_lockspace 4,14 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:17:06-0500 255306 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:17:06-0500 255307 [5105]: cmd_register ci 4 fd 14 pid 31132
2014-02-14 11:17:06-0500 255307 [5105]: cmd_restrict ci 4 fd 14 pid 31132 flags 1
2014-02-14 11:17:16-0500 255316 [5110]: cmd_inq_lockspace 5,15 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:17:16-0500 255316 [5110]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_inq_lockspace 5,15 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:26-0500 255326 [5110]: cmd_acquire 4,14,31132 ci_in 5 fd 15 count 0
2014-02-14 11:17:26-0500 255326 [5110]: cmd_acquire 4,14,31132 result 0 pid_dead 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_acquire 4,14,31132 ci_in 6 fd 16 count 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_acquire 4,14,31132 result 0 pid_dead 0
2014-02-14 11:17:36-0500 255336 [5110]: cmd_inq_lockspace 5,15 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:17:36-0500 255336 [5110]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:39-0500 255340 [5105]: cmd_register ci 5 fd 15 pid 31319
2014-02-14 11:17:39-0500 255340 [5105]: cmd_restrict ci 5 fd 15 pid 31319 flags 1
2014-02-14 11:17:39-0500 255340 [5105]: client_pid_dead 5,15,31319 cmd_active 0 suspend 0
2014-02-14 11:17:46-0500 255346 [5111]: cmd_inq_lockspace 5,15 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:17:46-0500 255346 [5111]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:56-0500 255356 [5110]: cmd_inq_lockspace 5,15 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0
2014-02-14 11:17:56-0500 255356 [5110]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:18:06-0500 255366 [5111]: cmd_inq_lockspace 5,15 a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0 flags 0

The ovirt002 sanlock.log has no entries during that time frame.
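For reference, here is roughly how I scanned both hosts for anything above the normal chatter (a minimal sketch, assuming sanlock is logging to the default /var/log/sanlock.log and root SSH access to both hosts):

# Look for warnings/errors around the migration window (11:10-11:19) on both hosts.
for h in ovirt001 ovirt002; do
    echo "== $h =="
    ssh root@"$h" "grep -iE 'error|warn' /var/log/sanlock.log | grep '2014-02-14 11:1'"
done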
*Steve Dainard*
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*

*Blog <http://miovision.com/blog> | LinkedIn <https://www.linkedin.com/company/miovision-technologies> | Twitter <https://twitter.com/miovision> | Facebook <https://www.facebook.com/miovision>*
------------------------------
Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON, Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately.


On Mon, Feb 17, 2014 at 12:59 PM, Dafna Ron <[email protected]> wrote:

> mmm... that is very interesting...
> Both VMs are identical? Are they server or desktop type? Created as thin copy or clone? What storage type are you using? Did you happen to have an open monitor on the VM that failed migration?
> I wonder if it could be a sanlock lock on the source template, but I can only see this bug happening if the VMs are linked to the template.
> Can you look at the sanlock log and see if there are any warnings or errors?
>
> All logs are in debug, so I don't think we can get anything more from them, but I am adding Meital and Omer to this mail to help debug this - perhaps they can think of something that could cause that from the trace.
>
> This case is really interesting... sorry, probably not what you want to hear... thanks for helping with this :)
>
> Dafna
>
>
> On 02/17/2014 05:08 PM, Steve Dainard wrote:
>
>> Failed live migration is more widespread than these two VMs, but they are a good example because they were both built from the same template and have no modifications after they were created. They were also migrated one after the other, with one migrating successfully and the other not.
>>
>> Are there any increased logging levels that might help determine what the issue is?
>>
>> Thanks,
>>
>> Steve Dainard
>>
>> On Mon, Feb 17, 2014 at 11:47 AM, Dafna Ron <[email protected]> wrote:
>>
>> Did you install these VMs from a CD? Did you run them as run-once with a special monitor?
>> Try to think whether there is anything different in the configuration of these VMs compared to the other VMs that migrate successfully.
>>
>> On 02/17/2014 04:36 PM, Steve Dainard wrote:
>>
>> Hi Dafna,
>>
>> No snapshots of either of those VMs have been taken, and there are no updates for any of those packages on EL 6.5.
>>
>> Steve Dainard
>>
>> On Sun, Feb 16, 2014 at 7:05 AM, Dafna Ron <[email protected]> wrote:
>>
>> Does the VM that fails migration have a live snapshot? If so, how many snapshots does the VM have?
>> I think there are newer packages of vdsm, libvirt and qemu - can you try to update?
>>
>> On 02/16/2014 12:33 AM, Steve Dainard wrote:
>>
>> Versions are the same:
>>
>> [root@ovirt001 ~]# rpm -qa | egrep 'libvirt|vdsm|qemu' | sort
>> gpxe-roms-qemu-0.9.7-6.10.el6.noarch
>> libvirt-0.10.2-29.el6_5.3.x86_64
>> libvirt-client-0.10.2-29.el6_5.3.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
>> libvirt-python-0.10.2-29.el6_5.3.x86_64
>> qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
>> vdsm-4.13.3-3.el6.x86_64
>> vdsm-cli-4.13.3-3.el6.noarch
>> vdsm-gluster-4.13.3-3.el6.noarch
>> vdsm-python-4.13.3-3.el6.x86_64
>> vdsm-xmlrpc-4.13.3-3.el6.noarch
>>
>> [root@ovirt002 ~]# rpm -qa | egrep 'libvirt|vdsm|qemu' | sort
>> gpxe-roms-qemu-0.9.7-6.10.el6.noarch
>> libvirt-0.10.2-29.el6_5.3.x86_64
>> libvirt-client-0.10.2-29.el6_5.3.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
>> libvirt-python-0.10.2-29.el6_5.3.x86_64
>> qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
>> vdsm-4.13.3-3.el6.x86_64
>> vdsm-cli-4.13.3-3.el6.noarch
>> vdsm-gluster-4.13.3-3.el6.noarch
>> vdsm-python-4.13.3-3.el6.x86_64
>> vdsm-xmlrpc-4.13.3-3.el6.noarch
>>
>> Logs attached, thanks.
>>
>> Steve Dainard
>>
>> On Sat, Feb 15, 2014 at 6:24 AM, Dafna Ron <[email protected]> wrote:
>>
>> The migration fails in libvirt:
>>
>> Thread-153709::ERROR::2014-02-14 11:17:40,420::vm::337::vm.Vm::(run) vmId=`08434c90-ffa3-4b63-aa8e-5613f7b0e0cd`::Failed to migrate
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/vm.py", line 323, in run
>>     self._startUnderlyingMigration()
>>   File "/usr/share/vdsm/vm.py", line 403, in _startUnderlyingMigration
>>     None, maxBandwidth)
>>   File "/usr/share/vdsm/vm.py", line 841, in f
>>     ret = attr(*args, **kwargs)
>>   File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 76, in wrapper
>>     ret = f(*args, **kwargs)
>>   File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178, in migrateToURI2
>>     if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
>> libvirtError: Unable to read from monitor: Connection reset by peer
>> Thread-54041::DEBUG::2014-02-14 11:17:41,752::task::579::TaskManager.Task::(_updateState) Task=`094c412a-43dc-4c29-a601-d759486469a8`::moving from state init -> state preparing
>> Thread-54041::INFO::2014-02-14 11:17:41,753::logUtils::44::dispatcher::(wrapper) Run and protect: getVolumeSize(sdUUID='a52938f7-2cf4-4771-acb2-0c78d14999e5', spUUID='fcb89071-6cdb-4972-94d1-c9324cebf814', imgUUID='97c9108f-a506-415f-ad2c-370d707cb130', volUUID='61f82f7f-18e4-4ea8-9db3-71ddd9d4e836', options=None)
>>
>> Do you have the same libvirt/vdsm/qemu on both your hosts?
>> Please attach the libvirt and vm logs from both hosts.
>>
>> Thanks,
>> Dafna
>>
>> On 02/14/2014 04:50 PM, Steve Dainard wrote:
>>
>> Quick overview:
>> oVirt 3.3.2 running on CentOS 6.5
>> Two hosts: ovirt001, ovirt002
>> Migrating two VMs, puppet-agent1 and puppet-agent2, from ovirt002 to ovirt001.
>>
>> The first VM, puppet-agent1, migrates successfully. The second VM, puppet-agent2, fails with "Migration failed due to Error: Fatal error during migration (VM: puppet-agent2, Source: ovirt002, Destination: ovirt001)."
>>
>> I've attached the logs if anyone can help me track down the issue.
>>
>> Thanks,
>>
>> Steve Dainard
>
> --
> Dafna Ron
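Regarding the earlier question about increased logging levels: one thing I could try is turning up libvirt's own debug logging on both hosts before the next migration attempt. A rough sketch only - these are standard libvirtd.conf options, the filter list is just a guess, and vdsm may already manage some of these settings:

# /etc/libvirt/libvirtd.conf on both hosts (values below are illustrative)
log_level = 1
log_filters = "1:qemu 1:libvirt 3:event 3:util"
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"
# libvirtd needs to be restarted afterwards; on a vdsm-managed host that may mean restarting vdsmd as well.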
_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users

