[libvirt-users] Canceling a live migration via virsh? (QEMU/KVM)
I am using QEMU/KVM, using live migrations like this:

virsh migrate --live ${name} qemu+ssh://${DESTINATION}/system

My question: running this command makes it hang in the foreground. Is there a way for it to return immediately, so I can just poll for the migration status?

Also, is there a way to _cancel_ a migration? I see the --timeout option, however if a given timeout is reached I would rather have the ability to cancel the migration than force the suspend.

I do see there is a QEMU API, migrate_cancel, e.g.:

virsh qemu-monitor-command ${name} --pretty '{"execute":"migrate_cancel"}'

Is that the only way to cancel a migration using libvirt?

___
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users
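In the absence of a --detach flag, one workaround is to background the migration yourself and poll `virsh domjobinfo`. A minimal sketch (assumes virsh is on PATH; ${name} and ${DESTINATION} are the placeholders from the post):

```shell
# Start the migration in the background and hand back virsh's PID so the
# caller can tell when it has finished.
start_migration() {
    # $1 = domain name, $2 = destination host
    virsh migrate --live "$1" "qemu+ssh://$2/system" &
    echo $!
}

# Pull the "Data remaining" field out of `virsh domjobinfo` output
# (read from stdin).
job_data_remaining() {
    awk -F': *' '/^Data remaining/ { print $2 }'
}

# Usage sketch:
#   pid=$(start_migration "$name" "$DESTINATION")
#   while kill -0 "$pid" 2>/dev/null; do
#       virsh domjobinfo "$name" | job_data_remaining
#       sleep 5
#   done
```

The functions are only defined here, not run, so the script can be sourced and wired into whatever polling loop fits your tooling.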
Re: [libvirt-users] Canceling a live migration via virsh? (QEMU/KVM)
On 01/08/2014 09:44 AM, Eric Blake wrote:
> On 01/08/2014 07:46 AM, Scott Sullivan wrote:
>> I am using QEMU/KVM, using live migrations like this:
>>
>> virsh migrate --live ${name} qemu+ssh://${DESTINATION}/system
>>
>> My question: running this command makes it hang in the foreground. Is
>> there a way for this to return immediately, so I can just poll for the
>> migration status?
>
> Not at the moment, but it might be worth adding a 'migrate --detach'
> flag for that purpose, then using job control commands to track
> progress independently.
>
>> Also, is there a way to _cancel_ a migration?
>
> Hit Ctrl-C (or any other approach for sending SIGINT to virsh).
>
>> I see the --timeout option, however if a given timeout is reached I
>> would rather have the ability to cancel the migration than force the
>> suspend. I do see there is a QEMU API, migrate_cancel, e.g.:
>>
>> virsh qemu-monitor-command ${name} --pretty '{"execute":"migrate_cancel"}'
>>
>> Is that the only way to cancel a migration using libvirt?
>
> That way is unsupported. The supported way (and the way used by Ctrl-C
> during 'virsh migrate') is to call virDomainAbortJob().

Thanks Eric, very helpful response.
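The supported path Eric describes is also reachable from the shell: virsh exposes virDomainAbortJob() as `virsh domjobabort`. A sketch of a timeout-then-cancel policy (instead of --timeout's forced suspend); the domain name is a placeholder:

```shell
# True if `virsh domjobinfo` output (on stdin) reports an active job,
# i.e. a "Job type" other than "None".
job_is_active() {
    awk -F': *' '/^Job type/ { exit ($2 == "None") ? 1 : 0 }'
}

cancel_if_still_running() {
    # $1 = domain, $2 = grace period in seconds
    sleep "$2"
    if virsh domjobinfo "$1" | job_is_active; then
        virsh domjobabort "$1"   # calls virDomainAbortJob(), like Ctrl-C does
    fi
}
```

Run `cancel_if_still_running "$name" 300` in the background alongside the migration to get "cancel after 5 minutes" semantics.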
[libvirt-users] blockcopy, userspace iSCSI support?
Right now, on a virsh blockcopy, I know you can do something like this:

# Connect DEST target
iscsiadm -m node -p ${DESTINATION}:3260 -T ${VOLNAME} -o new
iscsiadm -m node -p ${DESTINATION}:3260 -T ${VOLNAME} --login

# Copy to connected iSCSI target
virsh blockcopy ${DOMAIN} vda /dev/sdc --raw --bandwidth 300

However, I have libiscsi compiled into my QEMU, so I can do this with the monitor directly (and avoid the need to call out to external iscsiadm):

virsh qemu-monitor-command ${DOMAIN} '{"execute":"drive-mirror", "arguments": { "device": "drive-virtio-disk0", "target": "iscsi://${TARGET_IP}:3260/${DOMAIN}/1", "mode": "existing", "sync": "full", "on-source-error": "stop", "on-target-error": "stop" } }'

Is there a way to use the libiscsi compiled into my QEMU with the virsh blockcopy command? I haven't been able to find any examples of using blockcopy with iSCSI compiled into QEMU.
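One avenue worth checking, not verified against the poster's build: newer libvirt releases grew a `virsh blockcopy ... --xml FILE` form where FILE holds a <disk> destination element, and that element can describe a network disk, which would let QEMU's built-in libiscsi do the copy. A sketch, with ${TARGET_IP} and ${DOMAIN} being the placeholders from the post:

```shell
# Emit a <disk> destination element describing an iSCSI network target.
make_iscsi_dest_xml() {
    # $1 = target portal host, $2 = IQN/LUN path
    cat <<EOF
<disk type='network'>
  <source protocol='iscsi' name='$2'>
    <host name='$1' port='3260'/>
  </source>
</disk>
EOF
}

# Usage sketch (requires a libvirt with blockcopy --xml support):
#   make_iscsi_dest_xml "${TARGET_IP}" "${DOMAIN}/1" > /tmp/dest.xml
#   virsh blockcopy ${DOMAIN} vda --xml /tmp/dest.xml --wait --verbose
```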
[libvirt-users] Can you verify currently defined libvirt secret provides valid Cephx auth?
As the subject suggests, I am wondering if it's possible to verify that the currently defined libvirt secret provides valid authentication via Cephx to a Ceph cluster?

I ask because ideally I would like to verify that the given Cephx credentials in my libvirt secret are valid before I even attempt the virsh attach-device on the domain. I tried searching for a solution to this, but I can't seem to find a way to _just_ check whether the currently defined libvirt secret provides valid authentication to the Ceph cluster or not.

Is this possible?
Re: [libvirt-users] Can you verify currently defined libvirt secret provides valid Cephx auth?
I know I would still have to provide the Mon addresses and Cephx user to go with the secret UUID; that's fine. The primary goal is to see if I can open a RADOS connection to the cluster before I try a virsh attach-device.

On Tue, Feb 11, 2014 at 9:37 AM, Scott Sullivan <scottgregorysulli...@gmail.com> wrote:
> As the subject suggests, I am wondering if it's possible to verify that
> the currently defined libvirt secret provides valid authentication via
> Cephx to a Ceph cluster? [...]
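One way to do this out of band, sketched below under the assumption that the ceph CLI and virsh are both available on the hypervisor (flag names are from the ceph versions I have used): pull the key back out of the libvirt secret with `virsh secret-get-value`, then attempt a cluster status call with it before ever touching attach-device.

```shell
# Cephx keys are base64; a cheap local sanity check before hitting the
# network at all.
looks_like_cephx_key() {
    printf '%s' "$1" | grep -Eq '^[A-Za-z0-9+/]+={0,2}$'
}

check_cephx() {
    # $1 = libvirt secret UUID, $2 = cephx user (e.g. libvirt), $3 = mon address
    key=$(virsh secret-get-value "$1") || return 1
    looks_like_cephx_key "$key" || return 1
    # `ceph -s` with the explicit key; this only succeeds if the
    # credentials actually authenticate against the cluster.
    ceph --id "$2" --key "$key" -m "$3" -s >/dev/null
}
```

If `check_cephx` returns 0, the same UUID should be safe to reference from the attach-device XML.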
Re: [libvirt-users] How do you force a VM reboot (when its KPd etc) without interrupting a blockcopy?
On 01/16/2015 12:32 PM, Eric Blake wrote:
> On 01/16/2015 10:21 AM, Scott Sullivan wrote:
>> My question is this: If you have an ongoing blockcopy (drive-mirror) of
>> a running transient VM, and the VM kernel panics, can you restart the
>> VM without interrupting the ongoing blockcopy?
>
> Sadly, this is not yet possible. There is work being done in upstream
> qemu to add persistent bitmap support, and then libvirt would have to be
> taught to drive it. The idea is that a persistent bitmap would allow you
> to resume a copy operation from where it left off, rather than having to
> start from scratch. But as that is not implemented yet, the best you can
> do is restart the copy from scratch.
>
> You can minimize the penalty of a copy, though, by being careful about
> what is being copied. If you take an external snapshot prior to starting
> a blockcopy, then you have a setup where the backing file is large and
> stable, while the delta qcow2 wrapper is still relatively small. Copy
> the backing file via external means, if you have something more
> efficient such as a storage pool cloner. Meanwhile, use a shallow
> blockcopy of just the qcow2 wrapper. If that has to be restarted, it is
> a much smaller file to redo; also, because it is smaller, a shallow
> blockcopy is likely to finish faster. Then, once everything is copied to
> the new location, use active blockcommit to merge the temporary wrapper
> back into the normal backing file.

Thanks for the help, Eric and Daniel. Here's what I'm seeing:

[root@test-parent-kvm-2 ~]# virsh blockjob --info f20-SPICE vda
Block Copy: [ 5 %]
[root@test-parent-kvm-2 ~]# virsh reset f20-SPICE
Domain f20-SPICE was reset
[root@test-parent-kvm-2 ~]# virsh blockjob --info f20-SPICE vda
Block Copy: [ 5 %]
[root@test-parent-kvm-2 ~]#

So as you see, it appears doing a virsh reset is not restarting the block copy progress (which would be good). I just wanted a way to force a VM to reboot if it's kernel-panicked or something during a blockcopy. It appears that is exactly what virsh reset is doing, and it also appears not to be restarting the block copy progress (based on the paste above). This is good news; thanks to both of you, Eric and Daniel, it's appreciated. Also, do feel free to correct me if I'm mistaken with the above assessment.
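The snapshot / shallow-copy / blockcommit flow Eric describes can be sketched as shell steps. The domain, disk, and path names below are hypothetical, and details such as pre-creating the destination with --reuse-external are omitted:

```shell
# Build the --diskspec argument for an external disk-only snapshot.
diskspec_external() {
    # $1 = disk target (e.g. vda), $2 = qcow2 wrapper file path
    printf '%s,snapshot=external,file=%s' "$1" "$2"
}

shallow_copy_flow() {
    dom=$1 disk=$2 wrapper=$3 dest=$4
    # 1. Freeze the big backing file behind a small qcow2 wrapper.
    virsh snapshot-create-as "$dom" --disk-only --no-metadata \
        --diskspec "$(diskspec_external "$disk" "$wrapper")"
    # 2. (Copy the now-stable backing file by external means here,
    #    e.g. a storage pool cloner.)
    # 3. Shallow blockcopy of just the wrapper; cheap to restart if the
    #    VM panics and has to be reset mid-copy.
    virsh blockcopy "$dom" "$disk" "$dest" --shallow --wait --verbose
    # 4. Merge the temporary wrapper back into the backing file.
    virsh blockcommit "$dom" "$disk" --active --pivot
}
```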
[libvirt-users] Libvirt 4.2.0 hang on destination on live migration cancel
I'm trying the following command on the source machine:

virsh migrate --live --copy-storage-all --verbose TEST qemu+ssh://10.30.76.66/system

If I ssh into the destination machine while this command is running, I can see NBD copying data as expected, and if I wait long enough it completes and succeeds.

However, if I ctrl+c the command above before it completes, it causes virsh commands (like virsh list) to just hang seemingly forever on the destination machine. Here are the libvirt debug entries when I run virsh list in this state:

[root@host ~]# export LIBVIRT_DEBUG=1
[root@host ~]# virsh list
2018-10-02 05:14:05.053+: 16207: debug : virGlobalInit:359 : register drivers
2018-10-02 05:14:05.054+: 16207: debug : virRegisterConnectDriver:657 : driver=0x7f70c1ddbb60 name=Test
2018-10-02 05:14:05.054+: 16207: debug : virRegisterConnectDriver:668 : registering Test as driver 0
2018-10-02 05:14:05.054+: 16207: debug : virRegisterConnectDriver:657 : driver=0x7f70c1ddc660 name=ESX
2018-10-02 05:14:05.054+: 16207: debug : virRegisterConnectDriver:668 : registering ESX as driver 1
2018-10-02 05:14:05.054+: 16207: debug : virRegisterConnectDriver:657 : driver=0x7f70c1ddd560 name=remote
2018-10-02 05:14:05.054+: 16207: debug : virRegisterConnectDriver:668 : registering remote as driver 2
2018-10-02 05:14:05.054+: 16207: debug : virEventRegisterDefaultImpl:280 : registering default event implementation
2018-10-02 05:14:05.054+: 16207: debug : virEventPollAddHandle:115 : Used 0 handle slots, adding at least 10 more
2018-10-02 05:14:05.054+: 16207: debug : virEventPollInterruptLocked:722 : Skip interrupt, 0 0
2018-10-02 05:14:05.054+: 16207: info : virEventPollAddHandle:140 : EVENT_POLL_ADD_HANDLE: watch=1 fd=4 events=1 cb=0x7f70c18791c0 opaque=(nil) ff=(nil)
2018-10-02 05:14:05.054+: 16207: debug : virEventRegisterImpl:241 : addHandle=0x7f70c187a8a0 updateHandle=0x7f70c1879500 removeHandle=0x7f70c1879360 addTimeout=0x7f70c187a660 updateTimeout=0x7f70c1879690 removeTimeout=0x7f70c1879220
2018-10-02 05:14:05.054+: 16207: debug : virEventPollAddTimeout:230 : Used 0 timeout slots, adding at least 10 more
2018-10-02 05:14:05.054+: 16207: debug : virEventPollInterruptLocked:722 : Skip interrupt, 0 0
2018-10-02 05:14:05.054+: 16207: info : virEventPollAddTimeout:253 : EVENT_POLL_ADD_TIMEOUT: timer=1 frequency=-1 cb=0x5559f8dba380 opaque=0x7ffe5b15e700 ff=(nil)
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenAuth:1218 : name=, auth=0x7f70c1ddea00, flags=0x0
2018-10-02 05:14:05.054+: 16207: info : virObjectNew:254 : OBJECT_NEW: obj=0x5559fa83a3f0 classname=virConnect
2018-10-02 05:14:05.054+: 16208: debug : virThreadJobSet:99 : Thread 16208 is now running job vshEventLoop
2018-10-02 05:14:05.054+: 16208: debug : virEventRunDefaultImpl:324 : running default event implementation
2018-10-02 05:14:05.054+: 16207: debug : virConfLoadConfig:1576 : Loading config file '/etc/libvirt/libvirt.conf'
2018-10-02 05:14:05.054+: 16208: debug : virEventPollCleanupTimeouts:525 : Cleanup 1
2018-10-02 05:14:05.054+: 16208: debug : virEventPollCleanupHandles:574 : Cleanup 1
2018-10-02 05:14:05.054+: 16207: debug : virConfReadFile:752 : filename=/etc/libvirt/libvirt.conf
2018-10-02 05:14:05.054+: 16208: debug : virEventPollMakePollFDs:401 : Prepare n=0 w=1, f=4 e=1 d=0
2018-10-02 05:14:05.054+: 16208: debug : virEventPollCalculateTimeout:338 : Calculate expiry of 1 timers
2018-10-02 05:14:05.054+: 16208: debug : virEventPollCalculateTimeout:371 : No timeout is pending
2018-10-02 05:14:05.054+: 16208: info : virEventPollRunOnce:640 : EVENT_POLL_RUN: nhandles=1 timeout=-1
2018-10-02 05:14:05.054+: 16207: debug : virFileClose:110 : Closed fd 6
2018-10-02 05:14:05.054+: 16207: debug : virConfGetValueString:897 : Get value string (nil) 0
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenInternal:1013 : no name, allowing driver auto-select
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenInternal:1056 : trying driver 0 (Test) ...
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenInternal:1071 : driver 0 Test returned DECLINED
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenInternal:1056 : trying driver 1 (ESX) ...
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenInternal:1071 : driver 1 ESX returned DECLINED
2018-10-02 05:14:05.054+: 16207: debug : virConnectOpenInternal:1056 : trying driver 2 (remote) ...
2018-10-02 05:14:05.054+: 16207: debug : remoteConnectOpen:1350 : Auto-probe remote URI
2018-10-02 05:14:05.054+: 16207: debug : doRemoteOpen:916 : proceeding with name =
2018-10-02 05:14:05.054+: 16207: debug : doRemoteOpen:925 : Connecting with transport 1
2018-10-02 05:14:05.054+: 16207: debug : doRemoteOpen:1060 : Proceeding with sockname /var/run/libvirt/libvirt-sock
2018-10-02 05:14:05.054+: 16207: debug :
Re: [libvirt-users] Libvirt 4.2.0 hang on destination on live migration cancel
For anyone who might find this thread from online searching, I was able to determine that this commit fixes the above issue:

https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=dddcb601ebf97ef222a03bb27b2357e831e8a0cc

On Tue, Oct 2, 2018 at 8:42 AM Scott Sullivan <scottgregorysulli...@gmail.com> wrote:
> I'm trying the following command on the source machine:
>
> virsh migrate --live --copy-storage-all --verbose TEST qemu+ssh://10.30.76.66/system
>
> If I ssh into the destination machine when this command is running, I can
> see NBD copying data as expected, and if I wait long enough it completes
> and succeeds.
>
> However if I ctrl+c this command above before it completes, it causes
> virsh commands (like virsh list) to just hang seemingly forever on the
> destination machine. [...]
[libvirt-users] Intermittent live migration hang with ceph RBD attached volume
Software in use:

Source hypervisor:
  Qemu: stable-2.12 branch
  Libvirt: v3.2-maint branch
  OS: CentOS 6

Destination hypervisor:
  Qemu: stable-2.12 branch
  Libvirt: v4.9-maint branch
  OS: CentOS 7

I'm experiencing an intermittent live migration hang of a virtual machine (KVM) with a ceph RBD volume attached. At a high level, what I see is that when this does happen, the virtual machine is left in a paused state (per virsh list) on both source and destination hypervisors indefinitely.

Here's the virsh command I am running on the source (where 10.30.76.66 is the destination hypervisor):

virsh migrate --live --copy-storage-all --verbose --xml /root/live_migration.cfg test_vm qemu+ssh://10.30.76.66/system tcp://10.30.76.66

Here it is in "ps faux" while it's in the hung state:

root 10997 0.3 0.0 380632 6156 ? Sl 12:24 0:26 \_ virsh migrate --live --copy-storage-all --verbose --xml /root/live_migration.cfg test_vm qemu+ssh://10.30.76.66/sys
root 10999 0.0 0.0 60024 4044 ? S 12:24 0:00 \_ ssh 10.30.76.66 sh -c 'if 'nc' -q 2>&1 | grep "requires an argument" >/dev/null 2>&1; then ARG=-q0;else ARG=;fi;'nc' $ARG -U

The only reason I'm using the --xml arg is so the auth information can be updated for the new hypervisor (I set up a cephx user for each hypervisor). Below is a diff between my normal xml config and the one I passed in the --xml arg to illustrate:

60,61c60,61
<
---
>
> uuid="72e9373d-7101-4a93-a7d2-6cce5ec1e6f1"/>

The libvirt secret as shown above is properly set up with good credentials on both source and destination hypervisors. When this happens, I don't see anything logged on the destination hypervisor in the libvirt log. However, in the source hypervisor's log, I do see this:

2019-06-21 12:38:21.004+: 28400: warning : qemuDomainObjEnterMonitorInternal:3764 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous

But nothing else is logged in the libvirt log on either source or destination. The actual `virsh migrate --live` command pasted above still runs while stuck in this state, and it just outputs "Migration: [100 %]" over and over.

If I strace the qemu process on the source, I see this over and over:

ppoll([{fd=9, events=POLLIN}, {fd=8, events=POLLIN}, {fd=4, events=POLLIN}, {fd=6, events=POLLIN}, {fd=15, events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=35, events=0}, {fd=35, events=POLLIN}], 9, {0, 14960491}, NULL, 8) = 0 (Timeout)

Here are those fds:

[root@source ~]# ll /proc/31804/fd/{8,4,6,15,18,19,35}
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/15 -> socket:[931291]
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/18 -> socket:[931295]
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/19 -> socket:[931297]
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/35 -> socket:[931306]
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/4 -> [signalfd]
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/6 -> [eventfd]
lrwx-- 1 qemu qemu 64 Jun 21 13:18 /proc/31804/fd/8 -> [eventfd]
[root@source ~]#

[root@source ~]# grep -E '(931291|931295|931297|931306)' /proc/net/tcp
3: :170C : 0A : 00: 1070 931295 1 88043a27f840 99 0 0 10 -1
4: :170D : 0A : 00: 1070 931297 1 88043a27f140 99 0 0 10 -1
[root@source ~]#

Further, on the source, if I query the blockjob status, it says no blockjob is running:

[root@source ~]# virsh list
 Id    Name      State
 11    test_vm   paused
[root@source ~]# virsh blockjob 11 vda
No current block job for vda
[root@source ~]#

and that nc/ssh connection is still OK in the hung state:

[root@source ~]# netstat -tuapn | grep \.66
tcp 0 0 10.30.76.48:48876 10.30.76.66:22 ESTABLISHED 10999/ssh
[root@source ~]#
root 10999 0.0 0.0 60024 4044 ? S 12:24 0:00 \_ ssh 10.30.76.66 sh -c 'if 'nc' -q 2>&1 | grep "requires an argument" >/dev/null 2>&1; then ARG=-q0;else ARG=;fi;'nc' $ARG -U /var/run/libvirt/libvirt-sock'

Here's the state of the migration on the source while it's stuck like this:

[root@source ~]# virsh qemu-monitor-command 11 '{"execute":"query-migrate"}'
{"return":{"status":"completed","setup-time":2,"downtime":2451,"total-time":3753,"ram":{"total":2114785280,"postcopy-requests":0,"dirty-sync-count":3,"page-size":4096,"remaining":0,"mbps":898.199209,"transferred":421345514,"duplicate":414940,"dirty-pages-rate":0,"skipped":0,"normal-bytes":416796672,"normal":101757}},"id":"libvirt-317"}
[root@source ~]#

I'm unable to run the above command on the