Re: [pve-devel] [PATCH qemu-server 3/4] move finish_tunnel to after the VM is stopped

2020-03-11 Thread Fabian Grünbichler
On March 11, 2020 2:26 pm, Fabian Grünbichler wrote:
> On March 11, 2020 11:44 am, Mira Limbeck wrote:
>> The NBD socket on the client side is kept open until the VM stops. In
>> the case of TCP this was not a problem, because we did not rely on it
>> closing. But with unix socket forwarding via SSH the SSH connection
>> can't close if the socket is still open.
>> 
>> If it does not close by itself, it will be terminated after 30 seconds.
> 
> in which combinations did you test this? it prints pretty ugly warnings 
> here for a VM with a single, local disk:

okay, so this seems to be caused by some changes related to 4.2, which 
are just exposed more easily by this patch set. anyway, the following PoC 
seems to get rid of them, and actually seems like a good idea in general 
(we want to cancel the ready block jobs and not switch over to the 
target/NBD disks!), but there might be some more error handling required 
to get it right in all cases (and a better name for the parameter ;)).

we could check all call sites and maybe change $skipcomplete to be an 
enum?

0 => wait until ready, complete
1 => wait until ready, return
2 => wait until ready, cancel

with a possible re-ordering in case migration is the only thing setting 
skipcomplete currently ;)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index a52746c..f42dee1 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -990,7 +990,7 @@ sub phase3_cleanup {
 
 if ($self->{storage_migration}) {
# finish block-job
-   eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}); };
+   eval { PVE::QemuServer::qemu_drive_mirror_monitor($vmid, undef, $self->{storage_migration_jobs}, 0, 0, 1); };
 
	if (my $err = $@) {
	    eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $self->{storage_migration_jobs}) };
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 3d3505c..3c93a0a 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6356,7 +6356,7 @@ sub qemu_drive_mirror {
 }
 
 sub qemu_drive_mirror_monitor {
-my ($vmid, $vmiddst, $jobs, $skipcomplete, $qga) = @_;
+my ($vmid, $vmiddst, $jobs, $skipcomplete, $qga, $cancelcomplete) = @_;
 
 eval {
my $err_complete = 0;
@@ -6426,12 +6426,13 @@ sub qemu_drive_mirror_monitor {
 
last;
} else {
+   $cancelcomplete //= 0;
 
	    foreach my $job (keys %$jobs) {
		# try to switch the disk if source and destination are on the same guest
		print "$job: Completing block job...\n";

-		eval { mon_cmd($vmid, "block-job-complete", device => $job) };
+		eval { mon_cmd($vmid, $cancelcomplete ? "block-job-cancel" : "block-job-complete", device => $job) };
		if ($@ =~ m/cannot be completed/) {
		    print "$job: Block job cannot be completed, try again.\n";
		    $err_complete++;

> 
> new to old (TCP, but with stop before tunnel close):
> 
> 2020-03-11 12:52:21 migration status: completed
> drive-scsi0: transferred: 17179869184 bytes remaining: 0 bytes total: 17179869184 bytes progression: 100.00 % busy: 0 ready: 1
> all mirroring jobs are ready
> drive-scsi0: Completing block job...
> drive-scsi0: Completed successfully.
> drive-scsi0 : finished
> VM quit/powerdown failed - terminating now with SIGTERM
> VM still running - terminating now with SIGKILL
> 2020-03-11 12:52:38 migration finished successfully (duration 00:00:46)
> 
> I haven't checked/verified, but I think the problem is that the source 
> VM still has the NBD connection open, and when you call 'quit', it will 
> attempt to flush the NBD connection and block on that (I ran into a 
> similar problem with my websocket tunnel).
> 
> new to new (UNIX, stop before tunnel close):
> 
> 2020-03-11 12:51:25 migration status: completed
> drive-scsi0: transferred: 17179869184 bytes remaining: 0 bytes total: 17179869184 bytes progression: 100.00 % busy: 0 ready: 1
> all mirroring jobs are ready
> drive-scsi0: Completing block job...
> drive-scsi0: Completed successfully.
> drive-scsi0 : finished
> channel 4: open failed: connect failed: open failed
> channel 3: open failed: connect failed: open failed
> channel 3: open failed: connect failed: open failed
> channel 3: open failed: connect failed: open failed
> channel 3: open failed: connect failed: open failed
> 2020-03-11 12:51:43 migration finished successfully (duration 00:01:22)
> 
> if I interpret the strace correctly, here's what happens:
> 
> 1. nbd connection gets closed from remote end (not directly visible in 
> strace, since that happens with another ssh call)
> 2. local VM connects again to the UNIX socket (it should be paused at 
> this moment?)
> 3. ssh accept()s connection and attempts to forward it
> 4. other end does not connect
> 5. error is returned to source node
> 
> 

Re: [pve-devel] [PATCH qemu-server 3/4] move finish_tunnel to after the VM is stopped

2020-03-11 Thread Fabian Grünbichler
On March 11, 2020 11:44 am, Mira Limbeck wrote:
> The NBD socket on the client side is kept open until the VM stops. In
> the case of TCP this was not a problem, because we did not rely on it
> closing. But with unix socket forwarding via SSH the SSH connection
> can't close if the socket is still open.
> 
> If it does not close by itself, it will be terminated after 30 seconds.

in which combinations did you test this? it prints pretty ugly warnings 
here for a VM with a single, local disk:

new to old (TCP, but with stop before tunnel close):

2020-03-11 12:52:21 migration status: completed
drive-scsi0: transferred: 17179869184 bytes remaining: 0 bytes total: 17179869184 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0 : finished
VM quit/powerdown failed - terminating now with SIGTERM
VM still running - terminating now with SIGKILL
2020-03-11 12:52:38 migration finished successfully (duration 00:00:46)

I haven't checked/verified, but I think the problem is that the source 
VM still has the NBD connection open, and when you call 'quit', it will 
attempt to flush the NBD connection and block on that (I ran into a 
similar problem with my websocket tunnel).

new to new (UNIX, stop before tunnel close):

2020-03-11 12:51:25 migration status: completed
drive-scsi0: transferred: 17179869184 bytes remaining: 0 bytes total: 17179869184 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0 : finished
channel 4: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
2020-03-11 12:51:43 migration finished successfully (duration 00:01:22)

if I interpret the strace correctly, here's what happens:

1. nbd connection gets closed from remote end (not directly visible in 
strace, since that happens with another ssh call)
2. local VM connects again to the UNIX socket (it should be paused at 
this moment?)
3. ssh accept()s connection and attempts to forward it
4. other end does not connect
5. error is returned to source node

3 is the SSH control channel, 4 and 5 are the UNIX sockets on the source 
node (nbd and memory), 6, 7, 8 are STDIN, STDOUT, STDERR, and 9 and 10 are 
the accept()ed client connections for the UNIX sockets (nbd and memory). 
note the shutdown and close for 10 and 9, followed by another accept on 
4 that then fails to connect onward, followed by the resume command via 
the tunnel and its reply.

select(11, [3 4 5 6 9 10], [3], NULL, {tv_sec=300, tv_usec=0}) = 2 (in [10], out [3], left {tv_sec=299, tv_usec=98})
read(10, "ffer\", \"size\" : 21 }, { \"name\" :"..., 16384) = 3100
write(3, "\0\0@\20\337\271Q\240\6DI\267]u\301\330\345\2632\341j\240\223\341Z\202\350r\220\261Q\304"..., 16412) = 16412
getpid()                                = 1540056
select(11, [3 4 5 6 9 10], [3], NULL, {tv_sec=300, tv_usec=0}) = 2 (in [10], out [3], left {tv_sec=299, tv_usec=98})
read(10, "", 16384)                     = 0
shutdown(10, SHUT_RD)                   = 0
write(3, "\0\0\f0]\335J\351\327A\343\215\f\244x\35\227\362!\253u\225\222xSQ\312%f\202G\270"..., 3132) = 3132
getpid()                                = 1540056
select(11, [3 4 5 6 9], [3], NULL, {tv_sec=300, tv_usec=0}) = 1 (out [3], left {tv_sec=299, tv_usec=98})
write(3, "\0\0\0\20O\32\347\372\227'\177\6\340\353\314\317X\10=\245\323\266nC]\341\272\376", 28) = 28
select(11, [3 4 5 6 9], [], NULL, {tv_sec=300, tv_usec=0}) = 1 (in [3], left {tv_sec=299, tv_usec=999404})
read(3, "\0\0\0\20\346Ra\334\n\354\252\304\230\340\3\270\177\260;\275\373\353\221\"\7\234\270\221", 8192) = 28
select(11, [3 4 5 6 9], [], NULL, {tv_sec=300, tv_usec=0}) = 1 (in [3], left {tv_sec=299, tv_usec=997529})
read(3, "\0\0\0\20\370u2]) = 10
fcntl(10, F_SETFD, FD_CLOEXEC)          = 0
fcntl(10, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
fcntl(10, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
getsockname(10, {sa_family=AF_UNIX, sun_path="/run/qemu-server/12341234_nbd.migrate"}, [128->40]) = 0
getsockname(10, {sa_family=AF_UNIX, sun_path="/run/qemu-server/12341234_nbd.migrate"}, [128->40]) = 0
getpeername(10, {sa_family=AF_UNIX}, [128->2]) = 0
getpeername(10, {sa_family=AF_UNIX}, [128->2]) = 0
getpid()                                = 1540056
write(3, "\0\0\0\20\371\31\34\304\377\210\22\213\356~\323h\367\217\3250ZK\354\31\21@o\257\0\0\0\20"..., 180) = 180
select(11, [3 4 5 6], [], NULL, {tv_sec=300, tv_usec=0}) = 1 (in [3], left {tv_sec=299, tv_usec=999600})
read(3, "\0\0\\226:m8\370!\251\32\224\274\v\375\267\221-\226`\277\271\206\350\313\324\312\252?\260\364"..., 8192) = 88
write(2, "channel 4: 

[pve-devel] [PATCH qemu-server 3/4] move finish_tunnel to after the VM is stopped

2020-03-11 Thread Mira Limbeck
The NBD socket on the client side is kept open until the VM stops. In
the case of TCP this was not a problem, because we did not rely on it
closing. But with unix socket forwarding via SSH the SSH connection
can't close if the socket is still open.

If it does not close by itself, it will be terminated after 30 seconds.

Signed-off-by: Mira Limbeck 
---
 PVE/QemuMigrate.pm | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 44e4c57..e569a2c 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -1035,15 +1035,6 @@ sub phase3_cleanup {
}
 }
 
-# close tunnel on successful migration, on error phase2_cleanup closed it
-if ($tunnel) {
-   eval { finish_tunnel($self, $tunnel);  };
-   if (my $err = $@) {
-   $self->log('err', $err);
-   $self->{errors} = 1;
-   }
-}
-
 eval {
my $timer = 0;
	if (PVE::QemuServer::vga_conf_has_spice($conf->{vga}) && $self->{running}) {
@@ -1065,6 +1056,15 @@ sub phase3_cleanup {
$self->{errors} = 1;
 }
 
+# close tunnel on successful migration, on error phase2_cleanup closed it
+if ($tunnel) {
+   eval { finish_tunnel($self, $tunnel);  };
+   if (my $err = $@) {
+   $self->log('err', $err);
+   $self->{errors} = 1;
+   }
+}
+
 # always deactivate volumes - avoid lvm LVs to be active on several nodes
 eval {
my $vollist = PVE::QemuServer::get_vm_volumes($conf);
-- 
2.20.1


___
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel