Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Daniel P. Berrangé
On Mon, Aug 19, 2019 at 12:00:15PM +0100, Peter Maydell wrote:
> On Mon, 19 Aug 2019 at 11:50, Daniel P. Berrangé  wrote:
> > I don't think we want to expose this in the QAPI schema for the socket
> > address, since the correct value is really something that QEMU should
> > figure out based on usage context.
> >
> > Thus, I think we'll have to make it an explicit parameter to the
> > qio_channel_socket_listen_{sync,async} APIs, and socket_listen()
> > and inet_listen_saddr(), etc. Then the migration code can pass in
> > a sensible value based on multifd usage.
> 
> How bad would it be if we just passed SOMAXCONN for the backlog
> value always?

I'm not 100% clear to be honest, but my feeling is that this will waste
resources for much of QEMU usage where we only ever accept a single
client connection on an incoming socket.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Peter Maydell
On Mon, 19 Aug 2019 at 11:50, Daniel P. Berrangé  wrote:
> I don't think we want to expose this in the QAPI schema for the socket
> address, since the correct value is really something that QEMU should
> figure out based on usage context.
>
> Thus, I think we'll have to make it an explicit parameter to the
> qio_channel_socket_listen_{sync,async} APIs, and socket_listen()
> and inet_listen_saddr(), etc. Then the migration code can pass in
> a sensible value based on multifd usage.

How bad would it be if we just passed SOMAXCONN for the backlog
value always?
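(For illustration only, that would amount to something like the following
at the listen() call in inet_listen_saddr(); SOMAXCONN comes from
<sys/socket.h>, and the kernel silently caps any requested backlog at
net.core.somaxconn:)

    /* illustrative sketch, not a proposal: request the largest backlog
     * the system advertises instead of hardcoding 1 */
    if (!listen(slisten, SOMAXCONN)) {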

thanks
-- PMM



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Juan Quintela
Daniel P. Berrangé  wrote:
> On Mon, Aug 19, 2019 at 12:46:20PM +0200, Juan Quintela wrote:
>> Daniel P. Berrangé  wrote:
>> > On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
>> >> When we have lots of channels, sometimes multifd migration fails
>> >> with the following error:
>> >> 
>> >> Any good ideas?
>> >
>> > In inet_listen_saddr() we call
>> >
>> > if (!listen(slisten, 1)) {
>> >
>> > note the second parameter sets the socket backlog, which is the max
>> > number of pending socket connections we allow. My guess is that the
>> > target QEMU is not accepting incoming connections quickly enough and
>> > thus you hit the limit & the kernel starts dropping the incoming
>> > connections.
>> >
>> > As a quick test, just hack this code to pass a value of 100 and see
>> > if it makes your test reliable. If it does, then we'll need to figure
>> > out a nice way to handle backlog instead of hardcoding it at 1.
>> 
>> Nice.
>> 
>> With this change I can create 100 channels on a 4 core machine without
>> any trouble.
>> 
>> How can we proceed from here?
>
> I don't think we want to expose this in the QAPI schema for the socket
> address, since the correct value is really something that QEMU should
> figure out based on usage context.
>
> Thus, I think we'll have to make it an explicit parameter to the
> qio_channel_socket_listen_{sync,async} APIs, and socket_listen()
> and inet_listen_saddr(), etc. Then the migration code can pass in
> a sensible value based on multifd usage.

ok with me.  I will give it a try.

Thanks for the tip.

Later, Juan.



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Daniel P. Berrangé
On Mon, Aug 19, 2019 at 12:46:20PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé  wrote:
> > On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
> >> When we have lots of channels, sometimes multifd migration fails
> >> with the following error:
> >> 
> >> Any good ideas?
> >
> > In inet_listen_saddr() we call
> >
> > if (!listen(slisten, 1)) {
> >
> > note the second parameter sets the socket backlog, which is the max
> > number of pending socket connections we allow. My guess is that the
> > target QEMU is not accepting incoming connections quickly enough and
> > thus you hit the limit & the kernel starts dropping the incoming
> > connections.
> >
> > As a quick test, just hack this code to pass a value of 100 and see
> > if it makes your test reliable. If it does, then we'll need to figure
> > out a nice way to handle backlog instead of hardcoding it at 1.
> 
> Nice.
> 
> With this change I can create 100 channels on a 4 core machine without
> any trouble.
> 
> How can we proceed from here?

I don't think we want to expose this in the QAPI schema for the socket
address, since the correct value is really something that QEMU should
figure out based on usage context.

Thus, I think we'll have to make it an explicit parameter to the
qio_channel_socket_listen_{sync,async} APIs, and socket_listen()
and inet_listen_saddr(), etc. Then the migration code can pass in
a sensible value based on multifd usage.
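A rough sketch of what that plumbing could look like (the prototypes and
the migration-side value below are illustrative, not a settled API):

    /* illustrative: thread a "num" backlog argument down the stack,
     * instead of hardcoding 1 inside inet_listen_saddr() */
    int socket_listen(SocketAddress *addr, int num, Error **errp);
    int qio_channel_socket_listen_sync(QIOChannelSocket *ioc,
                                       SocketAddress *addr,
                                       int num, Error **errp);

    /* most callers would keep num == 1; multifd migration could pass
     * something derived from its channel count, e.g.
     * migrate_multifd_channels() + 1 */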

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Juan Quintela
Daniel P. Berrangé  wrote:
> On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
>> When we have lots of channels, sometimes multifd migration fails
>> with the following error:
>> 
>> Any good ideas?
>
> In inet_listen_saddr() we call
>
> if (!listen(slisten, 1)) {
>
> note the second parameter sets the socket backlog, which is the max
> number of pending socket connections we allow. My guess is that the
> target QEMU is not accepting incoming connections quickly enough and
> thus you hit the limit & the kernel starts dropping the incoming
> connections.
>
> As a quick test, just hack this code to pass a value of 100 and see
> if it makes your test reliable. If it does, then we'll need to figure
> out a nice way to handle backlog instead of hardcoding it at 1.

Nice.

With this change I can create 100 channels on a 4 core machine without
any trouble.

How can we proceed from here?

Thanks, Juan.



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Juan Quintela
Daniel P. Berrangé  wrote:
> On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
>> When we have lots of channels, sometimes multifd migration fails
>> with the following error:
>> 
>> Any good ideas?
>
> In inet_listen_saddr() we call
>
> if (!listen(slisten, 1)) {
>
> note the second parameter sets the socket backlog, which is the max
> number of pending socket connections we allow. My guess is that the
> target QEMU is not accepting incoming connections quickly enough and
> thus you hit the limit & the kernel starts dropping the incoming
> connections.
>
> As a quick test, just hack this code to pass a value of 100 and see
> if it makes your test reliable. If it does, then we'll need to figure
> out a nice way to handle backlog instead of hardcoding it at 1.

Nice.

With this change I can create 100 channels on a 4 core machine without
any trouble (5 tries so far).  Previously, with 10 channels on the same
machine it failed around 50% of the time, and with 100 channels it failed
100% of the time.

How can we proceed from here?

Thanks, Juan.



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Daniel P. Berrangé
On Mon, Aug 19, 2019 at 12:33:45PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé  wrote:
> > On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
> >> When we have lots of channels, sometimes multifd migration fails
> >> with the following error:
> >>   after some time, the sending side decides to send another packet
> >>   through that channel, and that is when we get the above error.
> >> 
> >> Any good ideas?
> >
> > In inet_listen_saddr() we call
> >
> > if (!listen(slisten, 1)) {
> >
> > note the second parameter sets the socket backlog, which is the max
> > number of pending socket connections we allow. My guess is that the
> > target QEMU is not accepting incoming connections quickly enough and
> > thus you hit the limit & the kernel starts dropping the incoming
> > connections.
> >
> > As a quick test, just hack this code to pass a value of 100 and see
> > if it makes your test reliable. If it does, then we'll need to figure
> > out a nice way to handle backlog instead of hardcoding it at 1.
> 
> I will test.
> 
> But notice that the qemu_connect() on the source side says that things
> went right.  It is the destination that is *not* calling the callback.
> Or at least that is what I think is happening.

IIRC, the connect() can succeed on the source host, even if the target host
has not called accept(), because the kernel will complete the connection at
the protocol level regardless. IOW, don't assume that the destination QEMU
has seen the connection. If you turn on tracing, we have a trace point
"qio_channel_socket_accept_complete" that is emitted after we have done
an accept(). So you can see if you get 100 of those trace points emitted
on the target, when you make 100 connects on the source.
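As a concrete way to run that check, reusing the -trace events=<file> form
already shown in the command line of the original posting at the bottom of
this thread (the file names here are placeholders, and where the trace
output ends up depends on the trace backend QEMU was built with):

    # on the destination host
    echo qio_channel_socket_accept_complete > /tmp/events
    # start the destination QEMU with: -trace events=/tmp/events
    # after driving the migration, count the accepted connections, e.g.
    grep -c qio_channel_socket_accept_complete destination-qemu.log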

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Juan Quintela
Daniel P. Berrangé  wrote:
> On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
>> When we have lots of channels, sometimes multifd migration fails
>> with the following error:
>>   after some time, the sending side decides to send another packet
>>   through that channel, and that is when we get the above error.
>> 
>> Any good ideas?
>
> In inet_listen_saddr() we call
>
> if (!listen(slisten, 1)) {
>
> note the second parameter sets the socket backlog, which is the max
> number of pending socket connections we allow. My guess is that the
> target QEMU is not accepting incoming connections quickly enough and
> thus you hit the limit & the kernel starts dropping the incoming
> connections.
>
> As a quick test, just hack this code to pass a value of 100 and see
> if it makes your test reliable. If it does, then we'll need to figure
> out a nice way to handle backlog instead of hardcoding it at 1.

I will test.

But notice that the qemu_connect() on the source side says that things
went right.  It is the destination that is *not* calling the callback.
Or at least that is what I think is happening.

Later, Juan.



Re: [Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-19 Thread Daniel P. Berrangé
On Wed, Aug 14, 2019 at 04:02:18AM +0200, Juan Quintela wrote:
> When we have lots of channels, sometimes multifd migration fails
> with the following error:
> 
> (qemu) migrate -d tcp:0:
> (qemu) qemu-system-x86_64: multifd_send_pages: channel 17 has already quit!
> qemu-system-x86_64: multifd_send_pages: channel 17 has already quit!
> qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
> qemu-system-x86_64: Unable to write to socket: Connection reset by peer
> info migrate
> globals:
> store-global-state: on
> only-migratable: off
> send-configuration: on
> send-section-footer: on
> decompress-error-check: on
> clear-bitmap-shift: 18
> capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: 
> off compress: off events: off postcopy-ram: off x-colo: off release-ram: off 
> block: off return-path: off pause-before-switchover: off multifd: on 
> dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off 
> x-ignore-shared: off
> Migration status: failed (Unable to write to socket: Connection reset by peer)
> total time: 0 milliseconds
> 
> In this particular example I am using 100 channels.  The bigger the
> number of channels, the easier it is to reproduce.  That doesn't mean
> that it is a good idea to use so many channels.
> 
> With the previous patches in this series, I can run "reliably" on my
> hardware with up to 10 channels.  Most of the time.  Until it fails.
> With 100 channels, it fails almost always.
> 
> I thought that the problem was on the send side, so I tried to debug
> there.  As you can see from the delay, if you add any
> printf()/error_report/trace, the error goes away; it is very timing
> sensitive.  With a delay of 1 microsecond, it only works sometimes.
> 
> What have I discovered so far:
> 
> - send side calls qemu_socket() on all the channels.  So it appears
>   that they get created correctly.
> - on the destination side, it appears that "somehow" some of the
>   connections are lost by the listener.  This error happens when the
>   destination side socket hasn't been "accepted", and it is not
>   properly created.  As far as I can see, we have several options:
> 
>   1- I don't know how to use qio asynchronously properly
>      (this is one big possibility).
> 
>   2- glib has a bug in this case?  Or the way the qio listener is
>      implemented on top of glib does.  I put in lots of printf() and
>      other instrumentation, and it appears that the listener io_func
>      is not called at all for the connections that are missing.
> 
>   3- it is always possible that we are missing some g_main_loop_run()
>      somewhere.  Notice how tests/test-io-channel-socket.c calls it
>      "creatively".
> 
>   4- It is entirely possible that I should be using the sockets as
>      blocking instead of non-blocking.  But I am not sure about that
>      one yet.
> 
> - on the sending side, what happens is:
> 
>   eventually it calls socket_connect() after all the async dance with
>   thread creation, etc.  The source side creates all the channels; it
>   is the destination side which is missing some of them.
> 
>   the sending side sends the first packet over that channel; it
>   "succeeds" and doesn't give any error.
> 
>   after some time, the sending side decides to send another packet
>   through that channel, and that is when we get the above error.
> 
> Any good ideas?

In inet_listen_saddr() we call

if (!listen(slisten, 1)) {

note the second parameter sets the socket backlog, which is the max
number of pending socket connections we allow. My guess is that the
target QEMU is not accepting incoming connections quickly enough and
thus you hit the limit & the kernel starts dropping the incoming
connections.

As a quick test, just hack this code to pass a value of 100 and see
if it makes your test reliable. If it does, then we'll need to figure
out a nice way to handle backlog instead of hardcoding it at 1.
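A minimal sketch of that quick test (experimental only, not a proposed
fix; inet_listen_saddr() lives in util/qemu-sockets.c):

    /* in inet_listen_saddr(): bump the hardcoded backlog of 1 to see
     * whether the dropped connections come from backlog overflow */
    if (!listen(slisten, 100)) {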


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



[Qemu-devel] [PATCH 6/6] RFH: We lost "connect" events

2019-08-13 Thread Juan Quintela
When we have lots of channels, sometimes multifd migration fails
with the following error:

(qemu) migrate -d tcp:0:
(qemu) qemu-system-x86_64: multifd_send_pages: channel 17 has already quit!
qemu-system-x86_64: multifd_send_pages: channel 17 has already quit!
qemu-system-x86_64: multifd_send_sync_main: multifd_send_pages fail
qemu-system-x86_64: Unable to write to socket: Connection reset by peer
info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off 
compress: off events: off postcopy-ram: off x-colo: off release-ram: off block: 
off return-path: off pause-before-switchover: off multifd: on dirty-bitmaps: 
off postcopy-blocktime: off late-block-activate: off x-ignore-shared: off
Migration status: failed (Unable to write to socket: Connection reset by peer)
total time: 0 milliseconds

In this particular example I am using 100 channels.  The bigger the
number of channels, the easier it is to reproduce.  That doesn't mean
that it is a good idea to use so many channels.

With the previous patches in this series, I can run "reliably" on my
hardware with up to 10 channels.  Most of the time.  Until it fails.
With 100 channels, it fails almost always.

I thought that the problem was on the send side, so I tried to debug
there.  As you can see from the delay, if you add any
printf()/error_report/trace, the error goes away; it is very timing
sensitive.  With a delay of 1 microsecond, it only works sometimes.

What have I discovered so far:

- send side calls qemu_socket() on all the channels.  So it appears
  that they get created correctly.
- on the destination side, it appears that "somehow" some of the
  connections are lost by the listener.  This error happens when the
  destination side socket hasn't been "accepted", and it is not
  properly created.  As far as I can see, we have several options:

  1- I don't know how to use qio asynchronously properly
     (this is one big possibility).

  2- glib has a bug in this case?  Or the way the qio listener is
     implemented on top of glib does.  I put in lots of printf() and
     other instrumentation, and it appears that the listener io_func
     is not called at all for the connections that are missing.

  3- it is always possible that we are missing some g_main_loop_run()
     somewhere.  Notice how tests/test-io-channel-socket.c calls it
     "creatively".

  4- It is entirely possible that I should be using the sockets as
     blocking instead of non-blocking.  But I am not sure about that
     one yet.

- on the sending side, what happens is:

  eventually it calls socket_connect() after all the async dance with
  thread creation, etc.  The source side creates all the channels; it
  is the destination side which is missing some of them.

  the sending side sends the first packet over that channel; it
  "succeeds" and doesn't give any error.

  after some time, the sending side decides to send another packet
  through that channel, and that is when we get the above error.

Any good ideas?

Later, Juan.

PS: The command line used is attached.

Important bits:
- multifd is set
- multifd-channels is set to 100

/scratch/qemu/fail/x64/x86_64-softmmu/qemu-system-x86_64 -M
pc-i440fx-3.1,accel=kvm,usb=off,vmport=off,nvdimm -L
/mnt/code/qemu/check/pc-bios/ -smp 2 -name t1,debug-threads=on -m 3G
-uuid 113100f9-6c99-4a7a-9b78-eb1c088d1087 -monitor stdio -boot
strict=on -drive
file=/mnt/images/test.img,format=qcow2,if=none,id=disk0 -device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=disk0,id=virtio-disk0,bootindex=1
-netdev tap,id=hostnet0,script=/etc/kvm-ifup,downscript= -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:9d:10:51,bus=pci.0,addr=0x3
-serial pty -parallel none -usb -device usb-tablet -k es -vga cirrus
--global migration.x-multifd=on --global
migration.multifd-channels=100 -trace events=/home/quintela/tmp/events

CC: Daniel P. Berrangé 

Signed-off-by: Juan Quintela 
---
 migration/ram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/ram.c b/migration/ram.c
index 25a211c3fb..50586304a0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1248,6 +1248,7 @@ int multifd_save_setup(void)
         p->packet = g_malloc0(p->packet_len);
         p->name = g_strdup_printf("multifdsend_%d", i);
         socket_send_channel_create(multifd_new_send_channel_async, p);
+        usleep(10);
     }
     return 0;
 }
-- 
2.21.0