On Thu, Jan 24, 2019 at 09:11:15AM +0000, Dr. David Alan Gilbert wrote:
> * Jason Wang (jasow...@redhat.com) wrote:
> >
> > On 2019/1/24 3:53 AM, Dr. David Alan Gilbert wrote:
> > > * Jason Wang (jasow...@redhat.com) wrote:
> > > > On 2019/1/22 2:56 AM, Peter Maydell wrote:
> > > > > On Thu, 17 Jan 2019 at 09:46, Jason Wang <jasow...@redhat.com> wrote:
> > > > > > On 2019/1/15 12:33 AM, Zhang Chen wrote:
> > > > > > > On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
> > > > > > > <dgilb...@redhat.com> wrote:
> > > > > > >
> > > > > > > * Peter Maydell (peter.mayd...@linaro.org) wrote:
> > > > > > > > Recently I've noticed that test-filter-mirror has been hanging
> > > > > > > > intermittently, typically when run on some other TCG architecture.
> > > > > > > > In the instance I've just looked at, this was with s390x guest on
> > > > > > > > x86-64 host, though I've also seen it on other host archs and
> > > > > > > > perhaps with other guests.
> > > > > > >
> > > > > > > Watch out to see if you really do see it for other guests;
> > > > > > > it carefully avoids using virtio-net to avoid vhost; but on s390x it
> > > > > > > uses virtio-net-ccw - could that hit the vhost it was trying to avoid?
> > > > > > >
> > > > > > > > Below is a backtrace, though it seems to be pretty unhelpful.
> > > > > > > > Anybody got any theories ? Does the mirror test rely on dirty
> > > > > > > > memory bitmaps like the migration test (which also hangs
> > > > > > > > occasionally with TCG due to some bug I'm sure we've investigated
> > > > > > > > in the past) ?
> > > > > > >
> > > > > > > I don't think it relies on the CPU at all.
> > > > > > > I have no idea about this currently, but Jason and I designed the
> > > > > > > test case.
> > > > > > > Add Jason: Have any comments about this ?
> > > > > > I can't reproduce this locally with s390x-softmmu. It looks to me the
> > > > > > test should be independent to any kinds of emulation. It should pass
> > > > > > when mainloop work.
> > > > > I've just seen a hang with ppc64 guest on s390x host, so it is
> > > > > indeed not specific to s390x guest (and so not specific to
> > > > > virtio-net either, since the ppc64 guest setup uses e1000).
> > > > >
> > > > > thanks
> > > > > -- PMM
> > > > Finally reproduced locally after hundreds (sometimes thousands) times of
> > > > running.
> > > >
> > > > Bisection points to OOB monitor[1].
> > > >
> > > > It looks to me after OOB is used unconditionally we lose a barrier to make
> > > > sure socket is connected before sending packets in test-filter-mirror.c. Is
> > > > there any other similar and simple thing that we could do to kick the
> > > > mainloop?
> > > Do you mean the:
> > >
> > >     /* send a qmp command to guarantee that 'connected' is setting to true. */
> > >     qmp_discard_response(qts, "{ 'execute' : 'query-status'}");
> >
> >
> > Yes.
> >
> >
> > >
> > > why was that ever sufficient to know the socket was ready?
> >
> >
> > It was suggested by Fam, I don't remember the details. Can we make sure all
> > pending events has been processed (UNIX socket was set to connected) after
> > query-status is returned with an non OOB monitor?
>
> I'm not sure - it doesn't sound like a 'query-status' should ensure
> anything else.
> How about something like a 'query-chardev' - can that tell you what you
> need and loop until it's ready?

Yeah, it sounds hacky to use "query-status" to make sure a specific
chardev is connected, even before the OOB...

I saw that currently the chardev requires "nowait":

    qts = qtest_initf(
        "-netdev socket,id=qtest-bn0,fd=%d "
        "-device %s,netdev=qtest-bn0,id=qtest-e0 "
        "-chardev socket,id=mirror0,path=%s,server,nowait "
        "-object filter-mirror,id=qtest-f0,netdev=qtest-bn0,queue=tx,outdev=mirror0 "
        , send_sock[1], devstr, sock_path);

Could it work without "nowait"?  Would that make sure QEMU waits until
the connection is established before going on?

Regards,

-- 
Peter Xu
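
For reference, here is a minimal sketch of the "loop until it's ready"
idea behind the 'query-chardev' suggestion quoted above, written as a
generic C helper. Everything in it is hypothetical: poll_until_ready(),
ready_fn and the fake_chardev stand-in are illustrative names, not
libqtest API. In the real test the predicate would have to send
'query-chardev' over QMP and inspect the reply for mirror0; which part
of that reply reflects the socket state is an assumption that would
need checking.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the condition holds. */
typedef bool (*ready_fn)(void *opaque);

/*
 * Call ready() up to max_tries times, sleeping interval_ms between
 * attempts.  Returns true as soon as ready() reports success, false
 * if every attempt failed.
 */
bool poll_until_ready(ready_fn ready, void *opaque,
                      int max_tries, long interval_ms)
{
    struct timespec delay = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (interval_ms % 1000) * 1000000L,
    };

    for (int i = 0; i < max_tries; i++) {
        if (ready(opaque)) {
            return true;
        }
        if (i + 1 < max_tries) {
            nanosleep(&delay, NULL);   /* back off before retrying */
        }
    }
    return false;
}

/*
 * Stand-in for the real check (assumption: the real one would query
 * QEMU).  It reports "connected" only after a fixed number of calls.
 */
struct fake_chardev {
    int calls_until_connected;
};

bool fake_chardev_connected(void *opaque)
{
    struct fake_chardev *c = opaque;

    if (c->calls_until_connected > 0) {
        c->calls_until_connected--;
        return false;
    }
    return true;
}
```

The test would call something like poll_until_ready(check, qts, 100, 10)
right after qtest_initf() and only send the packet once it returns true.
Bounding the number of tries means a broken setup shows up as a test
failure rather than another hang.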