On Thu, Dec 26, 2013 at 09:13:31PM +1100, Alexey Kardashevskiy wrote:
> On 12/25/2013 08:52 PM, Michael S. Tsirkin wrote:
> > On Wed, Dec 25, 2013 at 12:36:12PM +1100, Alexey Kardashevskiy wrote:
> >> On 12/25/2013 02:43 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Dec 25, 2013 at 01:15:29AM +1100, Alexey Kardashevskiy wrote:
> >>>> On 12/24/2013 08:40 PM, Michael S. Tsirkin wrote:
> >>>>> On Tue, Dec 24, 2013 at 02:09:07PM +1100, Alexey Kardashevskiy wrote:
> >>>>>> On 12/24/2013 03:24 AM, Michael S. Tsirkin wrote:
> >>>>>>> On Mon, Dec 23, 2013 at 02:01:13AM +1100, Alexey Kardashevskiy wrote:
> >>>>>>>> On 12/23/2013 01:46 AM, Alexey Kardashevskiy wrote:
> >>>>>>>>> On 12/22/2013 09:56 PM, Michael S. Tsirkin wrote:
> >>>>>>>>>> On Sun, Dec 22, 2013 at 02:01:23AM +1100, Alexey Kardashevskiy
> >>>>>>>>>> wrote:
> >>>>>>>>>>> Hi!
> >>>>>>>>>>>
> >>>>>>>>>>> I am having a problem with virtio-net + vhost on POWER7 machine -
> >>>>>>>>>>> it does
> >>>>>>>>>>> not survive reboot of the guest.
> >>>>>>>>>>>
> >>>>>>>>>>> Steps to reproduce:
> >>>>>>>>>>> 1. boot the guest
> >>>>>>>>>>> 2. configure eth0 and do ping - everything works
> >>>>>>>>>>> 3. reboot the guest (i.e. type "reboot")
> >>>>>>>>>>> 4. when it is booted, eth0 can be configured but will not work at
> >>>>>>>>>>> all.
> >>>>>>>>>>>
> >>>>>>>>>>> The test is:
> >>>>>>>>>>> ifconfig eth0 172.20.1.2 up
> >>>>>>>>>>> ping 172.20.1.23
> >>>>>>>>>>>
> >>>>>>>>>>> If to run tcpdump on the host's "tap-id3" interface, it shows no
> >>>>>>>>>>> trafic
> >>>>>>>>>>> coming from the guest. If to compare how it works before and
> >>>>>>>>>>> after reboot,
> >>>>>>>>>>> I can see the guest doing an ARP request for 172.20.1.23 and
> >>>>>>>>>>> receives the
> >>>>>>>>>>> response and it does the same after reboot but the answer does
> >>>>>>>>>>> not come.
> >>>>>>>>>>
> >>>>>>>>>> So you see the arp packet in guest but not in host?
> >>>>>>>>>
> >>>>>>>>> Yes.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> One thing to try is to boot debug kernel - where pr_debug is
> >>>>>>>>>> enabled - then you might see some errors in the kernel log.
> >>>>>>>>>
> >>>>>>>>> Tried and added lot more debug printk myself, not clear at all what
> >>>>>>>>> is
> >>>>>>>>> happening there.
> >>>>>>>>>
> >>>>>>>>> One more hint - if I boot the guest and the guest does not bring
> >>>>>>>>> eth0 up
> >>>>>>>>> AND wait more than 200 seconds (and less than 210 seconds), then
> >>>>>>>>> eth0 will
> >>>>>>>>> not work at all. I.e. this script produces not-working-eth0:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ifconfig eth0 172.20.1.2 down
> >>>>>>>>> sleep 210
> >>>>>>>>> ifconfig eth0 172.20.1.2 up
> >>>>>>>>> ping 172.20.1.23
> >>>>>>>>>
> >>>>>>>>> s/210/200/ - and it starts working. No reboot is required to
> >>>>>>>>> reproduce.
> >>>>>>>>>
> >>>>>>>>> No "vhost" == always works. The only difference I can see here is
> >>>>>>>>> vhost's
> >>>>>>>>> thread which may get suspended if not used for a while after the
> >>>>>>>>> start and
> >>>>>>>>> does not wake up but this is almost a blind guess.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yet another clue - this host kernel patch seems to help with the
> >>>>>>>> guest
> >>>>>>>> reboot but does not help with the initial 210 seconds delay:
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> >>>>>>>> index 69068e0..5e67650 100644
> >>>>>>>> --- a/drivers/vhost/vhost.c
> >>>>>>>> +++ b/drivers/vhost/vhost.c
> >>>>>>>> @@ -162,10 +162,10 @@ void vhost_work_queue(struct vhost_dev *dev,
> >>>>>>>> struct
> >>>>>>>> vhost_work *work)
> >>>>>>>>                 list_add_tail(&work->node, &dev->work_list);
> >>>>>>>>                 work->queue_seq++;
> >>>>>>>>                 spin_unlock_irqrestore(&dev->work_lock, flags);
> >>>>>>>> -               wake_up_process(dev->worker);
> >>>>>>>>         } else {
> >>>>>>>>                 spin_unlock_irqrestore(&dev->work_lock, flags);
> >>>>>>>>         }
> >>>>>>>> +       wake_up_process(dev->worker);
> >>>>>>>>  }
> >>>>>>>>  EXPORT_SYMBOL_GPL(vhost_work_queue);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> Interesting. Some kind of race? A missing memory barrier somewhere?
> >>>>>>
> >>>>>> I do not see how. I boot the guest and just wait 210 seconds, nothing
> >>>>>> happens to cause races.
> >>>>>>
> >>>>>>
> >>>>>>> Since it's all around startup,
> >>>>>>> you can try kicking the host eventfd in
> >>>>>>> vhost_net_start.
> >>>>>>
> >>>>>>
> >>>>>> How exactly? This did not help. Thanks.
> >>>>>>
> >>>>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>>>>> index 006576d..407ecf2 100644
> >>>>>> --- a/hw/net/vhost_net.c
> >>>>>> +++ b/hw/net/vhost_net.c
> >>>>>> @@ -229,6 +229,17 @@ int vhost_net_start(VirtIODevice *dev,
> >>>>>> NetClientState
> >>>>>> *ncs,
> >>>>>>          if (r < 0) {
> >>>>>>              goto err;
> >>>>>>          }
> >>>>>> +
> >>>>>> +        VHostNetState *vn = tap_get_vhost_net(ncs[i].peer);
> >>>>>> +        struct vhost_vring_file file = {
> >>>>>> +            .index = i
> >>>>>> +        };
> >>>>>> +        file.fd =
> >>>>>> event_notifier_get_fd(virtio_queue_get_host_notifier(dev->vq));
> >>>>>> +        r = ioctl(vn->dev.control, VHOST_SET_VRING_KICK, &file);
> >>>>>
> >>>>> No, this sets the notifier, it does not kick.
> >>>>> To kick you write 1 there:
> >>>>>     uint6_t v = 1;
> >>>>>     write(fd, &v, sizeof v);
> >>>>
> >>>>
> >>>> Please, be precise. How/where do I get that @fd? Is what I do correct?
> >>>
> >>> Yes.
> >>>
> >>>> What
> >>>> is uint6_t - uint8_t or uint16_t (neither works)?
> >>>
> >>> Sorry, should have been uint64_t.
> >>
> >>
> >> Oh, that I missed :-) Anyway, this does not make any difference. Is there
> >> any cheap&dirty way to make vhost-net kernel thread always awake? Sending
> >> it signals from the user space does not work...
> >
> > You can run a timer in qemu and signal the eventfd from there
> > periodically.
> >
> > Just to restate, tcpdump in guest shows that guest sends arp packet,
> > but tcpdump in host on tun device does not show any packets?
> >
> Ok. Figured it out about disabling interfaces in Fedora19. I was wrong,
> something is happening on the host's TAP - the guest sends ARP request, the
> response is visible on the TAP interface but not in the guest.

Okay. So the problem is on the host-to-guest path then.

Things to try:

1. trace handle_rx [vhost_net]
2. trace tun_put_user [tun]
3. I suspect a host bug in one of the features.
   Let's try to disable some flags with device properties.
   You can get the list by doing:
   ./x86_64-softmmu/qemu-system-x86_64 -device virtio-net-pci,?|grep on/off
   The things I would try turning off are the guest offloads (the ones
   that start with guest_), event_idx, any_layout and mq.
   Turn them all off; if that helps, try to find the one that helped.

-- 
MST
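
For reference, the kick discussed earlier in the thread (writing a 64-bit 1
to the notifier's eventfd) and the idea of signalling it periodically can be
sketched in isolation as below. This is only a sketch under assumptions: it is
plain Linux C, not QEMU code; the helper names kick_eventfd and
kick_periodically are made up for illustration; and in the real setup fd would
have to be the host notifier's eventfd, while here any eventfd will do.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical helper: kick an eventfd by adding 1 to its 64-bit counter.
 * The value written must be exactly 8 bytes wide, hence uint64_t.
 * Returns 0 on success, -1 on a failed or short write. */
int kick_eventfd(int fd)
{
    uint64_t v = 1;
    ssize_t n = write(fd, &v, sizeof v);

    return n == (ssize_t)sizeof v ? 0 : -1;
}

/* Hypothetical helper: kick the eventfd `count` times, sleeping
 * `interval_ms` milliseconds after each kick. A blocking loop is used
 * only to keep the sketch standalone. */
int kick_periodically(int fd, unsigned count, unsigned interval_ms)
{
    struct timespec ts = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (long)(interval_ms % 1000) * 1000000L,
    };

    for (unsigned i = 0; i < count; i++) {
        if (kick_eventfd(fd) < 0) {
            return -1;
        }
        nanosleep(&ts, NULL);
    }
    return 0;
}
```

Inside QEMU the periodic part would presumably be driven from a timer
callback rather than a sleeping loop; only the 8-byte write is the essential
bit.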