Dear all, I recently found that this chageset: http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=e19eb22486f258a421108ac22b8380a4e2f16b97 "net: make use of async packet sending API in tap client" causes NFS root Linux guest setups using TAP networking to fail with error messages like: nfs: server 172.20.0.1 not responding, still trying nfs: server 172.20.0.1 OK < .... repeat infinitely ...> This happens on both the "master" and "stable-0.11" branches on qemu.
The attached '0001-net-revert-e19eb22486f258a421108ac22b8380a4e2f16b97.patch' makes NFS root on qemu emulated "arm-integrator-cp" boards work for me again. I've uploaded wireshark captures of qemu-0.10(good, nfsroot works) and qemu-0.11(bad) here: http://scottt.tw/bug/qemu-async-tap-drops-packets/qemu-nfsroot-good.pcap http://scottt.tw/bug/qemu-async-tap-drops-packets/qemu-nfsroot-bad.pcap Inspecting frame 268 in "qemu-nfsroot-bad.pcap", I see: "ICMP Fragment reassembly time exceeded", reply to request in frame 53, duplicate to the reply in frame 56 and suspect qemu is dropping ethernet frames from larger, fragmented IP packets used for NFS READ replies. After finding: http://lists.gnu.org/archive/html/qemu-devel/2009-09/msg01173.html through Google and reading through the potentially bad commits that Sven found through bisection, I patched "tap_send()" to not run in a loop ("drain the tap send queue in one go"?) and the error goes away. To reproduce this, download "zImage", "scripts/" and "trigger-bug" from: http://scottt.tw/bug/qemu-async-tap-drops-packets/ and run the "trigger-bug" script I've only just started reading linux/Documentation/networking/tuntap.txt after encountering this problem and currently find the code called from "tap_send()" ex: qemu_send_packet_async, qemu_deliver_packet and the semantics of their return values pretty confusing. I'm sure my patch should be refined to both make both NFS root and the originally intended optimization work.
From ffde3feb1922c1d5663ace9d5a32eacaa26491dd Mon Sep 17 00:00:00 2001 From: Scott Tsai <scottt...@gmail.com> Date: Sun, 25 Oct 2009 01:52:36 +0800 Subject: [PATCH] net: revert e19eb22486f258a421108ac22b8380a4e2f16b97 to fix nfsroot --- net.c | 19 ++++++++----------- 1 files changed, 8 insertions(+), 11 deletions(-) diff --git a/net.c b/net.c index 4708080..39dcd04 100644 --- a/net.c +++ b/net.c @@ -1357,17 +1357,14 @@ static void tap_send(void *opaque) TAPState *s = opaque; int size; - do { - size = tap_read_packet(s->fd, s->buf, sizeof(s->buf)); - if (size <= 0) { - break; - } - - size = qemu_send_packet_async(s->vc, s->buf, size, tap_send_completed); - if (size == 0) { - tap_read_poll(s, 0); - } - } while (size > 0); + size = tap_read_packet(s->fd, s->buf, sizeof(s->buf)); + if (size <= 0) { + return; + } + size = qemu_send_packet_async(s->vc, s->buf, size, tap_send_completed); + if (size == 0) { + tap_read_poll(s, 0); + } } #ifdef TUNSETSNDBUF -- 1.6.2.5