Several times over the last few days, I've lost network connectivity
from one of my guests. This has happened only during interactive
sessions in which I take an action resulting in a large screen update. I
have tried flood pinging (only with the default, small packet size), but
have not been able to reproduce the problem that way.
This guest is running a current RHEL5/CentOS kernel (2.6.18-53.1.13.el5)
with clocksource=acpi_pm on the command line. The host side of the
network is a tap device, which is joined to a bridge. Other VMs on the
bridge still have working bidirectional networking, and dmesg on the
host shows the relevant port on the bridge in forwarding state.
Rebooting the guest without shutting down the kvm instance does not
resolve the issue. Powering down the VM and starting a new kvm instance
*does* resolve the issue.
Within the guest, tcpdump sees both incoming and outgoing traffic. On
the host, however, only traffic going *to* the guest is visible;
traffic the guest attempts to send never shows up.
When the guest attempts to send, none of the counters (RX packets/TX
packets/etc) increase on the emulated e1000 device within the guest; RX
bytes and TX bytes are both 0, and all the error counters are likewise
zero. The e1000 module can be reloaded without any visible errors.
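For anyone who wants to repeat the counter check, here is a minimal sketch of how it could be scripted inside the guest. It assumes Python 3 and that the kernel exposes per-interface statistics under /sys/class/net/<iface>/statistics. The interface name "eth0" and the helper names read_counters and counter_delta are illustrative only, not something from my setup.

```python
import os

# Counter files read from sysfs; the list is illustrative, not exhaustive.
COUNTERS = ("rx_packets", "tx_packets", "rx_bytes", "tx_bytes",
            "rx_errors", "tx_errors")

def read_counters(iface, root="/sys/class/net"):
    """Return a snapshot of the interface statistics as a dict of ints."""
    stats_dir = os.path.join(root, iface, "statistics")
    values = {}
    for name in COUNTERS:
        with open(os.path.join(stats_dir, name)) as f:
            values[name] = int(f.read().strip())
    return values

def counter_delta(before, after):
    """Per-counter difference between two snapshots."""
    return {name: after[name] - before[name] for name in before}
```

Taking one snapshot with read_counters("eth0"), sending some traffic, taking a second snapshot and passing both to counter_delta should show all zeros if the counters really are stuck.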
Where should I start in attempting to debug this?
For reference, the e1000 register dump from the guest (ethtool -d format) follows.
MAC Registers
-------------
0x00000: CTRL (Device control register) 0x00140240
Endian mode (buffers): little
Link reset: normal
Set link up: 1
Invert Loss-Of-Signal: no
Receive flow control: disabled
Transmit flow control: disabled
VLAN mode: disabled
Auto speed detect: disabled
Speed select: 1000Mb/s
Force speed: no
Force duplex: no
0x00008: STATUS (Device status register) 0x80080783
Duplex: full
Link up: link config
TBI mode: disabled
Link speed: 1000Mb/s
Bus type: PCI
Bus speed: 33MHz
Bus width: 32-bit
0x00100: RCTL (Receive control register) 0x00008002
Receiver: enabled
Store bad packets: disabled
Unicast promiscuous: disabled
Multicast promiscuous: disabled
Long packet: disabled
Descriptor minimum threshold size: 1/2
Broadcast accept mode: accept
VLAN filter: disabled
Cononical form indicator: disabled
Discard pause frames: filtered
Pass MAC control frames: don't pass
Receive buffer size: 2048
0x02808: RDLEN (Receive desc length) 0x00000000
0x02810: RDH (Receive desc head) 0x00000000
0x02818: RDT (Receive desc tail) 0x000000FE
0x02820: RDTR (Receive delay timer) 0x00000000
0x00400: TCTL (Transmit ctrl register) 0x0103F0FA
Transmitter: enabled
Pad short packets: enabled
Software XOFF Transmission: disabled
Re-transmit on late collision: enabled
0x03808: TDLEN (Transmit desc length) 0x00000000
0x03810: TDH (Transmit desc head) 0x0000000F
0x03818: TDT (Transmit desc tail) 0x0000000F
0x03820: TIDV (Transmit delay timer) 0x00000000
PHY type: M88
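As a cross-check on the dump above, here is a small sketch that decodes a few STATUS bits and computes how many descriptors sit between the head and tail pointers of a ring. The bit positions follow the usual e1000 STATUS layout (FD in bit 0, LU in bit 1, TBIMODE in bit 5, SPEED in bits 7:6). The ring size of 256 is purely an assumption, since RDLEN and TDLEN read back as 0 in the dump, and the function names are made up for illustration.

```python
def decode_status(status):
    """Decode a few fields of the e1000 STATUS register."""
    speeds = {0: "10Mb/s", 1: "100Mb/s", 2: "1000Mb/s", 3: "1000Mb/s"}
    return {
        "full_duplex": bool(status & 0x1),      # bit 0: FD
        "link_up": bool(status & 0x2),          # bit 1: LU
        "tbi_mode": bool(status & 0x20),        # bit 5: TBIMODE
        "speed": speeds[(status >> 6) & 0x3],   # bits 7:6: SPEED
    }

def pending(head, tail, ring_size):
    """Descriptors from head up to (not including) tail, with wraparound."""
    return (tail - head) % ring_size

ASSUMED_RING_SIZE = 256  # assumption: not recoverable from the dump

print(decode_status(0x80080783))
print("TX pending:", pending(0x0F, 0x0F, ASSUMED_RING_SIZE))
print("RX available:", pending(0x00, 0xFE, ASSUMED_RING_SIZE))
```

With the values from the dump, TDH equals TDT, so under this reading the transmit ring has no descriptors waiting on the device. That would be consistent with the device model believing it has already handled everything the driver queued, though it does not by itself say where the frames went.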