Bug#592187: Bug#576838: virtio network crashes again

2010-08-15 Thread Lukas Kolbe
Hi Ben, Greg,

I was finally able to identify the patch series that fixes the problem
(it entered -stable in 2.6.33.2):

cb63112 net: add __must_check to sk_add_backlog
a12a9a2 net: backlog functions rename
51c5db4 x25: use limited socket backlog
c531ab2 tipc: use limited socket backlog
37d60aa sctp: use limited socket backlog
9b3d968 llc: use limited socket backlog
230401e udp: use limited socket backlog
20a92ec tcp: use limited socket backlog
ab9dd05 net: add limit for socket backlog

After applying these to 2.6.32.17, I wasn't able to trigger the failure
anymore.

230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
so there might be some additional work needed.

@Greg: would it be possible to have these fixes in the next 2.6.32 stable
release? See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for
details: they fix a guest network crash during heavy NFS I/O over virtio.

Kind regards,
Lukas





-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1281857854.2475.80.ca...@larosa.fritz.box



Bug#592187: Bug#576838: virtio network crashes again

2010-08-11 Thread Lukas Kolbe
Am Mittwoch, den 11.08.2010, 04:13 +0100 schrieb Ben Hutchings:
 On Mon, 2010-08-09 at 11:24 +0200, Lukas Kolbe wrote:
  So, testing begins.
  
  First conclusion: not all traffic patterns produce the page allocation
  failure. rdiff-backup only writing to an nfs-share does no harm;
  rdiff-backup reading and writing (incremental backup) leads to (nearly
  immediate) error.
  
  The nfs-share is always mounted with proto=tcp and nfsv3; /proc/mounts says:
  fileserver.backup...:/export/backup/lbork /.cbackup-mp nfs 
  rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=65535,timeo=600,retrans=2,sec=sys,mountport=65535,addr=x.x.x.x
   0 0
 [...]
 
 I've seen some recent discussion of a bug in the Linux NFS client that
 can cause it to stop working entirely after certain packet loss events;
 see https://bugzilla.kernel.org/show_bug.cgi?id=16494.  It is possible
 that you are running into that bug.  I haven't yet seen agreement on
 the fix for it.

Thanks, I'll look into it. I ran some further tests with vanilla and
debian kernels:

VERSION              WORKING
-------------------  -------------------------
2.6.35               yes
2.6.33.6             yes
2.6.32.17            doesn't boot as kvm guest
2.6.32.17-2.6.32-19  no
2.6.32.17-2.6.32-18  no
2.6.32.16            no

I don't know if this is related to #16494 since I'm unable to trigger it
on 2.6.33.6 or 2.6.35. I'll test 2.6.32 with the patch from
http://lkml.org/lkml/2010/8/10/52 applied as well and bisect between
2.6.32.17 and 2.6.33.6 in the next few days.

 I also wonder whether the extremely large request sizes (rsize and
 wsize) you have selected are more likely to trigger the allocation
 failure in virtio_net.  Please can you test whether reducing them helps?

The large rsize/wsize values were chosen automatically, but I'll test with
a failing kernel and an [rw]size of 32768.
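For reference, a reduced-size test mount could look like the following (the hostname and mount point are placeholders; the other options mirror the /proc/mounts line quoted above, and the client/server may still negotiate the sizes down further):

```shell
# Placeholder host/path -- substitute the real fileserver and mount point.
mount -t nfs -o vers=3,proto=tcp,hard,timeo=600,retrans=2,rsize=32768,wsize=32768 \
    fileserver.example:/export/backup /mnt/backup-test

# Confirm the rsize/wsize that were actually negotiated:
grep /mnt/backup-test /proc/mounts
```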

Kind regards,
Lukas








Bug#592187: Bug#576838: virtio network crashes again

2010-08-10 Thread Ben Hutchings
On Mon, 2010-08-09 at 11:24 +0200, Lukas Kolbe wrote:
 So, testing begins.
 
 First conclusion: not all traffic patterns produce the page allocation
 failure. rdiff-backup only writing to an nfs-share does no harm;
 rdiff-backup reading and writing (incremental backup) leads to (nearly
 immediate) error.
 
 The nfs-share is always mounted with proto=tcp and nfsv3; /proc/mounts says:
 fileserver.backup...:/export/backup/lbork /.cbackup-mp nfs 
 rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=65535,timeo=600,retrans=2,sec=sys,mountport=65535,addr=x.x.x.x
  0 0
[...]

I've seen some recent discussion of a bug in the Linux NFS client that
can cause it to stop working entirely after certain packet loss events;
see https://bugzilla.kernel.org/show_bug.cgi?id=16494.  It is possible
that you are running into that bug.  I haven't yet seen agreement on
the fix for it.

I also wonder whether the extremely large request sizes (rsize and
wsize) you have selected are more likely to trigger the allocation
failure in virtio_net.  Please can you test whether reducing them helps?

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.


signature.asc
Description: This is a digitally signed message part


Bug#592187: Bug#576838: virtio network crashes again

2010-08-09 Thread Lukas Kolbe
Hi Ben,

Am Sonntag, den 08.08.2010, 03:36 +0100 schrieb Ben Hutchings:
 This is not the same bug as was originally reported, which is that
 virtio_net failed to retry refilling its RX buffer ring.  That is
 definitely fixed.  So I'm treating this as a new bug report, #592187.

Okay, thanks. 

   I think you need to give your guests more memory.
  
  They all have between 512M and 2G - and it happens to all of them using
  virtio_net, and none of them using rtl8139 as a network driver,
  reproducibly.
 
 The RTL8139 hardware uses a single fixed RX DMA buffer.  The virtio
 'hardware' allows the host to write into RX buffers anywhere in guest
 memory.  This results in very different allocation patterns.
 
 Please try specifying 'e1000' hardware, i.e. an Intel gigabit
 controller.  I think the e1000 driver will have a similar allocation
 pattern to virtio_net, so you can see whether it also triggers
 allocation failures and a network stall in the guest.
 
 Also, please test Linux 2.6.35 in the guest.  This is packaged in the
 'experimental' suite.

I'll rig up a test machine (the crashes all occurred on production
guests, unfortunately) and report back.

 [...]
  If it would be an OOM situation, wouldn't the OOM-killer be supposed to
  kick in?
 [...]
 
 The log you sent shows failure to allocate memory in an 'atomic' context
 where there is no opportunity to wait for pages to be swapped out.  The
 OOM killer isn't triggered until the system is running out of memory
 despite swapping out pages.

Ah, good to know, thanks!

 Also, I note that following the failure of virtio_net to refill its RX
 buffer ring, I see failures to allocate buffers for sending TCP ACKs.
 So the guest drops the ACKs, and that TCP connection will stall
 temporarily (until the peer re-sends the unacknowledged packets).
 
 I also see 'nfs: server fileserver.backup.TechFak.Uni-Bielefeld.DE not
 responding, still trying'.  This suggests that the allocation failure in
 virtio_net has resulted in dropping packets from the NFS server.  And it
 just makes matters worse as it becomes impossible to free memory by
 flushing out buffers over NFS!

This sounds quite bad. 

This problem *seems* to be fixed by 2.6.32-19: we upgraded to that on a
different machine for host and guests, and an rsync of ~1TiB of data
didn't produce any page allocation failures using virtio. But I'd wait
for my tests with rsync/nfs and 2.6.32-18+e1000, 2.6.32-18+virtio,
2.6.32-19+virtio and 2.6.35+virtio before concluding that.

Thanks for taking the time to explain things!

-- 
Lukas








Bug#592187: Bug#576838: virtio network crashes again

2010-08-09 Thread Lukas Kolbe
So, testing begins.

First conclusion: not all traffic patterns produce the page allocation
failure. rdiff-backup only writing to an nfs-share does no harm;
rdiff-backup reading and writing (an incremental backup) leads to a
(nearly immediate) failure.

The nfs-share is always mounted with proto=tcp and nfsv3; /proc/mounts says:
fileserver.backup...:/export/backup/lbork /.cbackup-mp nfs 
rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=65535,timeo=600,retrans=2,sec=sys,mountport=65535,addr=x.x.x.x
 0 0


This is the result of 2.6.32-18 with virtio:

(/proc/meminfo within ten seconds of the page allocation failure, if
that helps)

MemTotal: 509072 kB
MemFree:   10356 kB
Buffers:4244 kB
Cached:   419996 kB
SwapCached:0 kB
Active:50856 kB
Inactive: 422424 kB
Active(anon):  24948 kB
Inactive(anon):25084 kB
Active(file):  25908 kB
Inactive(file):   397340 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal:   4194296 kB
SwapFree:4194296 kB
Dirty:  5056 kB
Writeback: 0 kB
AnonPages: 49080 kB
Mapped: 7868 kB
Shmem:   952 kB
Slab:  11736 kB
SReclaimable:   5604 kB
SUnreclaim: 6132 kB
KernelStack:1920 kB
PageTables: 3728 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit: 4448832 kB
Committed_AS:1419384 kB
VmallocTotal:   34359738367 kB
VmallocUsed:5536 kB
VmallocChunk:   34359728048 kB
HardwareCorrupted: 0 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:8180 kB
DirectMap2M:  516096 kB
[  170.625928] rdiff-backup.bi: page allocation failure. order:0, mode:0x20
[  170.625934] Pid: 2398, comm: rdiff-backup.bi Not tainted 2.6.32-5-amd64 #1
[  170.625935] Call Trace:
[  170.625937]  <IRQ>  [810b8b7f] ? __alloc_pages_nodemask+0x55b/0x5d0
[  170.625993]  [81245a6c] ? __alloc_skb+0x69/0x15a
[  170.626002]  [a01aee52] ? try_fill_recv+0x8b/0x18b [virtio_net]
[  170.626004]  [a01af8c6] ? virtnet_poll+0x543/0x5c9 [virtio_net]
[  170.626010]  [8124cb8b] ? net_rx_action+0xae/0x1c9
[  170.626032]  [81052735] ? __do_softirq+0xdd/0x1a0
[  170.626035]  [a01ae153] ? skb_recv_done+0x28/0x34 [virtio_net]
[  170.626044]  [81011cac] ? call_softirq+0x1c/0x30
[  170.626049]  [81013207] ? do_softirq+0x3f/0x7c
[  170.626051]  [810525a4] ? irq_exit+0x36/0x76
[  170.626053]  [810128fe] ? do_IRQ+0xa0/0xb6
[  170.626061]  [810114d3] ? ret_from_intr+0x0/0x11
[  170.626062]  <EOI>
[  170.626063] Mem-Info:
[  170.626065] Node 0 DMA per-cpu:
[  170.626072] CPU0: hi:0, btch:   1 usd:   0
[  170.626073] CPU1: hi:0, btch:   1 usd:   0
[  170.626074] Node 0 DMA32 per-cpu:
[  170.626076] CPU0: hi:  186, btch:  31 usd:  30
[  170.626078] CPU1: hi:  186, btch:  31 usd: 181
[  170.626082] active_anon:6237 inactive_anon:6271 isolated_anon:0
[  170.626083]  active_file:6476 inactive_file:100535 isolated_file:32
[  170.626084]  unevictable:0 dirty:1008 writeback:0 unstable:2050
[  170.626084]  free:729 slab_reclaimable:1401 slab_unreclaimable:1762
[  170.626085]  mapped:1967 shmem:238 pagetables:932 bounce:0
[  170.626087] Node 0 DMA free:1980kB min:84kB low:104kB high:124kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:13856kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15372kB 
mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:32kB 
slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  170.626099] lowmem_reserve[]: 0 489 489 489
[  170.626101] Node 0 DMA32 free:936kB min:2784kB low:3480kB high:4176kB 
active_anon:24948kB inactive_anon:25084kB active_file:25904kB 
inactive_file:388284kB unevictable:0kB isolated(anon):0kB isolated(file):128kB 
present:500948kB mlocked:0kB dirty:4032kB writeback:0kB mapped:7868kB 
shmem:952kB slab_reclaimable:5572kB slab_unreclaimable:7040kB 
kernel_stack:1912kB pagetables:3728kB unstable:8200kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  170.626110] lowmem_reserve[]: 0 0 0 0
[  170.626112] Node 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 
1*512kB 1*1024kB 0*2048kB 0*4096kB = 1976kB
[  170.626118] Node 0 DMA32: 0*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 1*256kB 
1*512kB 0*1024kB 0*2048kB 0*4096kB = 936kB
[  170.626125] 107278 total pagecache pages
[  170.626126] 0 pages in swap cache
[  170.626127] Swap cache stats: add 0, delete 0, find 0/0
[  170.626128] Free swap  = 4194296kB
[  170.626130] Total swap = 4194296kB
[  170.631675] 131069 pages RAM
[  170.631677] 3801 pages reserved
[  170.631678] 23548 pages shared
[  170.631679] 113310 

Bug#592187: Bug#576838: virtio network crashes again

2010-08-09 Thread Lukas Kolbe
Okay, next round: This time, 2.6.32-19 and virtio in guest, 2.6.32-18 in
the host and sadly, it's not fixed:

[  159.772700] rdiff-backup.bi: page allocation failure. order:0, mode:0x20
[  159.772708] Pid: 2524, comm: rdiff-backup.bi Not tainted 2.6.32-5-amd64 #1
[  159.772710] Call Trace:
[  159.772712]  <IRQ>  [810b8b6f] ? __alloc_pages_nodemask+0x55b/0x5d0
[  159.772759]  [81245b2c] ? __alloc_skb+0x69/0x15a
[  159.772779]  [a0202e52] ? try_fill_recv+0x8b/0x18b [virtio_net]
[  159.772784]  [a02038c6] ? virtnet_poll+0x543/0x5c9 [virtio_net]
[  159.772799]  [8124cc4b] ? net_rx_action+0xae/0x1c9
[  159.772817]  [8105274d] ? __do_softirq+0xdd/0x1a0
[  159.772829]  [a0202153] ? skb_recv_done+0x28/0x34 [virtio_net]
[  159.772838]  [81011cac] ? call_softirq+0x1c/0x30
[  159.772843]  [81013207] ? do_softirq+0x3f/0x7c
[  159.772845]  [810525bc] ? irq_exit+0x36/0x76
[  159.772847]  [810128fe] ? do_IRQ+0xa0/0xb6
[  159.772850]  [810114d3] ? ret_from_intr+0x0/0x11
[  159.772851]  <EOI>  [81242beb] ? kmap_skb_frag+0x3/0x43
[  159.772856]  [81243b2d] ? skb_checksum+0xfa/0x23f
[  159.772858]  [8124726d] ? __skb_checksum_complete_head+0x15/0x55
[  159.772868]  [81282d4f] ? tcp_checksum_complete_user+0x1f/0x3c
[  159.772870]  [812835dd] ? tcp_rcv_established+0x3c5/0x6d9
[  159.772875]  [8128a87b] ? tcp_v4_do_rcv+0x1bb/0x376
[  159.772877]  [812876e8] ? tcp_write_xmit+0x883/0x96c
[  159.772880]  [81240ac1] ? release_sock+0x46/0x96
[  159.772882]  [8127ca05] ? tcp_sendmsg+0x78a/0x87e
[  159.772885]  [8123e515] ? sock_sendmsg+0xa3/0xbb
[  159.772894]  [8106386e] ? autoremove_wake_function+0x0/0x2e
[  159.772902]  [810c5b9c] ? zone_statistics+0x3c/0x5d
[  159.772906]  [8104085f] ? pick_next_task_fair+0xcd/0xd8
[  159.772919]  [8123e814] ? kernel_sendmsg+0x32/0x3f
[  159.772943]  [a02acca6] ? xs_send_kvec+0x78/0x7f [sunrpc]
[  159.772948]  [a02acd36] ? xs_sendpages+0x89/0x1a1 [sunrpc]
[  159.772953]  [a02acf43] ? xs_tcp_send_request+0x44/0x131 [sunrpc]
[  159.772961]  [a02ab263] ? xprt_transmit+0x17b/0x25a [sunrpc]
[  159.772996]  [a033af51] ? nfs3_xdr_readargs+0x7a/0x89 [nfs]
[  340.048248] serial8250: too much work for irq4
[  159.773000]  [a02a8c14] ? call_transmit+0x1fb/0x246 [sunrpc]
[  159.773009]  [a02af2ab] ? __rpc_execute+0x7d/0x24d [sunrpc]
[  159.773032]  [a02a94d4] ? rpc_run_task+0x53/0x5b [sunrpc]
[  159.773042]  [a0334844] ? nfs_read_rpcsetup+0x1d2/0x1f4 [nfs]
[  159.773048]  [a03344b5] ? readpage_async_filler+0x0/0xbf [nfs]
[  159.773061]  [a0332c14] ? nfs_pageio_doio+0x2a/0x51 [nfs]
[  159.773067]  [a0332d00] ? nfs_pageio_add_request+0xc5/0xd5 [nfs]
[  159.773072]  [a0334532] ? readpage_async_filler+0x7d/0xbf [nfs]
[  159.773076]  [810ba59c] ? read_cache_pages+0x91/0x105
[  159.773082]  [a033430a] ? nfs_readpages+0x155/0x1b4 [nfs]
[  159.773087]  [a0334be0] ? nfs_pagein_one+0x0/0xd0 [nfs]
[  159.773092]  [81046ccf] ? finish_task_switch+0x3a/0xaf
[  159.773094]  [810ba0b9] ? __do_page_cache_readahead+0x11b/0x1b4
[  159.773097]  [810ba16e] ? ra_submit+0x1c/0x20
[  159.773099]  [810ba45d] ? page_cache_async_readahead+0x75/0xad
[  159.773109]  [810b3c82] ? generic_file_aio_read+0x23a/0x52b
[  159.773118]  [810ecdc9] ? do_sync_read+0xce/0x113
[  159.773124]  [8100f79c] ? __switch_to+0x285/0x297
[  159.773126]  [8106386e] ? autoremove_wake_function+0x0/0x2e
[  159.773129]  [81046ccf] ? finish_task_switch+0x3a/0xaf
[  159.773131]  [810ed812] ? vfs_read+0xa6/0xff
[  159.773133]  [810ed927] ? sys_read+0x45/0x6e
[  159.773136]  [81010b42] ? system_call_fastpath+0x16/0x1b
[  159.773138] Mem-Info:
[  159.773139] Node 0 DMA per-cpu:
[  159.773141] CPU0: hi:0, btch:   1 usd:   0
[  159.773143] CPU1: hi:0, btch:   1 usd:   0
[  159.773144] Node 0 DMA32 per-cpu:
[  159.773146] CPU0: hi:  186, btch:  31 usd: 184
[  159.773147] CPU1: hi:  186, btch:  31 usd:  39
[  159.773151] active_anon:5153 inactive_anon:2765 isolated_anon:0
[  159.773152]  active_file:17029 inactive_file:65343 isolated_file:0
[  159.773153]  unevictable:0 dirty:8266 writeback:0 unstable:443
[  159.773154]  free:787 slab_reclaimable:25621 slab_unreclaimable:3017
[  159.773154]  mapped:1946 shmem:238 pagetables:921 bounce:0
[  159.773156] Node 0 DMA free:1992kB min:84kB low:104kB high:124kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:3276kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15372kB 
mlocked:0kB dirty:1232kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:60kB kernel_stack:0kB pagetables:0kB 
unstable:0kB bounce:0kB writeback_tmp:0kB 

Bug#592187: Bug#576838: virtio network crashes again

2010-08-07 Thread Ben Hutchings
This is not the same bug as was originally reported, which is that
virtio_net failed to retry refilling its RX buffer ring.  That is
definitely fixed.  So I'm treating this as a new bug report, #592187.

On Sat, 2010-08-07 at 18:17 +0200, Lukas Kolbe wrote:
 Am Samstag, den 07.08.2010, 12:18 +0100 schrieb Ben Hutchings:
  On Sat, 2010-08-07 at 11:21 +0200, Lukas Kolbe wrote:
   Hi,
   
   I sent this earlier today but the bug was archived so it didn't appear
   anywhere, hence the resend.
   
   I believe this issue is not fixed at all in 2.6.32-18. We have seen this
   behaviour in various kvm guests using virtio_net with the same kernel in
   the guest only minutes after starting the nightly backup (rdiff-backup
   to an nfs-volume on a remote server), eventually leading to a
   non-functional network. Often, the machines do not even reboot, and
   instead hang. Using rtl8139 instead of virtio helps, but that's really
   only a clumsy workaround.
  [...]
  
  I think you need to give your guests more memory.
 
 They all have between 512M and 2G - and it happens to all of them using
 virtio_net, and none of them using rtl8139 as a network driver,
 reproducibly.

The RTL8139 hardware uses a single fixed RX DMA buffer.  The virtio
'hardware' allows the host to write into RX buffers anywhere in guest
memory.  This results in very different allocation patterns.

Please try specifying 'e1000' hardware, i.e. an Intel gigabit
controller.  I think the e1000 driver will have a similar allocation
pattern to virtio_net, so you can see whether it also triggers
allocation failures and a network stall in the guest.

Also, please test Linux 2.6.35 in the guest.  This is packaged in the
'experimental' suite.

[...]
 If it would be an OOM situation, wouldn't the OOM-killer be supposed to
 kick in?
[...]

The log you sent shows failure to allocate memory in an 'atomic' context
where there is no opportunity to wait for pages to be swapped out.  The
OOM killer isn't triggered until the system is running out of memory
despite swapping out pages.

Also, I note that following the failure of virtio_net to refill its RX
buffer ring, I see failures to allocate buffers for sending TCP ACKs.
So the guest drops the ACKs, and that TCP connection will stall
temporarily (until the peer re-sends the unacknowledged packets).

I also see 'nfs: server fileserver.backup.TechFak.Uni-Bielefeld.DE not
responding, still trying'.  This suggests that the allocation failure in
virtio_net has resulted in dropping packets from the NFS server.  And it
just makes matters worse as it becomes impossible to free memory by
flushing out buffers over NFS!

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

