Hi

I am still getting network issues with a 2.6.28.9 host, though fewer
than before.

I found quite a few of these call traces in the 2.6.29.1 guests. The
guest has 512MB of memory and was not all that busy (just network
traffic), so I don't understand why it would fail to allocate a page:


[701453.834571] kjournald: page allocation failure. order:0, mode:0x4020
[701453.834574] Pid: 4806, comm: kjournald Not tainted 2.6.29.1 #4
[701453.834576] Call Trace:
[701453.834578]  <IRQ>  [<ffffffff8027fa48>] __alloc_pages_internal+0x3e1/0x401
[701453.834586]  [<ffffffff802a1ad4>] __slab_alloc+0x17f/0x4ca
[701453.834590]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
[701453.834592]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
[701453.834595]  [<ffffffff802a2e66>] __kmalloc_track_caller+0xac/0xe1
[701453.834598]  [<ffffffff8062f97e>] __alloc_skb+0x61/0x11e
[701453.834600]  [<ffffffff8067e322>] tcp_send_ack+0x23/0x105
[701453.834603]  [<ffffffff8067c374>] tcp_rcv_established+0x6c7/0x9e6
[701453.834605]  [<ffffffff80683515>] tcp_v4_do_rcv+0x19e/0x324
[701453.834608]  [<ffffffff80683b23>] tcp_v4_rcv+0x488/0x73b
[701453.834611]  [<ffffffff806499c4>] nf_hook_slow+0x62/0xc3
[701453.834615]  [<ffffffff8066925c>] ip_local_deliver_finish+0x0/0x1ee
[701453.834617]  [<ffffffff80669378>] ip_local_deliver_finish+0x11c/0x1ee
[701453.834620]  [<ffffffff80668fcb>] ip_rcv_finish+0x2cf/0x2e9
[701453.834622]  [<ffffffff80669218>] ip_rcv+0x233/0x277
[701453.834626]  [<ffffffff8055d1e7>] virtnet_poll+0x4ca/0x5ab
[701453.834628]  [<ffffffff80633952>] net_rx_action+0x70/0x143
[701453.834631]  [<ffffffff8024030a>] __do_softirq+0x83/0x145
[701453.834634]  [<ffffffff8020eb7a>] timer_interrupt+0x1a/0x21
[701453.834637]  [<ffffffff8020d35c>] call_softirq+0x1c/0x28
[701453.834639]  [<ffffffff8020e2c0>] do_softirq+0x3c/0x85
[701453.834641]  [<ffffffff80240021>] irq_exit+0x3f/0x7a
[701453.834643]  [<ffffffff8020e59c>] do_IRQ+0x12b/0x14f
[701453.834646]  [<ffffffff8020cad3>] ret_from_intr+0x0/0x29
[701453.834647]  <EOI>  [<ffffffff80621b29>] vp_notify+0x0/0x1c
[701453.834653]  [<ffffffff804b099e>] __make_request+0x3e2/0x425
[701453.834656]  [<ffffffff804af1ff>] generic_make_request+0x338/0x389
[701453.834660]  [<ffffffff802986ce>] end_swap_bio_write+0x0/0x66
[701453.834664]  [<ffffffff802c6643>] bio_alloc_bioset+0x73/0xff
[701453.834666]  [<ffffffff804af30d>] submit_bio+0xbd/0xc4
[701453.834669]  [<ffffffff8072a52a>] _spin_lock+0x5/0x7
[701453.834672]  [<ffffffff802986c4>] swap_writepage+0x9b/0xa5
[701453.834675]  [<ffffffff80283dc1>] shrink_page_list+0x358/0x5ff
[701453.834677]  [<ffffffff80284319>] shrink_list+0x2b1/0x5d8
[701453.834680]  [<ffffffff802806e8>] determine_dirtyable_memory+0xd/0x1d
[701453.834682]  [<ffffffff8028075e>] get_dirty_limits+0x1d/0x24f
[701453.834685]  [<ffffffff802237c4>] pvclock_clocksource_read+0x3a/0x70
[701453.834688]  [<ffffffff802848bd>] shrink_zone+0x27d/0x325
[701453.834692]  [<ffffffff80231733>] resched_task+0x2a/0x75
[701453.834694]  [<ffffffff80280990>] background_writeout+0x0/0xce
[701453.834696]  [<ffffffff802855c7>] try_to_free_pages+0x1fa/0x32d
[701453.834699]  [<ffffffff80282bea>] isolate_pages_global+0x0/0x231
[701453.834701]  [<ffffffff8027f8c0>] __alloc_pages_internal+0x259/0x401
[701453.834705]  [<ffffffff8027aefb>] find_or_create_page+0x48/0x88
[701453.834707]  [<ffffffff802c2c31>] __getblk+0x117/0x29d
[701453.834711]  [<ffffffff80357f00>] journal_get_descriptor_buffer+0x30/0x76
[701453.834713]  [<ffffffff8035478e>] journal_commit_transaction+0x6da/0xdf0
[701453.834716]  [<ffffffff80244218>] lock_timer_base+0x26/0x4b
[701453.834719]  [<ffffffff8024428f>] try_to_del_timer_sync+0x52/0x5b
[701453.834721]  [<ffffffff8072a484>] _spin_lock_irqsave+0x24/0x2c
[701453.834723]  [<ffffffff803578a0>] kjournald+0xe5/0x214
[701453.834726]  [<ffffffff8024d628>] autoremove_wake_function+0x0/0x2e
[701453.834729]  [<ffffffff803577bb>] kjournald+0x0/0x214
[701453.834731]  [<ffffffff8024d2c7>] kthread+0x47/0x73
[701453.834748]  [<ffffffff8020d25a>] child_rip+0xa/0x20
[701453.834751]  [<ffffffff8024d280>] kthread+0x0/0x73
[701453.834753]  [<ffffffff8020d250>] child_rip+0x0/0x20
[701453.834754] Mem-Info:
[701453.834755] DMA per-cpu:
[701453.834757] CPU    0: hi:    0, btch:   1 usd:   0
[701453.834758] DMA32 per-cpu:
[701453.834760] CPU    0: hi:  186, btch:  31 usd: 165
[701453.834763] Active_anon:674 active_file:43401 inactive_anon:11269
[701453.834764]  inactive_file:53885 unevictable:0 dirty:5182 writeback:70 unstable:0
[701453.834765]  free:749 slab:12132 mapped:9094 pagetables:840 bounce:0
[701453.834768] DMA free:1968kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:1952kB inactive_file:2380kB unevictable:0kB present:5440kB pages_scanned:0 all_unreclaimable? no
[701453.834770] lowmem_reserve[]: 0 489 489 489
[701453.834774] DMA32 free:1028kB min:2812kB low:3512kB high:4216kB active_anon:2696kB inactive_anon:45076kB active_file:171652kB inactive_file:213160kB unevictable:0kB present:500896kB pages_scanned:0 all_unreclaimable? no
[701453.834777] lowmem_reserve[]: 0 0 0 0
[701453.834779] DMA: 78*4kB 7*8kB 12*16kB 8*32kB 10*64kB 4*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1968kB
[701453.834785] DMA32: 1*4kB 0*8kB 0*16kB 0*32kB 10*64kB 3*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1028kB
[701453.834791] 99417 total pagecache pages
[701453.834793] 2101 pages in swap cache
[701453.834794] Swap cache stats: add 8718, delete 6617, find 110037/110217
[701453.834796] Free swap  = 1020652kB
[701453.834797] Total swap = 1048568kB
[701453.836985] 131056 pages RAM
[701453.836987] 4801 pages reserved
[701453.836988] 98664 pages shared
[701453.836990] 34608 pages non-shared
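
As a sanity check on the Mem-Info numbers above, here is a small sketch in Python (not part of the kernel output, and the names `BUDDY_LINES`, `WATERMARKS_KB` and `summarize` are made up for illustration). It re-adds the per-order free lists for each zone and compares the reported `free` value with the `min` watermark. All figures are copied by hand from the dump. It only shows what the dump says and is not a diagnosis of the failure.

```python
import re

# Free-list lines copied from the Mem-Info dump above (joined onto one line).
BUDDY_LINES = {
    "DMA": "78*4kB 7*8kB 12*16kB 8*32kB 10*64kB 4*128kB 0*256kB "
           "0*512kB 0*1024kB 0*2048kB 0*4096kB = 1968kB",
    "DMA32": "1*4kB 0*8kB 0*16kB 0*32kB 10*64kB 3*128kB 0*256kB "
             "0*512kB 0*1024kB 0*2048kB 0*4096kB = 1028kB",
}

# "free" and "min" values copied from the per-zone lines of the same dump.
WATERMARKS_KB = {
    "DMA": {"free": 1968, "min": 28},
    "DMA32": {"free": 1028, "min": 2812},
}


def buddy_total_kb(line):
    """Sum count * block size over every 'N*SkB' entry, in kB."""
    return sum(int(n) * int(size)
               for n, size in re.findall(r"(\d+)\*(\d+)kB", line))


def reported_total_kb(line):
    """Return the total the kernel printed after the '=' sign, in kB."""
    return int(re.search(r"=\s*(\d+)kB", line).group(1))


def summarize():
    """Per zone: recomputed total, printed total, and whether free < min."""
    result = {}
    for zone, line in BUDDY_LINES.items():
        marks = WATERMARKS_KB[zone]
        result[zone] = {
            "computed_kb": buddy_total_kb(line),
            "reported_kb": reported_total_kb(line),
            "below_min": marks["free"] < marks["min"],
        }
    return result


if __name__ == "__main__":
    for zone, info in summarize().items():
        print(zone, info)
```

With the values from this dump, the recomputed totals match the printed ones, and DMA32 is the only zone whose free memory is below its min watermark, almost all of it sitting in 64kB and 128kB blocks.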




Antoine Martin wrote:
> Hi,
> 
> The bug report below does indeed match everything I have experienced.
> Upon further inspection, 2.6.28.9 is also affected, just less so.
> 
> Unfortunately I have applied the patch:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0
> And if anything, it made things worse.
> 
> Cheers
> Antoine
> 
> 
> 
> Mark McLoughlin wrote:
>> On Sun, 2009-04-19 at 14:48 +0300, Avi Kivity wrote:
>>> Antoine Martin wrote:
>>>> Wireshark was showing a huge amount of invalid packets (wrong checksum)
>>>> - that was the cause of the slowdown.
>>>> Simply rebooting the host into 2.6.28.9 fixed *everything*, regardless
>>>> of whether the guests use virtio or ne2k_pci/etc.
>>>> The guests are still running 2.6.29.1, but I am not likely to try that
>>>> release again on the host anytime soon! Ouch!
>>>>   
>>> Strange, no significant tun changes between .28 and .29.
>> Sounds to me like it's this:
>>
>>   
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2f181855a0
>>
>> davem said he was queueing up for stable, but it's not in yet:
>>
>>   http://kerneltrap.org/mailarchive/linux-netdev/2009/3/30/5337934
>>
>> I'll check that it's in the queue.
>>
>> Cheers,
>> Mark.
>>
> 
