On Sunday 12 February 2006 16:58, Anthony Brock wrote:
> I recently downloaded and started using kernel 2.6.15.3 with the bs2
> patches from BlaisorBlade's website. Most of my instances continued to run
> as normal, however I started experiencing outages on a couple of them this
> morning. Specifically, two of the instances stopped responding to network
> and console input. They also ignored 'cad' commands from the uml_mconsole
> utility. A third dumped the following (10 times) to its console immediately
> before it started ignoring input:
>
> swapper: page allocation failure. order:0, mode:0x20
> 0821f2c8: [<0808a2c3>] __alloc_pages+0x274/0x286
> 0821f308: [<0808c8b5>] kmem_getpages+0x4a/0x9f
> 0821f31c: [<0808d384>] cache_grow+0x96/0x122
> 0821f350: [<0808d563>] cache_alloc_refill+0x153/0x186
> 0821f374: [<0808d7ac>] __kmalloc+0x5b/0x6d
> 0821f390: [<081717a0>] __alloc_skb+0x52/0x129
> 0821f3ac: [<080609ba>] uml_net_rx+0x1e/0x13c
> 0821f3cc: [<08060afc>] uml_net_interrupt+0x1f/0x91
> 0821f410: [<08085a2b>] handle_IRQ_event+0x24/0x54
> 0821f434: [<08085aae>] __do_IRQ+0x53/0x91
> 0821f448: [<08057bc8>] do_IRQ+0x20/0x2c
> 0821f450: [<08057d5d>] sigio_handler+0xa5/0xcf
> 0821f468: [<0805e485>] sig_handler_common_skas+0xa5/0xbe
> 0821f48c: [<0806798e>] sig_handler+0xe/0x11
> 0821f5b8: [<0805a555>] change_signals+0x30/0x50
> 0821f5e0: [<0805a694>] set_signals+0xa4/0xc0
> 0821f688: [<0805a694>] set_signals+0xa4/0xc0
> 0821f6d0: [<0805a555>] change_signals+0x30/0x50
> 0821f71c: [<0805a555>] change_signals+0x30/0x50
> 0821f730: [<0805a555>] change_signals+0x30/0x50
> 0821f758: [<0805a694>] set_signals+0xa4/0xc0
> 0821f7c0: [<08066632>] file_io+0x19/0xa7
> 0821f7d4: [<0805a513>] change_sig+0x44/0x56
> 0821f814: [<0805a555>] change_signals+0x30/0x50
> 0821f8c8: [<08072ba4>] __do_softirq+0x34/0x8d
> 0821f8dc: [<08072c2c>] do_softirq+0x2f/0x39
> 0821f8e8: [<08072cc7>] irq_exit+0x2c/0x2e
> 0821f8f0: [<0805b1ae>] timer_handler+0x2e/0x49
> 0821f904: [<0805e485>] sig_handler_common_skas+0xa5/0xbe
> 0821f920: [<0805e091>] start_kernel_proc+0x0/0x2a
> 0821f928: [<080679b8>] alarm_handler+0x27/0x3b
> 0821f950: [<0805e091>] start_kernel_proc+0x0/0x2a
> 0821fa18: [<0805dde5>] new_thread_handler+0x0/0xa7
> 0821fa28: [<0805d87e>] new_thread+0x61/0x69
> 0821fabc: [<08056336>] _einittext+0x1360/0x1956
> 0821fad0: [<08055590>] _einittext+0x5ba/0x1956
> 0821fad8: [<08056045>] _einittext+0x106f/0x1956
> 0821fb10: [<0805d721>] switch_threads+0x36/0x3d
> 0821fb38: [<0805a555>] change_signals+0x30/0x50
> 0821fbec: [<081d38b7>] schedule+0x411/0x464
> 0821fc14: [<0805aeb9>] idle_sleep+0x1d/0x21
> 0821fc28: [<080591fa>] default_idle+0x43/0x46
> 0821fc34: [<0805e08e>] init_idle_skas+0x20/0x23
> 0821fc40: [<080494cd>] start_kernel+0x166/0x16a
> 0821fc50: [<0805e0b7>] start_kernel_proc+0x26/0x2a
> 0821fc58: [<080678ae>] run_kernel_thread+0x30/0x3b
> 0821fc68: [<0805e091>] start_kernel_proc+0x0/0x2a
> 0821fc74: [<08067898>] run_kernel_thread+0x1a/0x3b
> 0821fd00: [<0805de64>] new_thread_handler+0x7f/0xa7
> 0821fd04: [<0805e091>] start_kernel_proc+0x0/0x2a
>
> Mem-info:
> DMA per-cpu:
> cpu 0 hot: low 0, high 18, batch 3 used:2
> cpu 0 cold: low 0, high 6, batch 1 used:0
> DMA32 per-cpu: empty
> Normal per-cpu: empty
> HighMem per-cpu: empty
> Free pages: 384kB (0kB HighMem)
> Active:4946 inactive:8456 dirty:1087 writeback:3842 unstable:0 free:96
> slab:1601 mapped:8340 pagetables:98
> DMA free:384kB min:1024kB low:1280kB high:1536kB active:19784kB
> inactive:33824kB present:65536kB pages_scanned:33 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB
> 0*2048kB 0*4096kB = 384kB
> DMA32: empty
> Normal: empty
> HighMem: empty
> Swap cache: add 655, delete 23, find 0/0, race 0+0
> Free swap = 128440kB
> Total swap = 131060kB
> Free swap: 128440kB
> 16384 pages of RAM
> 0 pages of HIGHMEM
> 952 reserved pages
> 13192 pages shared
> 632 pages swap cached
>
> This instance is on an older SuSE 9.1 image. However, the other instances
> were Debian. Also, the host kernel is 2.6.13.2-skas3-v8.2.
> I've downgraded the guest instances to 2.6.15.1-bs1 which still appears
> to be stable.

Could you upgrade them to 2.6.15.3-bs1, to verify that it is still safe too?

> Any ideas? Would further information help? If needed, I can make the guest
> kernel configuration and the other 9 copies of the error message available.

I cannot work on this right now as I'm really busy; I'll try to do that after
Wednesday, when I take my last exam for this session. However, I've written
below a few ideas to start identifying what may be the cause of this bug.

Since it's complaining that it can't get even a single free page (order:0),
and this started with the new kernel, it looks like a memory leak somewhere.
So, on a hanging (or almost hung) instance, I'd like to see the output of
"proc meminfo" and (especially) "proc slabinfo" from mconsole. Since they
likely won't work once the instance has stopped responding, the instance
should be monitored beforehand. Better yet, run these commands from a console
in the guest, say like this:

  (i=0; while :; do cat /proc/meminfo; echo; cat /proc/slabinfo; echo -e "\t$i"; let i++; sleep 60; done)

Meanwhile, I'm forwarding this to Jeff Dike, as all the newly added patches
have just been merged by him to mainline. Maybe the bug is triggered by my
backport (I did a selection of what was merged), maybe not.

Another note: the "ChangeLog" in my "NEWS" page for -bs2 shows which areas
were touched. Since you say only some machines hang, I'd like to understand
why some UML instances hang and others don't. Do they behave differently, so
that only they trigger the introduced bug? I.e., do you know of any
particular behaviour of the programs in the hanging instances?

Thanks
-- 
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade

_______________________________________________
User-mode-linux-user mailing list
User-mode-linux-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-user
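
P.S. Once the monitoring loop has produced a few snapshots, comparing them by eye is tedious. Below is a minimal sketch for spotting the slab caches that keep growing. It assumes that two copies of /proc/slabinfo were saved to separate files, that the files use the usual layout (cache name in column 1, active object count in column 2), and that cache names contain no spaces. The name slab_growth is made up for this example and is not part of any UML tool.

```shell
#!/bin/sh
# slab_growth OLD NEW
# Hypothetical helper: list the slab caches whose active object count
# (column 2 of /proc/slabinfo) is larger in NEW than in OLD, biggest
# growth first. Output format: "<growth><TAB><cache name>".
slab_growth() {
    awk '
        # Skip the header lines ("slabinfo - version: ..." and "# name ...").
        /^slabinfo/ || /^#/ { next }
        # First file: remember the active object count of every cache.
        NR == FNR { old[$1] = $2 + 0; next }
        # Second file: report only the caches that grew.
        ($1 in old) && ($2 + 0) > old[$1] {
            printf "%d\t%s\n", ($2 + 0) - old[$1], $1
        }
    ' "$1" "$2" | sort -rn
}

# When run as a script with two file arguments, compare them directly.
if [ "$#" -eq 2 ]; then
    slab_growth "$1" "$2"
fi
```

For instance, after saving one copy early and one shortly before the hang, "slab_growth slabinfo.early slabinfo.late" lists the candidates. A cache near the top that never shrinks between snapshots is a reasonable place to start looking, though growth alone does not prove a leak.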