On Tue, Oct 05, 2004 at 11:43:34AM -0700, Liam Helmer wrote:
> I've actually been trying to track down a bug with 2.6.8(.1), vserver
> 1.9.2, and squashfs. I'm getting a page allocation failure as well, and
> occasional (temporary until reboot) corruption of a squashfs
> filesystem. I was trying to track it down with Philip, who built
> squashfs.
> 
> The basic issue is that after allocating a lot of cache to a squashfs
> filesystem, the kernel is unable to allocate any more RAM for new
> squashfs filesystems. It's basically acting as though it's running out
> of allocatable kernel memory (which squashfs has to allocate in 64k
> contiguous chunks). His suggestion was that it was a case of high memory
> fragmentation. However, this was on a box with 512MB of RAM, after
> running it in maintenance mode (without any running processes), and
> simply filling up the cache. I was getting a page allocation failure,
> order 4, at this point.
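(For reference: with 4 KiB pages, "order:4" means a request for 2^4 = 16
physically contiguous pages, i.e. exactly one 64 KiB chunk, and mode 0xd0
is GFP_KERNEL on 2.6.8. The sketch below is purely illustrative -- a
hypothetical helper, not the actual squashfs source -- it just shows the
kind of kernel allocation that ends up asking the buddy allocator for an
order-4 block.)

/* Illustrative sketch only, not squashfs code: a 64 KiB kmalloc() must
 * be backed by one physically contiguous run of pages, so the slab
 * layer asks the buddy allocator for an order-4 block
 * (2^4 pages * 4 KiB = 64 KiB).  Once memory is fragmented, no such
 * run may be left even though plenty of RAM is free overall. */
#include <linux/slab.h>
#include <linux/gfp.h>

static void *alloc_64k_block(void)
{
	/* GFP_KERNEL corresponds to the mode:0xd0 seen in the trace below */
	void *buf = kmalloc(64 * 1024, GFP_KERNEL);

	if (!buf)
		return NULL;	/* this is the "order:4" failure case */
	return buf;
}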

> Herbert: Is there anything about the way that memory is allocated in
> vserver, especially when caching data from disk (or, potentially, from
> the network, as we're seeing here) that prevents that memory from being
> reused elsewhere? I remember you fixed an issue in 1.9.1rc4 where
> filesystem cache for squashfs was being marked for a particular XID,
> and thus if a file was first accessed in a vserver, then other vservers
> were unable to use that file. Could there be anything similar here?

not that I'm aware of, but memory fragmentation
is a general problem (see lkml and Marcelo working on
that stuff), so I'd suggest trying with a vanilla
kernel (no vserver patches) and waiting until the
order-4 allocation fails (which is probably after
a few hours of stress testing)
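one way to watch for that is /proc/buddyinfo -- below is a minimal
userspace sketch, assuming the 2.6 format "Node N, zone NAME c0 c1 ...":
it reports how many free order-4 (16-page, 64 KiB) blocks each zone
still has, which is the pool the failing squashfs allocation draws from
(the file name and parsing are illustrative; check the format against
your kernel)

/* buddywatch.c -- illustrative sketch; build with: gcc -o buddywatch buddywatch.c */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/buddyinfo", "r");
	char line[512];

	if (!f) {
		perror("/proc/buddyinfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		char zone[32];
		unsigned long c[11];
		int node, n;

		/* per zone: free block counts for orders 0..10 */
		n = sscanf(line,
			   "Node %d, zone %31s %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu %lu",
			   &node, zone, &c[0], &c[1], &c[2], &c[3], &c[4],
			   &c[5], &c[6], &c[7], &c[8], &c[9], &c[10]);
		if (n >= 7)	/* need at least up to the order-4 count */
			printf("node %d zone %-8s free order-4 blocks: %lu\n",
			       node, zone, c[4]);
	}
	fclose(f);
	return 0;
}

running it periodically while filling the cache should show the
order-4 pool shrinking towards zero right before the failure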

anyway, I'll revisit the vserver code in the next
few days and will have a deep look for possible
connections ...

HTH,
Herbert

> Here's the oops that I'm getting:
> 
> mount: page allocation failure. order:4, mode:0xd0
>  [<c0141153>] __alloc_pages+0x243/0x350
>  [<c014127f>] __get_free_pages+0x1f/0x40
>  [<c014448f>] kmem_getpages+0x1f/0xd0
>  [<c014501a>] cache_grow+0xaa/0x160
>  [<c0145236>] cache_alloc_refill+0x166/0x210
>  [<c01456a7>] __kmalloc+0x67/0x90
>  [<c01e9a0b>] squashfs_fill_super+0x69b/0xbe0
>  [<c025f1d9>] snprintf+0x19/0x20
>  [<c0190c91>] disk_name+0x71/0x80
>  [<c015ffd0>] get_sb_bdev+0xf0/0x130
>  [<c0140fcf>] __alloc_pages+0xbf/0x350
>  [<c0141163>] __alloc_pages+0x253/0x350
>  [<c01eb90e>] squashfs_get_sb+0x1e/0x30
>  [<c01e9370>] squashfs_fill_super+0x0/0xbe0
>  [<c01601e8>] do_kern_mount+0x88/0x170
>  [<c01760dd>] do_new_mount+0x9d/0xe0
>  [<c01767d7>] do_mount+0x187/0x1e0
>  [<c0141163>] __alloc_pages+0x253/0x350
>  [<c026002e>] copy_from_user+0x3e/0x70
>  [<c01765fc>] copy_mount_options+0x4c/0xa0
>  [<c0176bbf>] sys_mount+0x9f/0x130
>  [<c0105e77>] syscall_call+0x7/0xb
> SQUASHFS error: Failed to allocate read_page block
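(Annotation: the relevant path in the trace is sys_mount -> do_mount ->
do_kern_mount -> squashfs_get_sb -> get_sb_bdev -> squashfs_fill_super ->
__kmalloc -> cache_grow -> kmem_getpages -> __get_free_pages ->
__alloc_pages, i.e. a mount-time kmalloc() that bottoms out in the
order-4 page request; the remaining entries are stale stack noise. The
sketch below is purely hypothetical -- the struct and function names are
invented for illustration, not taken from the squashfs source -- but it
shows the pattern the error message suggests: each new mount wants one
block-sized (here 64 KiB) contiguous read buffer, so later mounts can
fail once memory is fragmented even though earlier ones succeeded.)

/* hypothetical sketch, not the real squashfs code */
#include <linux/slab.h>
#include <linux/kernel.h>
#include <linux/errno.h>

struct sqfs_mount_bufs {
	void *read_page;		/* holds one uncompressed block */
};

static int sqfs_alloc_read_page(struct sqfs_mount_bufs *m, size_t block_size)
{
	m->read_page = kmalloc(block_size, GFP_KERNEL);	/* 64 KiB => order 4 */
	if (!m->read_page) {
		printk(KERN_ERR "SQUASHFS error: Failed to allocate read_page block\n");
		return -ENOMEM;
	}
	return 0;
}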
> 
> Cheers,
> Liam
> 
> On Tue, 2004-10-05 at 19:16 +0200, Yann.Dupont wrote: 
> > Herbert Poetzl wrote:
> > 
> > >
> > >depends on the hardware, and of course on the usage pattern,
> > >because one instance of postfix will not cause _any_ memory
> > >pressure on a typical system ...
> > >
> > 
> > Yes. The other machine with a 2.6 kernel also has vservers, but with much 
> > lighter usage.
> > 
> > >>Is there a chance there are allocations in the vserver code that can affect 
> > >>this ?
> > >
> > >sure, and it might even be the cause (especially because
> > >it for sure adds to the memory pressure) ...
> > >
> > Ok, so do you think vserver IS the culprit, because of a bug in the vserver 
> > 2.6 memory allocation code? Or (much more probably) does vserver just 
> > trigger the bug because the vservers, in a cumulative effect, are BIG 
> > consumers of memory?
> > In that case, vserver is not the bug owner; it's the 2.6 kernel itself 
> > that has deficiencies here...
> > 
> > Maybe time to buy an opteron system :) Or just try the 2G/2G option I've 
> > already seen discussed on the linux kernel mailing lists...
> > And post to the linux kernel mailing list.
> > 
> > Thanks for your response...
> > 
> > Oh, btw, you may be interested to know that we now have more than 100 
> > vservers deployed... Thanks again (to you & others) for this
> > very nice piece of software...
> 
> -- 
> Liam Helmer <[EMAIL PROTECTED]>
_______________________________________________
Vserver mailing list
[EMAIL PROTECTED]
http://list.linux-vserver.org/mailman/listinfo/vserver
