On Mon, Oct 14, 2013 at 4:06 PM, Saso Kiselkov <[email protected]> wrote:

> On 10/14/13 11:52 PM, Matthew Ahrens wrote:
> >     What happened was that the kernel address space got
> >     heavily fragmented and the ARC wasn't feeling much pressure to
> >     contract, so larger allocations started failing.
> >
> >
> > How did you come to that conclusion?  E.g. how did you measure kernel
> > address space fragmentation?
> >
> > I wonder what exactly the OP meant by "VM slab allocator is quite
> > fragmented".  It may mean that there are many buffers not in use (e.g.
> > as measured by ::kmastat's "buf total" - "buf in use").  This does not
> > indicate virtual address space fragmentation.
>
> I came to this conclusion after analyzing the above-noted deadlock
> situation. Also, I misspoke as to my hypothesis about what went wrong. This
> was recorded on a 64-bit machine that had fragmented *physical* memory,
> and any attempt to KM_SLEEP-allocate a larger buffer resulted in a
> deadlock (due to the chain I outlined previously). That having been
> said, I will freely admit that I haven't seen this happen in the field
> myself, and I could be quite wrong about large memory allocations becoming
> a problem.
>

I'm not super familiar with the VM subsystem.  Can you explain where
physical memory fragmentation comes into play?  It looks to me like
segkmem_xalloc() will call segkmem_page_create() (it's the
page_create_func), which calls page_create_va(), which looks for N physical
pages, but they need not be contiguous.
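
To make sure I'm reading that path right, here is a rough sketch of it in
C (simplified signatures; back_with_pages() is a hypothetical stand-in for
the page_create_func / mapping step, not a real illumos function):

#include <sys/vmem.h>

/* Hypothetical stand-in for the segkmem_page_create()/page_create_va() step. */
static int back_with_pages(void *va, size_t size, int vmflag);

static void *
segkmem_alloc_sketch(vmem_t *arena, size_t size, int vmflag)
{
	/*
	 * Step 1: carve a span of kernel *virtual* address space out of the
	 * vmem arena.  This is where the quoted stack below is blocked
	 * (vmem_alloc() -> vmem_xalloc() -> cv_wait()).
	 */
	void *va = vmem_alloc(arena, size, vmflag);

	if (va == NULL)
		return (NULL);

	/*
	 * Step 2: back the span with physical pages.  page_create_va()
	 * gathers size / PAGESIZE individual pages; they do not have to be
	 * physically contiguous, so physical fragmentation alone should not
	 * make this step fail.
	 */
	if (back_with_pages(va, size, vmflag) != 0) {
		vmem_free(arena, va, size);
		return (NULL);
	}
	return (va);
}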

The stack trace from the message you mentioned looks like it hasn't gotten
to allocating physical memory yet.  It looks like it is trying to allocate
virtual address space (segkmem_xalloc() calling vmem_alloc()), which would
seem to indicate that the kernel's virtual address space *is* very
fragmented, even on 64-bit.  This is a little surprising to me, but as I
mentioned, I'm not a VM expert.

--matt

For reference, the stack mentioned in the email you cited:

ffffff00f4a18f60 cv_wait+0x61(ffffff21b585b01e, ffffff21b585b020)
ffffff00f4a190a0 vmem_xalloc+0x635(ffffff21b585b000, 20000, 1000, 0, 0, 0,
    0, 4)
ffffff00f4a19100 vmem_alloc+0x161(ffffff21b585b000, 20000, 4)
ffffff00f4a19190 segkmem_xalloc+0x90(ffffff21b585b000, 0, 20000, 4, 0,
    fffffffffb88d2a8, fffffffffbceddc0)
ffffff00f4a191f0 segkmem_alloc_vn+0xcd(ffffff21b585b000, 20000, 4,
    fffffffffbceddc0)
ffffff00f4a19220 segkmem_zio_alloc+0x24(ffffff21b585b000, 20000, 4)
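
(If it helps to confirm that interpretation, a quick sanity check from code
is vmem_size() on the arena in question.  The sketch below leaves the arena
pointer and the conclusion to the caller; `::vmem' in mdb reports the same
counters.)

#include <sys/vmem.h>

/*
 * Sketch: how much free virtual address space does the arena still have?
 * Note this can only rule an allocation out: even when va_free >= want,
 * a single free span of `want' bytes (0x20000 in the stack above) may not
 * exist if the arena is fragmented.
 */
static int
arena_can_possibly_satisfy(vmem_t *arena, size_t want)
{
	size_t va_free = vmem_size(arena, VMEM_FREE);

	return (va_free >= want);
}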

> > I see your fundamental premise as:
> >
> > In some situations, a single large memory allocation may fail, but
> > several smaller allocations (totaling the same amount of memory) would
> > succeed.  This situation may occur when the zfs kernel module is loaded.
> >
> > Would you agree?
>
> Pretty much, yes.
>
> > To me, this premise is plausible in a virtual
> > address-constrained system (e.g. 32-bit), if the kernel module is loaded
> > some time after booting (e.g. on Linux).  Are you addressing a 32-bit
> > only problem, or do you contend that the problem also exists on 64-bit
> > systems?
>
> My initial impression was that this problem might exist on 64-bit as
> well, given the above bug description. Simply pushing the memory request
> out to some other part of the address space may not help if there is not
> *physically* a large enough contiguous chunk available. Yes, the memory
> manager can swap stuff out or move stuff around to fulfill it, but only if
> the request is KM_SLEEP (or from userspace). Perhaps simply switching over
> to a KM_SLEEP allocation might solve the problem (though I'd be wary of
> trying to allocate 1GB+ in a single request in any case).
>
> Cheers,
> --
> Saso
>
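
For concreteness, a minimal sketch of the premise we're debating (illumos
kmem interfaces; the 128K chunk size and the fallback policy are purely
illustrative assumptions, not something anyone has proposed here): a single
large KM_NOSLEEP request can fail while the same total is still obtainable
as smaller pieces.

#include <sys/kmem.h>
#include <sys/errno.h>

#define	CHUNK_SZ	(128 * 1024)	/* arbitrary illustrative size */

static int
alloc_large_or_scattered(size_t total)
{
	void **chunks;
	size_t nchunks, i;
	void *buf;
	int ok;

	/* First try: one large, non-sleeping allocation. */
	buf = kmem_alloc(total, KM_NOSLEEP);
	if (buf != NULL) {
		kmem_free(buf, total);
		return (0);
	}

	/*
	 * The large request failed; the same amount of memory may still be
	 * available as smaller pieces.
	 */
	nchunks = (total + CHUNK_SZ - 1) / CHUNK_SZ;
	chunks = kmem_zalloc(nchunks * sizeof (void *), KM_SLEEP);

	for (i = 0; i < nchunks; i++) {
		chunks[i] = kmem_alloc(CHUNK_SZ, KM_NOSLEEP);
		if (chunks[i] == NULL)
			break;
	}
	ok = (i == nchunks);

	/* Free whatever was obtained (the failed slot, if any, is skipped). */
	while (i-- > 0)
		kmem_free(chunks[i], CHUNK_SZ);
	kmem_free(chunks, nchunks * sizeof (void *));

	return (ok ? 0 : ENOMEM);
}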