> On 30 Dec 2014, at 22:49, Mark Kettenis <[email protected]> wrote:
>
>> From: David Gwynne <[email protected]>
>> Date: Tue, 30 Dec 2014 21:53:46 +1000
>>
>>> On 29 Dec 2014, at 7:55 am, Mark Kettenis <[email protected]> wrote:
>>>
>>>> Date: Sun, 28 Dec 2014 20:56:33 +0100 (CET)
>>>> From: Mark Kettenis <[email protected]>
>>>>
>>>> P.S. I would not be surprised if this would "fix" the problems landisk
>>>> has with the "* 8" pool diff. Currently updating my landisk to
>>>> verify.
>>>
>>> And it seems this does fix the landisk problem. I strongly suspect
>>> there is some cache nastyness going on.
>>
>> i just got this with your diff + my pool garbage collector diff:
>>
>> uvm_fault(0xffffffff819042c0, 0x0, 0, 2) -> e
>> kernel: page fault trap, code=0
>> Stopped at km_free+0x72: movq $0,0(%rax)
>> ddb{4}> tr
>> km_free() at km_free+0x72
>> pool_large_free() at pool_large_free+0x7e
>> pool_allocator_free() at pool_allocator_free+0x30
>> pool_p_free() at pool_p_free+0xd3
>> pool_gc_pages() at pool_gc_pages+0xc9
>> taskq_thread() at taskq_thread+0x3e
>> end trace frame: 0x0, count: -6
>>
>> ive been running the gc in production for a week so far and havent
>> hit this, so i think its new with your diff. however, i havent
>> looked into it yet so i could be wrong.
>
> Hmm, I guess you have a pool that has dma constraints and uses in-page
> headers? That would actually prevent the direct map from being used,
> but km_free would not realize this, and try to unmap the direct
> mapping.
>
> Arguably that bug is already present in the current code. We have two
> options:
>
> 1. Make km_free(9) respect the kv_align attribute and skip the direct
> mapping code path. Should work, since the pool allocator code is
> careful enough to set the alignment for the km_free() calls. Would
> make things slightly fragile since people might forget though. But
> perhaps it is a good to have things crash in that case.
>
> 2. Align the physical pages such that mapping them directly produces
> the desired alignment. Increases the risk of allocations failing.
> Probably not an issue for smallish alignment values. This may be
> our only option if we don't find the root cause of the landisk
> issue. Obviously this has the benefit of using the direct map in
> more cases than we do now.
>
> The diff below implements #2.
despite (at least one) landisk not liking the *8 change in pools, wouldnt there
still be value in this diff?
dlg
>
>
> Index: uvm_km.c
> ===================================================================
> RCS file: /home/cvs/src/sys/uvm/uvm_km.c,v
> retrieving revision 1.123
> diff -u -p -r1.123 uvm_km.c
> --- uvm_km.c 17 Dec 2014 06:58:11 -0000 1.123
> +++ uvm_km.c 30 Dec 2014 12:18:54 -0000
> @@ -809,11 +809,9 @@ km_alloc(size_t sz, const struct kmem_va
> struct pglist pgl;
> int mapflags = 0;
> vm_prot_t prot;
> + paddr_t pla_align;
> int pla_flags;
> int pla_maxseg;
> -#ifdef __HAVE_PMAP_DIRECT
> - paddr_t pa;
> -#endif
> vaddr_t va, sva;
>
> KASSERT(sz == round_page(sz));
> @@ -828,46 +826,33 @@ km_alloc(size_t sz, const struct kmem_va
> if (kp->kp_zero)
> pla_flags |= UVM_PLA_ZERO;
>
> + pla_align = kp->kp_align;
> +#ifdef __HAVE_PMAP_DIRECT
> + if (pla_align < kv->kv_align)
> + pla_align = kv->kv_align;
> +#endif
> pla_maxseg = kp->kp_maxseg;
> if (pla_maxseg == 0)
> pla_maxseg = sz / PAGE_SIZE;
>
> if (uvm_pglistalloc(sz, kp->kp_constraint->ucr_low,
> - kp->kp_constraint->ucr_high, kp->kp_align, kp->kp_boundary,
> + kp->kp_constraint->ucr_high, pla_align, kp->kp_boundary,
> &pgl, pla_maxseg, pla_flags)) {
> return (NULL);
> }
>
> #ifdef __HAVE_PMAP_DIRECT
> - if (kv->kv_align)
> - goto alloc_va;
> -#if 1
> /*
> - * For now, only do DIRECT mappings for single page
> - * allocations, until we figure out a good way to deal
> - * with contig allocations in km_free.
> + * Only use direct mappings for single page or single segment
> + * allocations.
> */
> - if (!kv->kv_singlepage)
> - goto alloc_va;
> -#endif
> - /*
> - * Dubious optimization. If we got a contig segment, just map it
> - * through the direct map.
> - */
> - TAILQ_FOREACH(pg, &pgl, pageq) {
> - if (pg != TAILQ_FIRST(&pgl) &&
> - VM_PAGE_TO_PHYS(pg) != pa + PAGE_SIZE)
> - break;
> - pa = VM_PAGE_TO_PHYS(pg);
> - }
> - if (pg == NULL) {
> + if (kv->kv_singlepage || kp->kp_maxseg == 1) {
> TAILQ_FOREACH(pg, &pgl, pageq) {
> - vaddr_t v;
> - v = pmap_map_direct(pg);
> + va = pmap_map_direct(pg);
> if (pg == TAILQ_FIRST(&pgl))
> - va = v;
> + sva = va;
> }
> - return ((void *)va);
> + return ((void *)sva);
> }
> #endif
> alloc_va:
> @@ -947,28 +932,35 @@ km_free(void *v, size_t sz, const struct
> struct vm_page *pg;
> struct pglist pgl;
>
> - sva = va = (vaddr_t)v;
> - eva = va + sz;
> + sva = (vaddr_t)v;
> + eva = sva + sz;
>
> - if (kp->kp_nomem) {
> + if (kp->kp_nomem)
> goto free_va;
> - }
>
> - if (kv->kv_singlepage) {
> #ifdef __HAVE_PMAP_DIRECT
> - pg = pmap_unmap_direct(va);
> - uvm_pagefree(pg);
> + if (kv->kv_singlepage || kp->kp_maxseg == 1) {
> + TAILQ_INIT(&pgl);
> + for (va = sva; va < eva; va += PAGE_SIZE) {
> + pg = pmap_unmap_direct(va);
> + TAILQ_INSERT_TAIL(&pgl, pg, pageq);
> + }
> + uvm_pglistfree(&pgl);
> + return;
> + }
> #else
> + if (kv->kv_singlepage) {
> struct uvm_km_free_page *fp = v;
> +
> mtx_enter(&uvm_km_pages.mtx);
> fp->next = uvm_km_pages.freelist;
> uvm_km_pages.freelist = fp;
> if (uvm_km_pages.freelistlen++ > 16)
> wakeup(&uvm_km_pages.km_proc);
> mtx_leave(&uvm_km_pages.mtx);
> -#endif
> return;
> }
> +#endif
>
> if (kp->kp_pageable) {
> pmap_remove(pmap_kernel(), sva, eva);