On Wed, Mar 04, 2015 at 02:55:17PM +0200, Mika Kuoppala wrote:
> If the requested size is less than what the full range
> of pdps can address, we end up setting pdps for only the
> requested area.
> 
> The logical context however needs all pdp entries to be valid.
> Prior to commit 06fda602dbca ("drm/i915: Create page table allocators")
> we have been writing pdp entries with dma address of zero instead
> of valid pdps. This is supposedly bad even if those pdps are not
> addressed.
> 
> As commit 06fda602dbca ("drm/i915: Create page table allocators")
> introduced more dynamic structure for pdps, we ended up oopsing
> when we populated the lrc context. Analyzing this oops revealed
> the fact that we have not been writing valid pdps with bsw, as
> it is doing the ppgtt init with 2GB limit in some cases.
> 
> We should do the right thing and setup the non addressable part
> pdps/pde/pte to scratch page through the minimal structure by
> having just pdp with pde entries pointing to same page with
> pte entries pointing to scratch page.
> 
> But instead of going through that trouble, setup all the pdps
> through individual pd pages and pt entries, even for non
> addressable parts. And let the clear range point them to scratch
> page. This way we populate the lrc with valid pdps and wait
> for dynamic page allocation work to land, and do the heavy lifting
> for truncating page table tree according to usage.
> 
> The regression of oopsing in init was introduced by
> commit 06fda602dbca ("drm/i915: Create page table allocators")
> 
> v2: Clear the range for the unused part also (Ville)
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89350
> Cc: Michel Thierry <[email protected]>
> Cc: Ben Widawsky <[email protected]>
> Cc: Ville Syrjälä <[email protected]>
> Tested-by: Valtteri Rantala <[email protected]>
> Signed-off-by: Mika Kuoppala <[email protected]>

Seems sane enough assuming we're willing to pay the cost.
Reviewed-by: Ville Syrjälä <[email protected]>

And then we could also throw another patch on top to actually increase the
PPGTT size to 4GiB since there's no extra cost anymore :) Well, assuming
everything else is 4GiB safe. But I guess it ought to be if BDW machines
have already been running with 4GiB GGTT/PPGTT.

> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
> b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index bd95776..1aa2a2c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -716,15 +716,19 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, 
> uint64_t size)
>       if (size % (1<<30))
>               DRM_INFO("Pages will be wasted unless GTT size (%llu) is 
> divisible by 1GB\n", size);
>  
> -     /* 1. Do all our allocations for page directories and page tables. */
> -     ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
> +     /* 1. Do all our allocations for page directories and page tables.
> +      * We allocate more than was asked so that we can point the unused parts
> +      * to valid entries that point to scratch page. Dynamic page tables
> +      * will fix this eventually.
> +      */
> +     ret = gen8_ppgtt_alloc(ppgtt, GEN8_LEGACY_PDPES);
>       if (ret)
>               return ret;
>  
>       /*
>        * 2. Create DMA mappings for the page directories and page tables.
>        */
> -     for (i = 0; i < max_pdp; i++) {
> +     for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
>               ret = gen8_ppgtt_setup_page_directories(ppgtt, i);
>               if (ret)
>                       goto bail;
> @@ -744,7 +748,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, 
> uint64_t size)
>        * plugged in correctly. So we do that now/here. For aliasing PPGTT, we
>        * will never need to touch the PDEs again.
>        */
> -     for (i = 0; i < max_pdp; i++) {
> +     for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
>               struct i915_page_directory_entry *pd = 
> ppgtt->pdp.page_directory[i];
>               gen8_ppgtt_pde_t *pd_vaddr;
>               pd_vaddr = kmap_atomic(ppgtt->pdp.page_directory[i]->page);
> @@ -764,9 +768,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, 
> uint64_t size)
>       ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
>       ppgtt->base.cleanup = gen8_ppgtt_cleanup;
>       ppgtt->base.start = 0;
> -     ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * 
> PAGE_SIZE;
>  
> -     ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
> +     /* This is the area that we advertise as usable for the caller */
> +     ppgtt->base.total = max_pdp * GEN8_PDES_PER_PAGE * GEN8_PTES_PER_PAGE * 
> PAGE_SIZE;
> +
> +     /* Set all ptes to a valid scratch page. Also above requested space */
> +     ppgtt->base.clear_range(&ppgtt->base, 0,
> +                             ppgtt->num_pd_pages * GEN8_PTES_PER_PAGE * 
> PAGE_SIZE,
> +                             true);
>  
>       DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d 
> wasted)\n",
>                        ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);
> -- 
> 1.9.1

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to