On Mon, May 18, 2026 at 11:27:46AM +0200, Maarten Lankhorst wrote: > Hey Ville, > > Den 2026-05-11 kl. 23:41, skrev Ville Syrjala: > > From: Ville Syrjälä <[email protected]> > > > > The current assumption that the initial FB offset into stolen and > > GGTT are the same is completely wrong for MTL+. On these platforms > > the GOP always places the FB at start of stolen, but then maps it > > to the top of GGTT. > > > > Read the correct phys_base from the PTE so that we at least take > > over the correct part of the physical memory. > > > > The GGTT offset is more annoying to deal with there. The horrible > > ggtt->start and GUC_GGTT_TOP hacks prevent us from even keeping the > > original GGTT mapping (ggtt->start blocks pre-MTL hardware and > > GUC_GGTT_TOP blcoks MTL+). For now just hack this and remap the > > FB to live at ggtt->start. On MTL+ this might even work correctly > > since we're unlikely to overlap with the original mapping. But on > > earlier platforms we're guaranteed to have an overlap if the FB > > is larger than ggtt->start. Such an overlap will cause visible > > glitches on the screen as the PTEs get overwritten while the > > display hardware is still using them for scanout. > > > > On i915 we don't have the ggtt->start hack and thus can always > > bind the FB to actual start of GGTT. i915 does have the equivalent > > of GUC_GGTT_TOP so it can't leave the mapping to the end of GGTT > > either sadly. > > > > Signed-off-by: Ville Syrjälä <[email protected]> > > --- > > drivers/gpu/drm/xe/display/xe_initial_plane.c | 35 +++++++++++++++---- > > 1 file changed, 28 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/xe/display/xe_initial_plane.c > > b/drivers/gpu/drm/xe/display/xe_initial_plane.c > > index d0a9f8599096..da44f6d1a5f8 100644 > > --- a/drivers/gpu/drm/xe/display/xe_initial_plane.c > > +++ b/drivers/gpu/drm/xe/display/xe_initial_plane.c > > @@ -42,6 +42,7 @@ initial_plane_bo(struct xe_device *xe, > > { > > struct xe_tile *tile0 = xe_device_get_root_tile(xe); > > struct xe_bo *bo; > > + dma_addr_t dma_addr; > > resource_size_t phys_base; > > u32 base, size, flags; > > u64 page_size = xe->info.vram_flags & XE_VRAM_FLAGS_NEED64K ? SZ_64K : > > SZ_4K; > > @@ -64,7 +65,8 @@ initial_plane_bo(struct xe_device *xe, > > return NULL; > > } > > > > - phys_base = pte & ~(page_size - 1); > > + dma_addr = pte & ~(page_size - 1); > > + phys_base = dma_addr; > > > > flags |= XE_BO_FLAG_VRAM0; > > > > @@ -78,10 +80,6 @@ initial_plane_bo(struct xe_device *xe, > > &phys_base); > > return NULL; > > } > > - > > - drm_dbg_kms(&xe->drm, > > - "Using phys_base=%pa, based on initial plane > > programming\n", > > - &phys_base); > > } else { > > struct ttm_resource_manager *stolen; > > u64 pte; > > @@ -99,11 +97,29 @@ initial_plane_bo(struct xe_device *xe, > > return NULL; > > } > > > > - phys_base = base; > > + dma_addr = pte & ~(page_size - 1); > > + phys_base = dma_addr - xe_ttm_stolen_gpu_offset(xe); > > + > > flags |= XE_BO_FLAG_STOLEN; > > } > > > > - bo = xe_bo_create_pin_map_at_novm(xe, tile0, size, phys_base, phys_base, > > + drm_dbg_kms(&xe->drm, > > + "Initial plane dma_addr=%pa phys_base=%pa\n", > > + &dma_addr, &phys_base); > > + > > + /* > > + * Pin to xe_ggtt_start() to avoid conflicting with > > + * the horrible ggtt->start and GUC_GGTT_TOP hacks. > > + * > > + * FIXME this is complete crap. To do this properly we > > + * need to prevent the original PTEs from being overwritten > > + * while bindind to the new address. Any overlap between > > + * the old and new ranges will corrupt the old PTEs that > > + * the display hardware is currently using for scanout. > > + */ > > + base = xe_ggtt_start(tile0->mem.ggtt); > > The comment is mostly accurate in describing your solution. You have all > the pieces to know which part of GGTT are allocatable. Only > xe_ggtt_size() bytes starting at xe_ggtt_start() are allocatable. > This is because xe also supports VF's, which only has a part of the GGTT > usable.
We don't care about VFs here. But ggtt->start != 0 also for the PF for whatever silly reason. > > ggtt->start can be adjusted at runtime when the VF is migrated to a new > location. I'm open for better solutions that are still O(1). > See xe_ggtt_shift_nodes() for some details, or read through the SR-IOV > documentation of the xe module. > > The previous assumption used GGTT address == physical address on all > platforms, which was accurate at the time it was written. Judging from > the comments it now breaks on MTL. It is also broken on older platforms due to that ggtt->start mess. Clearly none of this has ever worked correctly. > > Fortunately, you can derive the exact address of the current allocation. > That makes it easy to handle this correctly. Can you extend > xe_ggtt_insert_node() with a start + end argument, or create > xe_ggtt_insert_node_at() that has those arguments? > > If you then reserve the current allocation in advance, the workaround > that required creating xe_ggtt_insert_bo_at() can then be removed. > > After the BO is mapped, it's safe to call xe_ggtt_node_remove() on the > current location with invalidate set to false. The contents will be > cleared by xe_ggtt_init() later on during the init sequence. If you really want to fix this properly then just get rid of those ggtt->start and GUC_GGTT_TOP hacks and allow the code to just insert the node at the same location where the GOP placed it. Then we wouldn't need any stupid hacks here. -- Ville Syrjälä Intel
