Hi Krzysztof,

On Monday, 23 February 2026 16:21:48 CET Krzysztof Karas wrote:
> Migration testing in i915 uses current task's address space to
> allocate new userspace mapping, without registering real user
> for that address space in mm_struct. Since PCI probe executes in
> another process, "current" is unknown at the time the selftest
> run [1].

I think that statement is too general.  PCI probe usually executes the
driver's .probe function within the current kernel thread, though it 
*may* decide to pass its execution to a kworker if it detects the probed 
device's resources belong to a different NUMA cell in a multi-cell NUMA 
system.  Do we know other conditions when that may happen?  Please 
specify all such conditions we know about instead of potentially 
suggesting that the delegation to a kworker is unconditional.

Thanks,
Janusz

> 
> It was observed that mm->mm_users would occasionally be 0
> or drop to 0 during the test, which reaped userspace mappings,
> further leading to failures upon reading from userland memory.
> 
> Prevent this by waiting for usable address space and
> artificially increasing its mm_users counter for the duration
> of the test.
> 
> [1]:
> IGT makes use of selftest on module load mechanism in i915:
>  1) IGT recognizes arguments and passes them to libkmod while in
>   userspace;
>  2) libkmod forms and executes a syscall simulating a
>   modprobe command, thus moving to kernel context;
>  3) PCI code puts local_pci_probe() call onto a workqueue via
>   work_on_cpu() macro in pci_call_probe(). Below backtrace shows
>   call order between syscall and work_on_cpu() call:
> 
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0xc1/0xf0
>  dump_stack+0x10/0x20
>  pci_device_probe+0x205/0x280
>  really_probe+0xf1/0x410
>  __driver_probe_device+0x8c/0x190
>  driver_probe_device+0x24/0xd0
>  __driver_attach+0x10f/0x240
>  ? __pfx___driver_attach+0x10/0x10
>  bus_for_each_dev+0x7f/0xe0
>  driver_attach+0x1e/0x30
>  bus_add_driver+0x163/0x2a0
>  driver_register+0x5e/0x130
>  __pci_register_driver+0x80/0xa0
>  i915_pci_register_driver+0x23/0x30 [i915]
>  i915_init+0x37/0x120 [i915]
>  ? __pfx_i915_init+0x10/0x10 [i915]
>  do_one_initcall+0x5e/0x3a0
>  do_init_module+0x97/0x2b0
>  load_module+0x2d89/0x2e90
>  ? kernel_read_file+0x2b1/0x320
>  init_module_from_file+0xf4/0x120
>  ? init_module_from_file+0xf4/0x120
>  idempotent_init_module+0x117/0x330
>  __x64_sys_finit_module+0x73/0xf0
>  x64_sys_call+0x1d68/0x26b0
>  do_syscall_64+0x93/0x1470
>  ? do_syscall_64+0x1e4/0x1470
>  ? ksys_lseek+0x4f/0xd0
>  ? do_syscall_64+0x1e4/0x1470
>  ? clear_bhb_loop+0x50/0xa0
>  ? clear_bhb_loop+0x50/0xa0
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> note that pci_device_probe() calls __pci_device_probe(), which
> then calls pci_call_probe() and both are static functions;
>  4) at last, the actual probe is called from kworker and the
>  selftests execute:
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0xc1/0xf0
>  dump_stack+0x10/0x20
>  igt_mmap_migrate+0x302/0x7e0 [i915]
>  __i915_subtests+0xb8/0x250 [i915]
>  ? __pfx___i915_live_teardown+0x10/0x10 [i915]
>  ? __pfx___i915_live_setup+0x10/0x10 [i915]
>  i915_gem_mman_live_selftests+0x103/0x140 [i915]
>  __run_selftests+0xc5/0x220 [i915]
>  i915_live_selftests+0xaa/0x130 [i915]
>  i915_pci_probe+0xee/0x1d0 [i915]
>  local_pci_probe+0x47/0xb0
>  work_for_cpu_fn+0x1a/0x30
>  process_one_work+0x22e/0x6b0
>  worker_thread+0x1e8/0x3d0
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0x11f/0x250
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x344/0x3a0
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1a/0x30
> 
> This operation of putting selftests into a forked process
> creates a short delay in which another userspace process may be
> handled by the scheduler, so IGT process is not the one from
> which kernel borrows the address space.
> 
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14204
> Fixes: 34b1c1c71d37 ("i915/selftest/igt_mmap: let mmap tests run in 
kthread")
> Signed-off-by: Krzysztof Karas <[email protected]>
> ---
> This has previously been a regular patch, but after many
> deliberations and discussions with i915 team members we decided
> that this should be made into a RFC, as this may still be an
> incomplete solution.
> 
> This change is best effort to increase reliability of the
> selftest. There will still be instances, where there is no
> suitable address space available, because we rely on operating
> system to be in a certain state, which we probably should not
> force by ourselves. In this case, it would be sufficient to have
> this test pass most of the time and silently skip if it cannot
> execute safely.
> 
>  .../drm/i915/gem/selftests/i915_gem_mman.c    | 44 ++++++++++++++++---
>  1 file changed, 39 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/
gpu/drm/i915/gem/selftests/i915_gem_mman.c
> index 9d454d0b46f2..ccb00cd5e750 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
> @@ -38,6 +38,8 @@ struct tile {
>       unsigned int swizzle;
>  };
>  
> +bool skip_vma = true;
> +
>  static u64 swizzle_bit(unsigned int bit, u64 offset)
>  {
>       return (offset & BIT_ULL(bit)) >> (bit - 6);
> @@ -903,6 +905,9 @@ static int __igt_mmap(struct drm_i915_private *i915,
>       int err, i;
>       u64 offset;
>  
> +     if (skip_vma)
> +             return 0;
> +
>       if (!can_mmap(obj, type))
>               return 0;
>  
> @@ -1165,6 +1170,9 @@ static int __igt_mmap_migrate(struct 
intel_memory_region **placements,
>       u64 offset;
>       int err;
>  
> +     if (skip_vma)
> +             return 0;
> +
>       obj = __i915_gem_object_create_user(i915, PAGE_SIZE,
>                                           placements,
>                                           n_placements);
> @@ -1847,7 +1855,6 @@ static int igt_mmap_revoke(void *arg)
>  int i915_gem_mman_live_selftests(struct drm_i915_private *i915)
>  {
>       int ret;
> -     bool unuse_mm = false;
>       static const struct i915_subtest tests[] = {
>               SUBTEST(igt_partial_tiling),
>               SUBTEST(igt_smoke_tiling),
> @@ -1860,14 +1867,41 @@ int i915_gem_mman_live_selftests(struct 
drm_i915_private *i915)
>       };
>  
>       if (!current->mm) {
> -             kthread_use_mm(current->active_mm);
> -             unuse_mm = true;
> +             int timeout = 10;
> +             /**
> +              * We want to use current->active_mm, which corresponds to 
the
> +              * address space of a userspace process that was last 
handled by
> +              * scheduler. It is possible that this memory is in the 
middle
> +              * of cleanup indicated by mm_users == 0, in which case 
kernel
> +              * waits until the process is finished to release its 
mm_struct.
> +              * Borrowing that memory at that point is unsafe, because 
it may
> +              * be freed at any point and taking additional references 
to it
> +              * will not change the cleanup behavior.
> +              * Wait for a bit in hopes that another process is taken 
by
> +              * scheduler and has reliable memory for us to map into.
> +              */
> +             while (timeout--) {
> +                     if (mmget_not_zero(current->active_mm)) {
> +                             kthread_use_mm(current->active_mm);
> +                             skip_vma = false;
> +                             break;
> +                     }
> +
> +                     msleep(1000);
> +             }
>       }
>  
>       ret = i915_live_subtests(tests, i915);
>  
> -     if (unuse_mm)
> -             kthread_unuse_mm(current->active_mm);
> +     if (!skip_vma) {
> +             /**
> +              * The scheduler was working while the test executed,
> +              * so current->active_mm might have changed. Use the old
> +              * reference to the address space stored in current->mm.
> +              */
> +             mmput_async(current->mm);
> +             kthread_unuse_mm(current->mm);
> +     }
>  
>       return ret;
>  }
> 




Reply via email to