On Tue, Aug 26, 2025 at 03:43:55PM +0000, Jonathan Cavitt wrote:
> Add error injection support to the function
> vm_bind_ioctl_ops_lock_and_prep.  This necessitates marking the function
> as noinline.
> 
> Signed-off-by: Jonathan Cavitt <jonathan.cav...@intel.com>
> Cc: Matthew Brost <matthew.br...@intel.com>
> Cc: John Harrison <john.c.harri...@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_vm.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 1a8841116e40..e527c90b8c33 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -3201,9 +3201,10 @@ static int vm_bind_ioctl_ops_prefetch_ranges(struct 
> xe_vm *vm, struct xe_vma_ops
>       return 0;
>  }
>  
> -static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec,
> -                                        struct xe_vm *vm,
> -                                        struct xe_vma_ops *vops)
> +static noinline int

Ideally in [1] we'd have something like this:

#ifdef CONFIG_FUNCTION_ERROR_INJECTION
#define error_injectable noinline
#else
#define error_injectable
#endif

That might take sometime to get through, but in the meantine can we
define something on the Xe side for this?

[1] 
https://elixir.bootlin.com/linux/v6.16.3/source/include/asm-generic/error-injection.h

> +vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec,
> +                             struct xe_vm *vm,
> +                             struct xe_vma_ops *vops)
>  {
>       struct xe_vma_op *op;
>       int err;
> @@ -3220,6 +3221,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct 
> drm_exec *exec,
>  
>       return 0;
>  }
> +ALLOW_ERROR_INJECTION(vm_bind_ioctl_ops_lock_and_prep, ERRNO);
>  

We absolutely need the same injection points which are removed in patch
#1 + an IGT with the same coverage. Please add similar points. VM binds
are a deep software pipeline that can be unwound at any point of
failure. Different injection points trigger various unwind flows that
need to be tested. It took me a long time to get this right, is easy to
break, so we need good testing in place.

Matt

>  static void op_trace(struct xe_vma_op *op)
>  {
> -- 
> 2.43.0
> 

Reply via email to