On 07.07.25 20:12, Matthew Brost wrote: >>> Then it occurs to me this looks like a common mistake to make. A little >>> bit of git grep on dma_fence_wait_timeout() quickly finds multiple >>> similar mistakes in drm, at least amdgpu, etnaviv, msm, and tegra. Cc >>> some maintainers FYI. This class of bugs could cause issues elsewhere. >> >> I can only guess but I think all those use cases use a fixed small timeout >> as well. IIRC amdgpu uses a timeout in the millisecond range. >> > > Ah, no. A quick grep shows multiple cases in AMDGPU where long + > MAX_SCHEDULE_TIMEOUT is used: > > - amdgpu_vm_cpu_update > - amdgpu_vm_wait_idle > - amdgpu_bo_kmap > - amdgpu_hmm_invalidate_gfx > - amdgpu_gem_wait_idle_ioctl > > I'm pretty confused by the pushback here—it's among the most confusing > I've ever seen.
Those cases are perfectly ok. long + MAX_SCHEDULE_TIMEOUT is correct. The problem only occurs when you use MAX_SCHEDULE_TIMEOUT and push the return value into an int. > Again, this patch is almost certainly correct. I've made this mistake at > least twice myself. The problem seems to be that i915 is using a small value, while XE is using MAX_SCHEDULE_TIMEOUT. No idea why that difference is, but when you look only at the i915 code it looks unnecessary. Regards, Christian. >>> Let's also Cc Julia and Dan in case they have ideas to improve static >>> analysis to catch this class of bugs. Or maybe one exists already, but >>> we're just not running them! >> >> Yeah, agree. A script which checks if the input timeout could be >32bit and >> the return value is assigned to something <=32bits is probably a really good >> idea. >> > > Yes, I agree. -Wstrict-overflow or -Wconversion in GCC would catch this, > but enabling them causes the core kernel includes to quickly explode. A > static checker would be a good solution here, or we could fix the Linux > kernel so it compiles cleanly with these options. > > Matt > >> Regards, >> Christian. >> >>> >>> Finally, for the actual change, >>> >>> Reviewed-by: Jani Nikula <jani.nik...@intel.com> >>> >>> >>>> Signed-off-by: Aakash Deep Sarkar <aakash.deep.sar...@intel.com> >>>> --- >>>> drivers/gpu/drm/i915/display/intel_display.c | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/display/intel_display.c >>>> b/drivers/gpu/drm/i915/display/intel_display.c >>>> index 456fc4b04cda..7035c1fc9033 100644 >>>> --- a/drivers/gpu/drm/i915/display/intel_display.c >>>> +++ b/drivers/gpu/drm/i915/display/intel_display.c >>>> @@ -7092,7 +7092,8 @@ static void intel_atomic_commit_fence_wait(struct >>>> intel_atomic_state *intel_stat >>>> struct drm_i915_private *i915 = to_i915(intel_state->base.dev); >>>> struct drm_plane *plane; >>>> struct drm_plane_state *new_plane_state; >>>> - int ret, i; >>>> + long ret; >>>> + int i; >>>> >>>> for_each_new_plane_in_state(&intel_state->base, plane, new_plane_state, >>>> i) { >>>> if (new_plane_state->fence) { >>> >>