Re: [Linaro-mm-sig] Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
Am 13.03.23 um 17:43 schrieb Rob Clark:
> On Mon, Mar 13, 2023 at 9:15 AM Christian König wrote:
>> Am 13.03.23 um 15:45 schrieb Rob Clark:
>>> On Mon, Mar 13, 2023 at 12:19 AM Christian König wrote:
>>>> Am 11.03.23 um 18:35 schrieb Rob Clark:
>>>>> From: Rob Clark
>>>>>
>>>>> Avoid allocating memory in job_run() by embedding the fence in the
>>>>> submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
>>>>> we can just use the fence's refcnt to track the submit.  And since we
>>>>> can get the fence ctx from the submit we can just drop the msm_fence
>>>>> struct altogether.  This uses the new dma_fence_init_noref() to deal
>>>>> with the fact that the fence's refcnt is initialized when the submit is
>>>>> created, long before job_run().
>>>> Well this is a very very bad idea, we made the same mistake with amdgpu
>>>> as well.
>>>>
>>>> It's true that you should not have any memory allocation in your run_job
>>>> callback, but you could also just allocate the hw fence during job
>>>> creation and initialize it later on.
>>>>
>>>> I've suggested embedding the fence into the job for amdgpu because some
>>>> people insisted on re-submitting jobs during timeout and GPU reset. This
>>>> turned into a nightmare with tons of bug fixes on top of bug fixes on
>>>> top of bug fixes, because it messes up the job and fence lifetime as
>>>> defined by the DRM scheduler and DMA-buf framework.
>>>>
>>>> Luben is currently working on cleaning all this up.
>>> This actually shouldn't be a problem with msm, as the fence doesn't
>>> change if there is a gpu reset.  We simply signal the fence for the
>>> offending job, reset the GPU, and re-play the remaining in-flight jobs
>>> (ie. things that already had their job_run() called) with the original
>>> fences.  (We don't use gpu sched's reset/timeout handling.. when I
>>> migrated to gpu sched I kept our existing hangcheck/recovery mechanism.)
>> That sounds much saner than what we did.
>>
>> So you basically need the dma_fence reference counting separate from
>> initializing the other dma_fence fields?
> yeah, that was the idea
>
>> What would happen if a dma_fence which is not completely initialized
>> gets freed?  E.g. because of an error?
> hmm, yes, this would be a problem since ops->release is not set yet..
> and I'm relying on that to free the submit
>
>> Would it be too much to just keep the handling as it is today and only
>> allocate the dma_fence without initializing it?  If necessary we could
>> easily add a dma_fence_is_initialized() function which checks the
>> fence_ops for NULL.
> Yeah, that would also be possible
>
> I guess we could split creation of the fence (initializing ops,
> refcount) and "arming" it later when the seqno is known?  But maybe
> that is going to too many lengths to avoid a separate allocation..

I would really like to avoid that.  It gives people the opportunity once
more to do multiple "arm" operations on the same fence, and that was a
really bad idea for us.

So yeah, if that's just to avoid the extra allocation it's probably not
worth it.

Christian.

> BR,
> -R
>
>> Thanks,
>> Christian.
>>
>>> BR,
>>> -R
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Signed-off-by: Rob Clark
>>>>> ---
>>>>> Note that this applies on top of
>>>>> https://patchwork.freedesktop.org/series/93035/
>>>>> out of convenience for myself, but I can re-work it to go before
>>>>> depending on the order that things land.
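The lifetime being debated here -- the hw fence embedded in the submit, the fence's refcount doubling as the submit's refcount, and ops->release recovering the submit via container_of() to free it -- can be mocked in plain user-space C. All names below (mock_fence, mock_submit, etc.) are illustrative stand-ins, not the real kernel dma_fence API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space mock of the embedded-fence scheme: the fence lives inside
 * the submit, the fence refcount is the only refcount, and the release
 * callback recovers the containing submit and frees it. */

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_fence {
	int refcount;
	void (*release)(struct mock_fence *f); /* stands in for ops->release */
};

struct mock_submit {
	unsigned int seqno;
	struct mock_fence hw_fence; /* embedded, 1:1 with the submit */
};

static int submits_freed; /* lets a caller observe the release path */

static void mock_submit_release(struct mock_fence *f)
{
	struct mock_submit *submit =
		container_of(f, struct mock_submit, hw_fence);
	submits_freed++;
	free(submit);
}

static struct mock_submit *mock_submit_create(unsigned int seqno)
{
	struct mock_submit *submit = calloc(1, sizeof(*submit));

	if (!submit)
		return NULL;
	submit->seqno = seqno;
	submit->hw_fence.refcount = 1; /* ref taken at submit creation */
	submit->hw_fence.release = mock_submit_release;
	return submit;
}

static void mock_fence_get(struct mock_fence *f)
{
	f->refcount++;
}

static void mock_fence_put(struct mock_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->release(f); /* last ref gone: the submit goes with it */
}
```

The point of the mock is only to show why the error path matters: if release is never filled in (the fence is never fully initialized), the final put has nothing to call, which is exactly the hole Rob concedes above.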
Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
On Mon, Mar 13, 2023 at 9:15 AM Christian König wrote:
>
> Am 13.03.23 um 15:45 schrieb Rob Clark:
> > On Mon, Mar 13, 2023 at 12:19 AM Christian König
> > wrote:
> >> Am 11.03.23 um 18:35 schrieb Rob Clark:
> >>> From: Rob Clark
> >>>
> >>> Avoid allocating memory in job_run() by embedding the fence in the
> >>> submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
> >>> we can just use the fence's refcnt to track the submit.  And since we
> >>> can get the fence ctx from the submit we can just drop the msm_fence
> >>> struct altogether.  This uses the new dma_fence_init_noref() to deal
> >>> with the fact that the fence's refcnt is initialized when the submit is
> >>> created, long before job_run().
> >> Well this is a very very bad idea, we made the same mistake with amdgpu
> >> as well.
> >>
> >> It's true that you should not have any memory allocation in your run_job
> >> callback, but you could also just allocate the hw fence during job
> >> creation and initialize it later on.
> >>
> >> I've suggested embedding the fence into the job for amdgpu because some
> >> people insisted on re-submitting jobs during timeout and GPU reset. This
> >> turned into a nightmare with tons of bug fixes on top of bug fixes on
> >> top of bug fixes, because it messes up the job and fence lifetime as
> >> defined by the DRM scheduler and DMA-buf framework.
> >>
> >> Luben is currently working on cleaning all this up.
> > This actually shouldn't be a problem with msm, as the fence doesn't
> > change if there is a gpu reset.  We simply signal the fence for the
> > offending job, reset the GPU, and re-play the remaining in-flight jobs
> > (ie. things that already had their job_run() called) with the original
> > fences.  (We don't use gpu sched's reset/timeout handling.. when I
> > migrated to gpu sched I kept our existing hangcheck/recovery
> > mechanism.)
>
> That sounds much saner than what we did.
>
> So you basically need the dma_fence reference counting separate from
> initializing the other dma_fence fields?

yeah, that was the idea

> What would happen if a dma_fence which is not completely initialized
> gets freed?  E.g. because of an error?

hmm, yes, this would be a problem since ops->release is not set yet..
and I'm relying on that to free the submit

> Would it be too much to just keep the handling as it is today and only
> allocate the dma_fence without initializing it?  If necessary we could
> easily add a dma_fence_is_initialized() function which checks the
> fence_ops for NULL.

Yeah, that would also be possible

I guess we could split creation of the fence (initializing ops,
refcount) and "arming" it later when the seqno is known?  But maybe
that is going to too many lengths to avoid a separate allocation..

BR,
-R

> Thanks,
> Christian.
>
> > BR,
> > -R
> >
> >> Regards,
> >> Christian.
> >>
> >>> Signed-off-by: Rob Clark
> >>> ---
> >>> Note that this applies on top of
> >>> https://patchwork.freedesktop.org/series/93035/
> >>> out of convenience for myself, but I can re-work it to go before
> >>> depending on the order that things land.
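The "split creation and arming" idea floated above -- refcount usable from submit creation, ops and seqno filled in only at job_run() -- could look roughly like this user-space sketch. All names are hypothetical; note that Christian's objection is precisely that nothing structurally prevents arming twice, which this sketch can only catch with a runtime assert:

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a two-phase fence setup: step 1 makes the refcount valid at
 * submit creation, step 2 "arms" the fence with ops and seqno once the
 * job actually runs.  Illustrative only, not the dma_fence API. */

struct mock_ops {
	const char *(*timeline_name)(void);
};

struct mock_fence {
	int refcount;
	const struct mock_ops *ops; /* NULL until armed */
	unsigned int seqno;
};

/* Step 1: at submit creation, only the refcount becomes valid. */
static void mock_fence_create(struct mock_fence *f)
{
	f->refcount = 1;
	f->ops = NULL;
	f->seqno = 0;
}

/* Mirrors the dma_fence_is_initialized() idea from the thread:
 * ops == NULL means the fence was never armed. */
static int mock_fence_is_initialized(const struct mock_fence *f)
{
	return f->ops != NULL;
}

/* Step 2: at job_run(), ops and seqno are filled in.  Arming twice is
 * the hazard raised in the thread; here it merely trips an assert. */
static void mock_fence_arm(struct mock_fence *f,
			   const struct mock_ops *ops, unsigned int seqno)
{
	assert(!mock_fence_is_initialized(f));
	f->ops = ops;
	f->seqno = seqno;
}
```

The assert documents the invariant but does not enforce it at the API level, which is why the thread ultimately leans toward keeping a single full initialization instead.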
Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
Am 13.03.23 um 15:45 schrieb Rob Clark:
> On Mon, Mar 13, 2023 at 12:19 AM Christian König wrote:
>> Am 11.03.23 um 18:35 schrieb Rob Clark:
>>> From: Rob Clark
>>>
>>> Avoid allocating memory in job_run() by embedding the fence in the
>>> submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
>>> we can just use the fence's refcnt to track the submit.  And since we
>>> can get the fence ctx from the submit we can just drop the msm_fence
>>> struct altogether.  This uses the new dma_fence_init_noref() to deal
>>> with the fact that the fence's refcnt is initialized when the submit is
>>> created, long before job_run().
>> Well this is a very very bad idea, we made the same mistake with amdgpu
>> as well.
>>
>> It's true that you should not have any memory allocation in your run_job
>> callback, but you could also just allocate the hw fence during job
>> creation and initialize it later on.
>>
>> I've suggested embedding the fence into the job for amdgpu because some
>> people insisted on re-submitting jobs during timeout and GPU reset. This
>> turned into a nightmare with tons of bug fixes on top of bug fixes on
>> top of bug fixes, because it messes up the job and fence lifetime as
>> defined by the DRM scheduler and DMA-buf framework.
>>
>> Luben is currently working on cleaning all this up.
> This actually shouldn't be a problem with msm, as the fence doesn't
> change if there is a gpu reset.  We simply signal the fence for the
> offending job, reset the GPU, and re-play the remaining in-flight jobs
> (ie. things that already had their job_run() called) with the original
> fences.  (We don't use gpu sched's reset/timeout handling.. when I
> migrated to gpu sched I kept our existing hangcheck/recovery mechanism.)

That sounds much saner than what we did.

So you basically need the dma_fence reference counting separate from
initializing the other dma_fence fields?

What would happen if a dma_fence which is not completely initialized
gets freed?  E.g. because of an error?

Would it be too much to just keep the handling as it is today and only
allocate the dma_fence without initializing it?  If necessary we could
easily add a dma_fence_is_initialized() function which checks the
fence_ops for NULL.

Thanks,
Christian.

> BR,
> -R
>
>> Regards,
>> Christian.
>>
>>> Signed-off-by: Rob Clark
>>> ---
>>> Note that this applies on top of
>>> https://patchwork.freedesktop.org/series/93035/
>>> out of convenience for myself, but I can re-work it to go before
>>> depending on the order that things land.
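The error path Christian asks about -- freeing a fence that was never fully initialized, so ops->release is still NULL -- could in principle be handled with a fallback on the final put, along the lines of his NULL-ops check. A hypothetical user-space sketch (none of these names are real kernel functions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrates the error path discussed above: if a fence is torn down
 * before being armed, its release callback is still NULL, so the final
 * put needs a fallback.  Purely a sketch, not the kernel API. */

struct mock_fence {
	int refcount;
	void (*release)(struct mock_fence *f); /* NULL until armed */
};

static int default_frees; /* counts fallback frees, for observation */

/* Fallback used when the fence was never armed (release unset). */
static void mock_fence_default_free(struct mock_fence *f)
{
	default_frees++;
	free(f);
}

static void mock_fence_put(struct mock_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount)
		return;
	if (f->release)
		f->release(f);              /* armed: driver frees the container */
	else
		mock_fence_default_free(f); /* never armed: error path */
}
```

Whether such a fallback is acceptable depends on who owns the memory of an un-armed fence; in the msm patch the fence is embedded in the submit, so a plain free of the fence alone would be wrong there, which is part of why the thread treats this as a real problem rather than a detail.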
Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
On Mon, Mar 13, 2023 at 12:19 AM Christian König wrote:
>
> Am 11.03.23 um 18:35 schrieb Rob Clark:
> > From: Rob Clark
> >
> > Avoid allocating memory in job_run() by embedding the fence in the
> > submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
> > we can just use the fence's refcnt to track the submit.  And since we
> > can get the fence ctx from the submit we can just drop the msm_fence
> > struct altogether.  This uses the new dma_fence_init_noref() to deal
> > with the fact that the fence's refcnt is initialized when the submit is
> > created, long before job_run().
>
> Well this is a very very bad idea, we made the same mistake with amdgpu
> as well.
>
> It's true that you should not have any memory allocation in your run_job
> callback, but you could also just allocate the hw fence during job
> creation and initialize it later on.
>
> I've suggested embedding the fence into the job for amdgpu because some
> people insisted on re-submitting jobs during timeout and GPU reset. This
> turned into a nightmare with tons of bug fixes on top of bug fixes on
> top of bug fixes, because it messes up the job and fence lifetime as
> defined by the DRM scheduler and DMA-buf framework.
>
> Luben is currently working on cleaning all this up.

This actually shouldn't be a problem with msm, as the fence doesn't
change if there is a gpu reset.  We simply signal the fence for the
offending job, reset the GPU, and re-play the remaining in-flight jobs
(ie. things that already had their job_run() called) with the original
fences.  (We don't use gpu sched's reset/timeout handling.. when I
migrated to gpu sched I kept our existing hangcheck/recovery mechanism.)

BR,
-R

> Regards,
> Christian.
>
> > Signed-off-by: Rob Clark
> > ---
> > Note that this applies on top of
> > https://patchwork.freedesktop.org/series/93035/
> > out of convenience for myself, but I can re-work it to go before
> > depending on the order that things land.
Re: [PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
Am 11.03.23 um 18:35 schrieb Rob Clark:
> From: Rob Clark
>
> Avoid allocating memory in job_run() by embedding the fence in the
> submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
> we can just use the fence's refcnt to track the submit.  And since we
> can get the fence ctx from the submit we can just drop the msm_fence
> struct altogether.  This uses the new dma_fence_init_noref() to deal
> with the fact that the fence's refcnt is initialized when the submit is
> created, long before job_run().

Well this is a very very bad idea, we made the same mistake with amdgpu
as well.

It's true that you should not have any memory allocation in your run_job
callback, but you could also just allocate the hw fence during job
creation and initialize it later on.

I've suggested embedding the fence into the job for amdgpu because some
people insisted on re-submitting jobs during timeout and GPU reset. This
turned into a nightmare with tons of bug fixes on top of bug fixes on
top of bug fixes, because it messes up the job and fence lifetime as
defined by the DRM scheduler and DMA-buf framework.

Luben is currently working on cleaning all this up.

Regards,
Christian.

> Signed-off-by: Rob Clark
> ---
> Note that this applies on top of
> https://patchwork.freedesktop.org/series/93035/
> out of convenience for myself, but I can re-work it to go before
> depending on the order that things land.
[PATCH 2/2] drm/msm: Embed the hw_fence in msm_gem_submit
From: Rob Clark

Avoid allocating memory in job_run() by embedding the fence in the
submit object.  Since msm gpu fences are always 1:1 with msm_gem_submit
we can just use the fence's refcnt to track the submit.  And since we
can get the fence ctx from the submit we can just drop the msm_fence
struct altogether.  This uses the new dma_fence_init_noref() to deal
with the fact that the fence's refcnt is initialized when the submit is
created, long before job_run().

Signed-off-by: Rob Clark
---
Note that this applies on top of
https://patchwork.freedesktop.org/series/93035/
out of convenience for myself, but I can re-work it to go before
depending on the order that things land.

 drivers/gpu/drm/msm/msm_fence.c      | 45 +++-
 drivers/gpu/drm/msm/msm_fence.h      |  2 +-
 drivers/gpu/drm/msm/msm_gem.h        | 10 +++
 drivers/gpu/drm/msm/msm_gem_submit.c |  8 ++---
 drivers/gpu/drm/msm/msm_gpu.c        |  4 +--
 drivers/gpu/drm/msm/msm_ringbuffer.c |  4 +--
 6 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 51b461f32103..51f9f1f0cb66 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -103,14 +103,9 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
 	spin_unlock_irqrestore(&fctx->spinlock, flags);
 }
 
-struct msm_fence {
-	struct dma_fence base;
-	struct msm_fence_context *fctx;
-};
-
-static inline struct msm_fence *to_msm_fence(struct dma_fence *fence)
+static inline struct msm_gem_submit *fence_to_submit(struct dma_fence *fence)
 {
-	return container_of(fence, struct msm_fence, base);
+	return container_of(fence, struct msm_gem_submit, hw_fence);
 }
 
 static const char *msm_fence_get_driver_name(struct dma_fence *fence)
@@ -120,20 +115,20 @@ static const char *msm_fence_get_driver_name(struct dma_fence *fence)
 
 static const char *msm_fence_get_timeline_name(struct dma_fence *fence)
 {
-	struct msm_fence *f = to_msm_fence(fence);
-	return f->fctx->name;
+	struct msm_gem_submit *submit = fence_to_submit(fence);
+	return submit->ring->fctx->name;
 }
 
 static bool msm_fence_signaled(struct dma_fence *fence)
 {
-	struct msm_fence *f = to_msm_fence(fence);
-	return msm_fence_completed(f->fctx, f->base.seqno);
+	struct msm_gem_submit *submit = fence_to_submit(fence);
+	return msm_fence_completed(submit->ring->fctx, fence->seqno);
 }
 
 static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
 {
-	struct msm_fence *f = to_msm_fence(fence);
-	struct msm_fence_context *fctx = f->fctx;
+	struct msm_gem_submit *submit = fence_to_submit(fence);
+	struct msm_fence_context *fctx = submit->ring->fctx;
 	unsigned long flags;
 	ktime_t now;
 
@@ -165,26 +160,22 @@ static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
 	spin_unlock_irqrestore(&fctx->spinlock, flags);
 }
 
+static void msm_fence_release(struct dma_fence *fence)
+{
+	__msm_gem_submit_destroy(fence_to_submit(fence));
+}
+
 static const struct dma_fence_ops msm_fence_ops = {
 	.get_driver_name = msm_fence_get_driver_name,
 	.get_timeline_name = msm_fence_get_timeline_name,
 	.signaled = msm_fence_signaled,
 	.set_deadline = msm_fence_set_deadline,
+	.release = msm_fence_release,
 };
 
-struct dma_fence *
-msm_fence_alloc(struct msm_fence_context *fctx)
+void
+msm_fence_init(struct msm_fence_context *fctx, struct dma_fence *f)
 {
-	struct msm_fence *f;
-
-	f = kzalloc(sizeof(*f), GFP_KERNEL);
-	if (!f)
-		return ERR_PTR(-ENOMEM);
-
-	f->fctx = fctx;
-
-	dma_fence_init(&f->base, &msm_fence_ops, &fctx->spinlock,
-		       fctx->context, ++fctx->last_fence);
-
-	return &f->base;
+	dma_fence_init_noref(f, &msm_fence_ops, &fctx->spinlock,
+			     fctx->context, ++fctx->last_fence);
 }
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index cdaebfb94f5c..8fca37e9773b 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -81,7 +81,7 @@ void msm_fence_context_free(struct msm_fence_context *fctx);
 bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence);
 void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
 
-struct dma_fence * msm_fence_alloc(struct msm_fence_context *fctx);
+void msm_fence_init(struct msm_fence_context *fctx, struct dma_fence *f);
 
 static inline bool fence_before(uint32_t a, uint32_t b)
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index c4844cf3a585..e06afed99d5b 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -259,10 +259,10 @@ struct msm_gem_submit {
 	struct ww_acquire_ctx ticket;
 	uint32_t seqno;		/* Sequ