On 20/05/2020 12:05, Dinghao Liu wrote:
pm_runtime_get_sync() increments the runtime PM usage counter even
if the call returns an error code. Thus a pairing decrement is needed
on the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao....@zju.edu.cn>

Actually I think we have the opposite problem. To be honest, we don't handle this situation very well. By the time panfrost_job_hw_submit() is called, the job has already been added to the pfdev->jobs array, so it's considered submitted even if it never actually lands on the hardware. So if this function bails out early, we will (eventually) hit a timeout and trigger a GPU reset.

panfrost_job_timedout() iterates through the pfdev->jobs array and calls pm_runtime_put_noidle() for each job it finds. So there's no imbalance here that I can see.
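
To illustrate the accounting, here is a toy userspace model, not the driver code. Every model_* name and the plain int counter are made up for illustration; the only behaviour assumed is what is described above: the get increments the usage count even on failure, the job is already in the array before the hardware submit, and the timeout handler does one put per job it finds.

```c
#include <assert.h>

#define NUM_JOB_SLOTS 3

/* Hypothetical stand-ins for illustration only, not kernel API. */
int usage_count;            /* models the runtime PM usage counter */
int jobs[NUM_JOB_SLOTS];    /* models pfdev->jobs[]: nonzero = job present */

/* Models pm_runtime_get_sync(): the count goes up even when it fails. */
int model_get_sync(int fail)
{
	usage_count++;
	return fail ? -1 : 0;
}

/* Models the submit path: the slot is already filled when we get here. */
void model_submit(int slot, int fail)
{
	jobs[slot] = 1;
	if (model_get_sync(fail) < 0)
		return;		/* early bail-out, no put on this path */
	/* ... the job would be written to the hardware here ... */
}

/* Models the timeout handler: one put per job still in the array. */
void model_timedout(void)
{
	int i;

	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		if (jobs[i]) {
			usage_count--;	/* models pm_runtime_put_noidle() */
			jobs[i] = 0;
		}
	}
}
```

In this model, submitting one job that succeeds and one whose get fails leaves usage_count at 2, and the timeout pass brings it back to 0. Adding a put on the failure path as well would decrement twice for the failed job.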

Have you actually observed the situation where pm_runtime_get_sync() returns a failure?

HOWEVER, it appears that by bailing out early the call to panfrost_devfreq_record_busy() is never made, which as far as I can see means that there may be an extra call to panfrost_devfreq_record_idle() when the jobs have timed out, which could underflow the counter.

But equally, looking at panfrost_job_timedout(), we only call panfrost_devfreq_record_idle() *once* even though multiple jobs might be processed.
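
A toy model of the busy/idle counting makes both problems visible. This is only a sketch: the model_* names, the busy_first and idle_per_job switches, and the signed int counter are assumptions for illustration, and the real devfreq accounting may well differ (it may clamp at zero rather than go negative, for instance).

```c
#include <assert.h>

#define NUM_JOB_SLOTS 3

/* Hypothetical stand-ins for illustration only, not the driver code. */
int busy_count;             /* models the devfreq busy counter */
int jobs[NUM_JOB_SLOTS];    /* models pfdev->jobs[]: nonzero = job present */

void model_record_busy(void) { busy_count++; }
void model_record_idle(void) { busy_count--; }

/*
 * busy_first == 0: busy is recorded after the PM get, so an early
 *                  bail-out skips it.
 * busy_first == 1: busy is recorded before the PM get.
 */
void model_hw_submit(int slot, int pm_fails, int busy_first)
{
	jobs[slot] = 1;
	if (busy_first)
		model_record_busy();
	if (pm_fails)
		return;		/* early bail-out */
	if (!busy_first)
		model_record_busy();
}

/*
 * idle_per_job == 0: a single idle call after the loop.
 * idle_per_job == 1: one idle call per job found in the array.
 */
void model_job_timedout(int idle_per_job)
{
	int i;

	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		if (jobs[i]) {
			if (idle_per_job)
				model_record_idle();
			jobs[i] = 0;
		}
	}
	if (!idle_per_job)
		model_record_idle();
}
```

With both switches at 0, a single job whose PM get fails ends with busy_count at -1, and two successful jobs that time out together end with busy_count at 1. With both switches at 1, either scenario ends at 0.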

There's a completely untested patch below which in theory should fix that...

Steve

----8<---
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 7914b1570841..f9519afca29d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -145,6 +145,8 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
        u64 jc_head = job->jc;
        int ret;

+       panfrost_devfreq_record_busy(pfdev);
+
        ret = pm_runtime_get_sync(pfdev->dev);
        if (ret < 0)
                return;
@@ -155,7 +157,6 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
        }

        cfg = panfrost_mmu_as_get(pfdev, &job->file_priv->mmu);
-       panfrost_devfreq_record_busy(pfdev);

        job_write(pfdev, JS_HEAD_NEXT_LO(js), jc_head & 0xFFFFFFFF);
        job_write(pfdev, JS_HEAD_NEXT_HI(js), jc_head >> 32);
@@ -410,12 +411,12 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job)
        for (i = 0; i < NUM_JOB_SLOTS; i++) {
                if (pfdev->jobs[i]) {
                        pm_runtime_put_noidle(pfdev->dev);
+                       panfrost_devfreq_record_idle(pfdev);
                        pfdev->jobs[i] = NULL;
                }
        }
        spin_unlock_irqrestore(&pfdev->js->job_lock, flags);

-       panfrost_devfreq_record_idle(pfdev);
        panfrost_device_reset(pfdev);

        for (i = 0; i < NUM_JOB_SLOTS; i++)