On Wed, Apr 26, 2017 at 4:15 AM, Jacek Anaszewski
<jacek.anaszew...@gmail.com> wrote:
> Hi Alexandre,
>
> Thanks for the patch.
>
> On 04/25/2017 08:19 AM, Alexandre Courbot wrote:
>> v4l2_m2m_job_finish(), which is called from the interrupt handler with
>> slock acquired, can call the device_run() hook immediately if another
>> context was in the queue. This hook also acquires slock, resulting in
>> a deadlock for this scenario.
>>
>> Fix this by releasing slock right before calling v4l2_m2m_job_finish().
>> This is safe to do as the state of the hardware cannot change before
>> v4l2_m2m_job_finish() is called anyway.
>>
>> Signed-off-by: Alexandre Courbot <acour...@chromium.org>
>> ---
>>  drivers/media/platform/s5p-jpeg/jpeg-core.c | 12 +++++++++---
>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/media/platform/s5p-jpeg/jpeg-core.c 
>> b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>> index 52dc7941db65..223b4379929e 100644
>> --- a/drivers/media/platform/s5p-jpeg/jpeg-core.c
>> +++ b/drivers/media/platform/s5p-jpeg/jpeg-core.c
>> @@ -2642,13 +2642,13 @@ static irqreturn_t s5p_jpeg_irq(int irq, void 
>> *dev_id)
>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>       v4l2_m2m_buf_done(dst_buf, state);
>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>
>>       curr_ctx->subsampling = s5p_jpeg_get_subsampling_mode(jpeg->regs);
>>       spin_unlock(&jpeg->slock);
>>
>>       s5p_jpeg_clear_int(jpeg->regs);
>>
>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>       return IRQ_HANDLED;
>>  }
>>
>> @@ -2707,11 +2707,12 @@ static irqreturn_t exynos4_jpeg_irq(int irq, void 
>> *priv)
>>               v4l2_m2m_buf_done(dst_vb, VB2_BUF_STATE_ERROR);
>>       }
>>
>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>       if (jpeg->variant->version == SJPEG_EXYNOS4)
>>               curr_ctx->subsampling = exynos4_jpeg_get_frame_fmt(jpeg->regs);
>>
>>       spin_unlock(&jpeg->slock);
>> +
>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>       return IRQ_HANDLED;
>>  }
>>
>> @@ -2770,10 +2771,15 @@ static irqreturn_t exynos3250_jpeg_irq(int irq, void 
>> *dev_id)
>>       if (curr_ctx->mode == S5P_JPEG_ENCODE)
>>               vb2_set_plane_payload(&dst_buf->vb2_buf, 0, payload_size);
>>       v4l2_m2m_buf_done(dst_buf, state);
>> -     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>>
>>       curr_ctx->subsampling =
>>                       exynos3250_jpeg_get_subsampling_mode(jpeg->regs);
>> +
>> +     spin_unlock(&jpeg->slock);
>> +
>> +     v4l2_m2m_job_finish(jpeg->m2m_dev, curr_ctx->fh.m2m_ctx);
>> +     return IRQ_HANDLED;
>> +
>>  exit_unlock:
>>       spin_unlock(&jpeg->slock);
>>       return IRQ_HANDLED;
>>
>
> Acked-by: Jacek Anaszewski <jacek.anaszew...@gmail.com>
>
> Just out of curiosity - could you share how you discovered the problem -
> by some static checkers or trying to use the driver?

We discovered this issue after adding a new unit test for the jpeg
codec in Chromium OS:

https://bugs.chromium.org/p/chromium/issues/detail?id=705971

>From what I understand the test spawns different processes that access
the codec device concurrently, creating the situation leading to the
bug.

On a slightly related note, I was thinking whether it would make sense
to move the call to  v4l2_m2m_job_finish() (and maybe other parts of
the current interrupt handler) into a worker or a threaded interrupt
handler so as to reduce the time we spend with interrupts disabled.
Can I have your input on this idea?

Reply via email to