On 03/02/2017 11:07 AM, Paolo Valente wrote:
> 
>> Il giorno 02 mar 2017, alle ore 17:13, Jens Axboe <ax...@fb.com> ha scritto:
>>
>> On 03/02/2017 09:07 AM, Paolo Valente wrote:
>>>
>>>> Il giorno 02 mar 2017, alle ore 16:13, Jens Axboe <ax...@fb.com> ha 
>>>> scritto:
>>>>
>>>> On 03/02/2017 08:00 AM, Jens Axboe wrote:
>>>>> On 03/02/2017 03:28 AM, Paolo Valente wrote:
>>>>>>
>>>>>>> Il giorno 15 feb 2017, alle ore 19:04, Jens Axboe <ax...@fb.com> ha 
>>>>>>> scritto:
>>>>>>>
>>>>>>> On 02/15/2017 10:58 AM, Jens Axboe wrote:
>>>>>>>> On 02/15/2017 10:24 AM, Paolo Valente wrote:
>>>>>>>>>
>>>>>>>>>> Il giorno 10 feb 2017, alle ore 19:32, Omar Sandoval 
>>>>>>>>>> <osan...@osandov.com> ha scritto:
>>>>>>>>>>
>>>>>>>>>> From: Omar Sandoval <osan...@fb.com>
>>>>>>>>>>
>>>>>>>>>> None of the other blk-mq elevator hooks are called with this lock 
>>>>>>>>>> held.
>>>>>>>>>> Additionally, it can lead to circular locking dependencies between
>>>>>>>>>> queue_lock and the private scheduler lock.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Omar,
>>>>>>>>> I'm sorry but it seems that a new potential deadlock has showed up.
>>>>>>>>> See lockdep splat below.
>>>>>>>>>
>>>>>>>>> I've tried to think about different solutions than turning back to
>>>>>>>>> deferring the body of exit_icq, but at no avail.
>>>>>>>>
>>>>>>>> Looks like a interaction between bfqd->lock and q->queue_lock. Since 
>>>>>>>> the
>>>>>>>> core has no notion of you bfqd->lock, the naturally dependency here
>>>>>>>> would be to nest bfqd->lock inside q->queue_lock. Is that possible for
>>>>>>>> you?
>>>>>>>>
>>>>>>>> Looking at the code a bit, maybe it'd just be simpler to get rid of
>>>>>>>> holding the queue lock for that spot. For the mq scheduler, we really
>>>>>>>> don't want places where we invoke with that held already. Does the 
>>>>>>>> below
>>>>>>>> work for you?
>>>>>>>
>>>>>>> Would need to remove one more lockdep assert. And only test this for
>>>>>>> the mq parts, we'd need to spread a bit of love on the classic
>>>>>>> scheduling icq exit path for this to work on that side.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Jens,
>>>>>> here is the reply I anticipated in my previous email: after rebasing
>>>>>> against master, I'm getting again the deadlock that this patch of
>>>>>> yours solved (together with some changes in bfq-mq).  I thought you 
>>>>>> added a
>>>>>> sort of equivalent commit (now) to the mainline branch.  Am I wrong?
>>>>>
>>>>> The patch I posted was never pulled to completion, it wasn't clear
>>>>> to me if it fixed your issue or not. Maybe I missed a reply on
>>>>> that?
>>>>>
>>>>> Let me take another stab at it today, I'll send you a version to test
>>>>> on top of my for-linus branch.
>>>>
>>>> I took at look at my last posting, and I think it was only missing a
>>>> lock grab in cfq, since we don't hold that for icq exit anymore.  Paolo,
>>>> it'd be great if you could verify that this works. Not for bfq-mq (we
>>>> already know it does), but for CFQ.
>>>
>>> No luck, sorry :(
>>>
>>> It looks like to have just not to take q->queue_lock in cfq_exit_icq.
>>
>> I was worried about that. How about the below? We need to grab the lock
>> at some point for legacy scheduling, but the ordering should be correct.
>>
>>
> 
> It seems to be: no more crashes or lockdep complains after several tests, 
> boots and shutdowns :)

Great! Can I add your Tested-by?

-- 
Jens Axboe

Reply via email to