Re: [Qemu-devel] [big lock] Discussion about the convention of device's DMA each other after breaking down biglock

liu ping fan Wed, 19 Sep 2012 02:00:58 -0700

On Wed, Sep 19, 2012 at 4:06 PM, Avi Kivity <a...@redhat.com> wrote:
> On 09/19/2012 06:02 AM, liu ping fan wrote:
>> Currently, cpu_physical_memory_rw() can be used directly or indirectly
>> by mmio-dispatcher to access other devices' memory region. This can
>> cause some problem when adopting device's private lock.
>>
>> Back ground refer to:
>> http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01481.html
>> For lazy, just refer to:
>> http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01878.html
>>
>>
>> --1st. the recursive lock of biglock.
>> If we leave c_p_m_rw() as it is, ie, no lock inside. Then we can have
>> the following (section of the whole call chain, and with
>> private_lockA):
>>       lockA-mmio-dispatcher   --> hold biglock -- >c_p_m_rw() --- >
>> Before c_p_m_rw(), we drop private_lockA to anti the possibly of
>> deadlock.  But we can not anti the nested of this chain or calling to
>> another lockB-mmio-dispatcher. So we can not avoid the possibility of
>> nested lock of biglock.  And another important factor is that we break
>> the lock sequence: private_lock-->biglock.
>> All of these require us to push biglock's holding into c_p_m_rw(), the
>> wrapper can not give help.
>
> I agree that this is unavoidable.
>
>>
>> --2nd. c_p_m_rw(), sync or async?
>>
>> IF we convert all of the device to be protected by refcount, then we can have
>> //no big lock
>>  c_p_m_rw()
>> {
>>    devB->ref++;
>>    {
>> --------------------------------------->pushed onto another thread.
>>    lock_privatelock
>>    mr->ops->write();
>>    unlock_privatelock
>>    }
>>    wait_for_completion();
>>    devB->ref--;
>> }
>> This model can help c_p_m_rw() present as a SYNC API.  But currently,
>> we mix biglock and private lock together, and wait_for_completion()
>> maybe block the release of big lock, which finally causes deadlock. So
>> we can not simply rely on this model.
>> Instead, we need to classify the calling scene into three cases:
>>   case1. lockA--dispatcher ---> lockB-dispatcher   //can use
>> async+completion model
>>   case2. lockA--dispatcher ---> biglock-dispatcher // sync, but can
>> cause the nested lock of biglock
>>   case3. biglock-dispacher ---> lockB-dispatcher  // async to avoid
>> the lock sequence problem, (as to completion, it need to be placed
>> outside the top level biglock, and it is hard to do so. Suggest to
>> change to case 1. Or at present, just leave it async)
>>
>> This new model will require the biglock can be nested.
>
> I think changing to an async model is too complicated.  It's difficult
> enough already.  Isn't dropping private locks + recursive big locks
> sufficient?
>
I think that "dropping private locks + recursive big locks" just cover
case 2. And most of the important, it dont describe case3 which break
the rule of lock sequence "private-lock --> biglock". Scene:
devA_lock-->(devX_with-biglock--->devB_lock).
I just want to classify and post these cases to discuss. Maybe we can
achieve without async.


Thanks and regards,
pingfan

> --
> error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [big lock] Discussion about the convention of device's DMA each other after breaking down biglock

Reply via email to