[Qemu-devel] [big lock] Discussion about the convention of device's DMA each other after breaking down biglock

liu ping fan Tue, 18 Sep 2012 20:02:45 -0700

Currently, cpu_physical_memory_rw() can be used directly or indirectly
by mmio-dispatcher to access other devices' memory region. This can
cause some problem when adopting device's private lock.


Back ground refer to:
http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01481.html
For lazy, just refer to:
http://lists.gnu.org/archive/html/qemu-devel/2012-09/msg01878.html


--1st. the recursive lock of biglock.
If we leave c_p_m_rw() as it is, ie, no lock inside. Then we can have
the following (section of the whole call chain, and with
private_lockA):
      lockA-mmio-dispatcher   --> hold biglock -- >c_p_m_rw() --- >
Before c_p_m_rw(), we drop private_lockA to anti the possibly of
deadlock.  But we can not anti the nested of this chain or calling to
another lockB-mmio-dispatcher. So we can not avoid the possibility of
nested lock of biglock.  And another important factor is that we break
the lock sequence: private_lock-->biglock.
All of these require us to push biglock's holding into c_p_m_rw(), the
wrapper can not give help.

--2nd. c_p_m_rw(), sync or async?

IF we convert all of the device to be protected by refcount, then we can have
//no big lock
 c_p_m_rw()
{
   devB->ref++;
   {
--------------------------------------->pushed onto another thread.
   lock_privatelock
   mr->ops->write();
   unlock_privatelock
   }
   wait_for_completion();
   devB->ref--;
}
This model can help c_p_m_rw() present as a SYNC API.  But currently,
we mix biglock and private lock together, and wait_for_completion()
maybe block the release of big lock, which finally causes deadlock. So
we can not simply rely on this model.
Instead, we need to classify the calling scene into three cases:
  case1. lockA--dispatcher ---> lockB-dispatcher   //can use
async+completion model
  case2. lockA--dispatcher ---> biglock-dispatcher // sync, but can
cause the nested lock of biglock
  case3. biglock-dispacher ---> lockB-dispatcher  // async to avoid
the lock sequence problem, (as to completion, it need to be placed
outside the top level biglock, and it is hard to do so. Suggest to
change to case 1. Or at present, just leave it async)

This new model will require the biglock can be nested.

Any comments?

Thanks and regards,
pingfan

[Qemu-devel] [big lock] Discussion about the convention of device's DMA each other after breaking down biglock

Reply via email to